Dashboard

Global Summary
Simple
Stats
Breakdown of the total cost and elapsed time for generating the instances.
  • Elapsed Time = Console Time (i.e., Processing Time + API Calls)
  • Cost = (input tokens * input price) + (output tokens * output price)
Total Cost $14.26
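As a minimal sketch of the Cost formula above, the snippet below multiplies token counts by per-token prices. The function name, the token counts, and the prices are hypothetical values chosen for illustration; they are not taken from the dashboard.

```python
def generation_cost(input_tokens, output_tokens, input_price, output_price):
    """Cost = (input tokens * input price) + (output tokens * output price)."""
    return input_tokens * input_price + output_tokens * output_price


# Hypothetical token counts and prices, for illustration only.
cost = generation_cost(
    input_tokens=1_000_000,
    output_tokens=200_000,
    input_price=2.0 / 1_000_000,    # assumed: $2 per million input tokens
    output_price=10.0 / 1_000_000,  # assumed: $10 per million output tokens
)
print(f"${cost:.2f}")
```

Prices are expressed per token here, so a provider price quoted per million tokens has to be divided by 1,000,000 first.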
Validation
Measures the correctness of the instantiation using the USE check function.
  • Syntax = 1 - (Total Number of syntax errors [use check] / Total Number of lines [instance])
  • Multiplicities = 1 - (Total Number of multiplicity errors [use check] / Total Number of relationships ([instance] !insert))
  • Invariants = 1 - (Total Number of invariant errors [use check] / Total Number of invariants ([model] constraints))
Syntax 100.0%
Multiplicities 100.0%
Invariants 99.8%
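The three validation scores share the same shape, 1 - (errors / total). A minimal sketch, assuming the error counts have already been extracted from the USE check output; the helper name and all counts below are hypothetical.

```python
def validation_score(errors, total):
    """Return 1 - errors / total, i.e. the share of items that passed the check."""
    if total == 0:
        return 1.0  # assumption: with nothing to check, report the score as fully valid
    return 1.0 - errors / total


# Hypothetical counts, for illustration only.
syntax = validation_score(errors=0, total=1200)         # syntax errors / instance lines
multiplicities = validation_score(errors=3, total=500)  # multiplicity errors / !insert count
invariants = validation_score(errors=1, total=40)       # invariant errors / model invariants

print(f"Syntax {syntax:.1%}")
print(f"Multiplicities {multiplicities:.1%}")
print(f"Invariants {invariants:.1%}")
```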
Diversity
Measures the variability of the generated instances.
Attributes (NumericEquals, StringEquals, StringLv): Identifies how much the LLM repeats specific values versus generating unique data points across instances (100%: Diverse, 0%: Repetitive). All generated attributes are grouped into two bags (numeric and string), and every pair of elements within a bag is compared to obtain the scores.
Structure (GED): Measures the Graph Edit Distance (GED) similarity between instances.
Distribution (Shannon): Measures the entropy and evenness (balanced distribution) of the generated enum values.
  • NumericEquals = Total number of numeric attribute pairs with different values / Total number of possible pairs (n * (n - 1) / 2)
  • StringEquals = Total number of string attribute pairs that are NOT exactly identical / Total number of possible pairs (n * (n - 1) / 2)
  • StringLv = Sum of (Levenshtein Distance(a, b) / max(length(a), length(b))) for all string pairs / Total number of possible pairs (n * (n - 1) / 2)
  • GED Similarity = 1 - (GED / (0.5 * (GED_to_empty_A + GED_to_empty_B))). A value of 1 (shown in red) means identical graphs; a value <= 0.5 (shown in green) means different graphs. The edit operations considered are Nodes, Edges, Node_Labels and Edge_Labels [https://github.com/a-coman/ged]
  • Shannon (Active) = Entropy / log2(Number of unique groups actually generated). Measures how evenly the generated values are distributed, considering only the categories the LLM actually used.
  • Shannon (All) = Entropy / log2(Total number of valid groups defined in the model). Measures how evenly the generated values are distributed against the full spectrum of all possible valid options defined in the .use file.
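The pairwise attribute metrics above (NumericEquals, StringEquals, StringLv) can be sketched as follows. This is an illustrative reading of the formulas, not the dashboard's actual implementation: the function names are hypothetical, and returning 0.0 for bags with fewer than two elements is an assumption.

```python
from itertools import combinations


def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def pairwise_mean(bag, score):
    """Average score(a, b) over all n * (n - 1) / 2 unordered pairs of the bag."""
    pairs = list(combinations(bag, 2))
    if not pairs:
        return 0.0  # assumption: fewer than two elements gives no pairs to compare
    return sum(score(a, b) for a, b in pairs) / len(pairs)


def numeric_equals(bag):
    return pairwise_mean(bag, lambda a, b: a != b)


def string_equals(bag):
    return pairwise_mean(bag, lambda a, b: a != b)


def string_lv(bag):
    def normalized(a, b):
        longest = max(len(a), len(b))
        return levenshtein(a, b) / longest if longest else 0.0
    return pairwise_mean(bag, normalized)
```

For example, `numeric_equals([1, 1, 2])` compares three pairs, two of which differ, so the score is 2/3.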
Numeric 98.9%
String Equals 100.0%
String LV 91.2%
GED 0.892 ± 0.077
Shannon (Active) 0.865 ± 0.244
Shannon (All) 0.841 ± 0.255
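The GED similarity and the two Shannon variants reduce to short formulas once the raw quantities are known. The sketch below assumes the graph edit distances have already been computed (for example with the tool linked above) and only applies the normalization; the function names and sample values are hypothetical.

```python
import math
from collections import Counter


def ged_similarity(ged, ged_to_empty_a, ged_to_empty_b):
    """Similarity = 1 - (GED / (0.5 * (GED_to_empty_A + GED_to_empty_B)))."""
    return 1.0 - ged / (0.5 * (ged_to_empty_a + ged_to_empty_b))


def shannon_evenness(values, num_groups):
    """Entropy of the observed values divided by log2(num_groups)."""
    if num_groups <= 1:
        return 0.0  # assumption: evenness is reported as 0 when only one group exists
    counts = Counter(values)
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return entropy / math.log2(num_groups)


# Hypothetical enum values: the model defines 4 literals, the LLM used only 2.
values = ["Gold", "Gold", "Silver", "Silver"]
shannon_active = shannon_evenness(values, num_groups=len(set(values)))
shannon_all = shannon_evenness(values, num_groups=4)
```

With these sample values the two used literals are perfectly balanced, so Shannon (Active) is 1.0, while Shannon (All) drops to 0.5 because half of the defined literals never appear.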
Model Coverage
Measures the breadth of the instantiation. It answers: "How much of the structural blueprint (the model) was used?"
  • Classes = Total Unique Classes instantiated (!new) in the .soil / Total Number of classes (class) in the model .use
  • Attributes = Total Unique Attributes instantiated (!Class.Attribute or !set) in the .soil / Total Number of attributes (attributes) in the model .use
  • Relationships = Total Unique Relationships instantiated (!insert) in the .soil / Total Number of relationships (association, composition, aggregation) in the model .use
Classes 86.3%
Attributes 88.3%
Relationships 93.8%
Uncovered Items 67
Classes 14
Banquet, BusDriver, Cassette, DietaryRequirement, FreeRoomTypesDTO, Individual, Manager, MatchNote, Person, PlayerNotes, ReportedAllergy, Series, TrainingFailedToAttend, Vehicle
Attributes 47
Banquet.busService, Banquet.date, Banquet.groupName, Banquet.name, Banquet.number, Banquet.numberPeople, Banquet.paymentMethod, Banquet.phoneNumber, Banquet.time, BusDriver.dateOfBirth, BusDriver.driverLicenseNr, BusDriver.name, BusDriver.phoneNumber, Cassette.availableCopies, Cassette.title, DietaryRequirement.diet, FreeRoomTypesDTO.numBeds, FreeRoomTypesDTO.numFreeRooms, FreeRoomTypesDTO.pricePerNight, FreeRoomTypesDTO.roomTypeDescription, Individual.date, Individual.name, Individual.number, Individual.numberPeople, Individual.phoneNumber, Individual.seating, Individual.smoking, Individual.time, Manager.dateOfBirth, Manager.name, Manager.phoneNumber, MatchNote.date, MatchNote.note, Person.name, PlayerNotes.date, PlayerNotes.note, ReportedAllergy.allergen, Series.availableCopies, Series.episode, Series.title, TrainingFailedToAttend.reason, Vehicle.expirationDate, Vehicle.id, Vehicle.licensePlateNumber, Vehicle.registrationLastMaintenanceDate, Vehicle.registrationState, Vehicle.vehicleTypeCode
Relationships 6
BanquetBusDriver, FailedPlayer, MatchMatchNote, PlayerPlayerNotes, ReservationCustomer, TrainingFailded
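Model coverage and the list of uncovered items can both be derived from two sets: the names defined in the model and the names that appear in the generated instances. A minimal sketch, assuming those sets have already been extracted from the .use and .soil files; the function name and the sample class names are hypothetical.

```python
def coverage(defined, instantiated):
    """Return (coverage ratio, sorted list of uncovered names)."""
    defined = set(defined)
    used = set(instantiated) & defined  # ignore names the model does not define
    uncovered = sorted(defined - used)
    ratio = len(used) / len(defined) if defined else 1.0  # assumption: empty model counts as covered
    return ratio, uncovered


# Hypothetical names, for illustration only.
model_classes = {"Vehicle", "Truck", "Customer", "Shipment"}
soil_classes = {"Truck", "Customer", "Shipment"}

ratio, uncovered = coverage(model_classes, soil_classes)
print(f"Classes {ratio:.1%}, uncovered: {uncovered}")
```

The same function applies unchanged to attributes (as `Class.attribute` strings) and to relationships (as association names).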
Instance Instantiation
Measures the depth or density of the data. It answers: "Of the objects the LLM decided to create, how many of their available 'slots' did it fill?"
  • Classes = Total Number of classes (!new) in the instance / Total possible that could have been instantiated (infinity)
  • Attributes = Total Number of attributes (!Class.Attribute or !set) in the instance / Total possible that could have been instantiated (sum over instantiated classes of: number of objects of that class * number of attributes of that class)
  • Relationships = Total Number of relationships (!insert) in the instance / Total possible that could have been instantiated (infinity)
Classes 3369/∞
Attributes 9096/9096
Relationships 3978/∞
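The denominator of the Attributes ratio is the only bounded one, and it depends on how many objects of each class were created. A short sketch of that sum, using hypothetical class names and counts:

```python
def possible_attribute_slots(objects_per_class, attributes_per_class):
    """Sum over classes of (objects created of that class * attributes the class declares)."""
    return sum(count * attributes_per_class[cls]
               for cls, count in objects_per_class.items())


# Hypothetical instance: 3 Account objects (2 attributes each), 2 Customer objects (4 each).
objects_per_class = {"Account": 3, "Customer": 2}
attributes_per_class = {"Account": 2, "Customer": 4}

possible = possible_attribute_slots(objects_per_class, attributes_per_class)
filled = 13  # hypothetical number of attribute assignments found in the instance
print(f"Attributes {filled}/{possible}")
```

Classes and Relationships have no such bound, which is why the dashboard reports them against infinity.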
Quality
Measures the realism of the generated instances, i.e. how much the LLM respects real-world logic. Using Gemini 3.1 Pro as an LLM-as-a-judge, we ask it to rate each instance as realistic, unrealistic, or doubtful and to explain its decision.
  • Realism = Total Number of "realistic" instances / Total Number of instances
  • Judge Cost = (input tokens * input price) + (output tokens * output price)
Realism 78.0%
Judge Cost $5.85
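The Realism score counts only the "realistic" label; "doubtful" and "unrealistic" both fall into the remainder. A minimal sketch with hypothetical judge ratings:

```python
from collections import Counter


def realism(ratings):
    """Share of instances the judge rated "realistic"."""
    if not ratings:
        return 0.0  # assumption: no rated instances yields a score of 0
    return Counter(ratings)["realistic"] / len(ratings)


# Hypothetical judge output, one label per instance.
ratings = ["realistic", "realistic", "doubtful", "unrealistic"]
score = realism(ratings)
print(f"Realism {score:.1%}")
```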
CoT
Stats
Total Cost $58.80
Validation
Syntax 99.9%
Multiplicities 99.4%
Invariants 99.2%
Diversity
Numeric 98.4%
String Equals 100.0%
String LV 91.3%
GED 0.598 ± 0.115
Shannon (Active) 0.891 ± 0.108
Shannon (All) 0.868 ± 0.154
Model Coverage
Classes 86.0%
Attributes 83.7%
Relationships 91.7%
Uncovered Items 172
Classes 32
Address, Allergen, Banquet, BusDriver, Cassette, Chef, Comment, Company, Competition, Cook, DietaryRequirement, FoodItem, FreeRoomTypesDTO, GeoLocation, HeadWaiter, Individual, ItemOrder, Manager, MatchNote, MatchPlayerPosition, MenuItem, Person, PlayerNotes, RegularCustomer, ReportedAllergy, Series, Shipment, TrainingFailedToAttend, TrainingObjective, Truck, Vehicle, Waiter
Attributes 111
Address.text, Allergen.type, Banquet.busService, Banquet.date, Banquet.groupName, Banquet.name, Banquet.number, Banquet.numberPeople, Banquet.paymentMethod, Banquet.phoneNumber, Banquet.time, BusDriver.dateOfBirth, BusDriver.driverLicenseNr, BusDriver.name, BusDriver.phoneNumber, Cassette.availableCopies, Cassette.title, Chef.dateOfBirth, Chef.name, Chef.phoneNumber, Comment.text, Company.address, Company.idNumber, Company.name, Company.poorRisk, Competition.name, Competition.type, Cook.dateOfBirth, Cook.name, Cook.phoneNumber, Cook.yearsOfExperience, DietaryRequirement.diet, FoodItem.description, FoodItem.number, FoodItem.purchaseFlag, FoodItem.unit, FreeRoomTypesDTO.numBeds, FreeRoomTypesDTO.numFreeRooms, FreeRoomTypesDTO.pricePerNight, FreeRoomTypesDTO.roomTypeDescription, GeoLocation.latitude, GeoLocation.longitude, HeadWaiter.dateOfBirth, HeadWaiter.name, HeadWaiter.phoneNumber, Individual.address, Individual.date, Individual.driverLicenseExpirationDate, Individual.driverLicenseNumber, Individual.driverLicenseState, Individual.homePhone, Individual.name, Individual.number, Individual.numberPeople, Individual.phoneNumber, Individual.poorRisk, Individual.seating, Individual.smoking, Individual.time, ItemOrder.time, Manager.dateOfBirth, Manager.name, Manager.phoneNumber, MatchNote.date, MatchNote.note, MatchPlayerPosition.number, MatchPlayerPosition.positionName, MenuItem.classification, MenuItem.description, MenuItem.prepTime, Person.email, Person.name, Person.phone, Person.title, Person.website, PlayerNotes.date, PlayerNotes.note, RegularCustomer.name, RegularCustomer.prefferedLanguage, ReportedAllergy.allergen, Series.availableCopies, Series.episode, Series.title, Shipment.id, Shipment.status, TrainingFailedToAttend.reason, TrainingNotes.date, TrainingObjective.areaToImprove, TrainingObjective.endDate, TrainingObjective.startDate, TrainingObjective.success, Truck.expirationDate, Truck.gasTankCapacity, Truck.id, Truck.licensePlateNumber, Truck.mileage, Truck.odometerReading, Truck.registrationLastMaintenanceDate, Truck.registrationState, Truck.vehicleTypeCode, Truck.workingRadio, Vehicle.expirationDate, Vehicle.id, Vehicle.licensePlateNumber, Vehicle.registrationLastMaintenanceDate, Vehicle.registrationState, Vehicle.vehicleTypeCode, Waiter.dateOfBirth, Waiter.name, Waiter.phoneNumber, Waiter.spokenLanguage
Relationships 29
AddressContainsGeoLocation, BanquetBusDriver, BookingBill, ChefCook, CompetitionMatch, CustomerConsistsOfShipment, DriverShipment, ExpenseComment, FailedPlayer, FoodItemAllergen, HeadWaiterWaiter, ItemOrderMenuItem, LocalMatch, MatchMatchNote, MatchPlayerMatchPlayerPosition, MenuItemChef, MenuItemFoodItem, PlayerPlayerNotes, ReservationCustomer, ReservationItemOrdered, ReservationWaiter, ShipmentContainsDeliveryAddress, ShipmentContainsPickUpAddress, StationContainsCustomer, StationShipment, TeamTraining, TrainingFailded, TrainingObjectivePlayer, VisitorMatch
Instance Instantiation
Classes 4142/∞
Attributes 10488/10718
Relationships 4943/∞
Quality
Realism 56.3%
Judge Cost $6.93
Simple - Model Comparison
Name Syntax Mult Inv Num.Eq Str.Eq Str.LV Cov.Cls Cov.Attr Cov.Rel Inst.Cls Inst.Attr Inst.Rel Realism
Bank 100.0% 100.0% 100.0% 99.3% 100.0% 92.7% 100.0% 100.0% 100.0% 240/∞ 630/630 332/∞ 93.3%
Restaurant 100.0% 100.0% 98.3% 97.8% 99.8% 87.3% 77.0% 86.6% 97.4% 532/∞ 1756/1756 529/∞ 63.3%
AddressBook 100.0% 100.0% 100.0% 100.0% 89.0% 100.0% 100.0% 100.0% 291/∞ 896/896 320/∞ 46.7%
PickupNet 100.0% 100.0% 100.0% 99.2% 100.0% 89.2% 100.0% 100.0% 100.0% 327/∞ 594/594 447/∞ 60.0%
HotelManagement 100.0% 100.0% 100.0% 98.7% 99.8% 77.8% 85.7% 81.8% 100.0% 202/∞ 605/605 202/∞ 96.7%
Football 100.0% 100.0% 100.0% 96.2% 100.0% 87.4% 81.3% 86.5% 77.8% 754/∞ 1838/1838 784/∞ 100.0%
MyExpenses 100.0% 100.0% 100.0% 98.3% 100.0% 88.3% 100.0% 100.0% 100.0% 171/∞ 573/573 189/∞ 100.0%
VideoClub 100.0% 100.0% 100.0% 96.2% 100.0% 83.3% 80.0% 78.3% 100.0% 235/∞ 417/417 216/∞ 100.0%
VehicleRental 99.8% 100.0% 100.0% 99.6% 100.0% 86.6% 83.3% 83.3% 100.0% 210/∞ 1380/1380 300/∞ 33.3%
Statemachine 100.0% 100.0% 100.0% 87.7% 100.0% 84.5% 100.0% 100.0% 100.0% 407/∞ 407/407 659/∞ 86.7%
CoT - Model Comparison
Name Syntax Mult Inv Num.Eq Str.Eq Str.LV Cov.Cls Cov.Attr Cov.Rel Inst.Cls Inst.Attr Inst.Rel Realism
Bank 100.0% 100.0% 100.0% 98.5% 99.9% 92.1% 100.0% 100.0% 100.0% 240/∞ 636/637 391/∞ 100.0%
Restaurant 100.0% 99.6% 100.0% 96.9% 99.9% 90.5% 63.9% 62.5% 73.1% 603/∞ 1816/2030 559/∞ 60.0%
AddressBook 100.0% 100.0% 100.0% 99.9% 90.1% 98.9% 96.9% 100.0% 523/∞ 1506/1520 642/∞ 10.0%
PickupNet 100.0% 100.0% 100.0% 100.0% 100.0% 89.4% 98.3% 98.3% 93.3% 467/∞ 773/773 550/∞ 46.7%
HotelManagement 100.0% 96.5% 100.0% 97.6% 99.6% 81.2% 97.1% 96.4% 99.4% 349/∞ 1052/1052 361/∞ 73.3%
Football 99.8% 98.5% 93.8% 94.8% 99.9% 88.8% 91.5% 92.2% 90.4% 825/∞ 1933/1934 898/∞ 33.3%
MyExpenses 100.0% 100.0% 100.0% 97.8% 99.2% 88.3% 99.2% 99.7% 98.9% 255/∞ 778/778 277/∞ 53.3%
VideoClub 100.0% 100.0% 100.0% 94.8% 100.0% 83.6% 82.2% 81.7% 100.0% 281/∞ 503/503 319/∞ 86.7%
VehicleRental 98.6% 100.0% 100.0% 99.4% 100.0% 89.5% 81.7% 81.3% 100.0% 176/∞ 1068/1068 251/∞ 23.3%
Statemachine 99.9% 99.7% 100.0% 90.6% 100.0% 84.1% 100.0% 100.0% 100.0% 423/∞ 423/423 695/∞ 76.7%