Studio Overview
- Updated On 30 Jul 2025
GenAI Evaluation Studio is a powerful, user-friendly tool for evaluating GenAI responses and providing feedback. It allows users to design, run, and analyze model evaluations using fully customizable forms and layouts, all directly integrated with Dataloop's platform.
Key Capabilities
This studio allows you to evaluate GenAI output, structured as a .json item, within a custom-designed layout created with the Multimodal Layout Builder. These layouts represent structured evaluation forms that can include:
- Text inputs
- Dropdowns, radio buttons, and checkboxes
- Star ratings and sliders
- Multimodal content such as conversations, URLs, images, audio, and video
- Custom validations, JavaScript logic, and CSS styling
How It Works
1. Create the Layout Recipe:
The evaluation interface reads the layout defined in your recipe, typically generated using the Multimodal Layout Builder.
2. Interact with the Form:
Users can:
- Input text
- Select options
- Assign ratings
- Trigger dynamic logic (e.g., show/hide fields)
- View media content embedded in the layout
3. Capture Annotations:
Every interaction within the Evaluation Studio (e.g., choosing a score, writing a comment) is logged as an annotation, maintaining a detailed record for each evaluated item.
4. Ensure Quality:
Built-in and custom validation rules (via JavaScript) help prevent invalid submissions. Annotators can complete evaluations only when all required inputs are valid and complete.
Any modifications made within the Evaluation Studio are treated as annotations. This includes actions such as providing a rating, selecting options, entering comments, or any other form of input captured during the evaluation process.
Use Cases
- Model evaluation & benchmarking
- Human-in-the-loop (HITL) reviews
- Multi-modal model assessment
- Data quality scoring
- Feedback collection for fine-tuning
Use GenAI Evaluation Studio
The GenAI Evaluation Studio is a dedicated interface within Dataloop’s platform designed for reviewing, scoring, and validating outputs from GenAI models using structured forms and logic defined in a Multimodal Recipe.
Before You Start
To start using this studio, you must complete the following steps:
Assign the Recipe
The Dataloop platform offers flexible integration modes for applying layout recipes in evaluation workflows. You can either apply a single unified layout across all items or assign custom layouts per item, depending on your use case.
1. Unified Layout for All Items (Default Approach)
In this mode, all items in the task share the same layout recipe.
Use Case: Ideal for structured evaluations where every item follows a common annotation format.
How to Apply:
Link the task to the desired Multimodal Recipe from the task creation interface.
Optionally, also connect the dataset to the same recipe for clarity and consistency.
Recommendation: Use this as your go-to method for most evaluation scenarios.
2. Per-Item Layout Assignment (Advanced / Flexible Mode)
In this approach, different items within the same task or dataset can have distinct layout recipes.
Use Case: Suitable for mixed evaluation scenarios, such as comparing chats, images, and structured responses within the same dataset.
How to Apply:
Upload items programmatically using the SDK.
Assign a specific layout recipe to each item using system metadata tags (see below).
Important Note: This overrides the layout connected to the task or dataset.
Reference: A full example script can be found under Layout Editor → Sample Data → the </> icon (Python script generator).
'system': {
'shebang': {
'dltype': 'evaluation-studio'
},
'evaluation': {
'layoutName': '{INSERT RELEVANT LAYOUT RECIPE ID HERE}'
}
}
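As a minimal sketch of the per-item approach, the helper below builds the system-metadata block shown above and attaches it at upload time. The function name and the upload call are illustrative assumptions, not the official generated script; adapt them from the script the Layout Editor's </> generator produces for your project.

```python
# Sketch only: builds the system metadata that routes a single item to a
# specific layout recipe, overriding the task/dataset-level layout.

def build_evaluation_metadata(layout_recipe_id: str) -> dict:
    """Return system metadata tagging an item for the Evaluation Studio.

    `layout_recipe_id` is the ID of the layout recipe this item should use.
    """
    return {
        'system': {
            'shebang': {
                'dltype': 'evaluation-studio'
            },
            'evaluation': {
                'layoutName': layout_recipe_id
            }
        }
    }

# Hypothetical usage with the Dataloop SDK (requires `pip install dtlpy`
# and an authenticated session); placeholders are yours to fill in:
# import dtlpy as dl
# dataset = dl.datasets.get(dataset_id='<your-dataset-id>')
# dataset.items.upload(
#     local_path='conversation.json',
#     item_metadata=build_evaluation_metadata('<layout-recipe-id>'),
# )
```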
Open the JSON file
1. Open the Dataset Browser and select the JSON file.
2. Double-click the JSON file. The GenAI Multimodal Studio is displayed.
3. Complete the required evaluation steps. The form structure varies according to the GenAI recipe.
4. Click Save.
JSON Formats
You can export items, with or without their annotations, as JSON files.
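As a rough sketch of post-processing an export, the snippet below counts annotations by type in one exported item's JSON. The schema it assumes (a top-level `annotations` array whose entries carry a `type` field) is an assumption based on common Dataloop exports, and the sample data is hypothetical; check your own export for the exact structure.

```python
import json
from collections import Counter

def count_annotation_types(exported_json: str) -> Counter:
    """Count annotations by type in one exported item's JSON string.

    Assumes the export has a top-level 'annotations' list whose entries
    include a 'type' field; verify against your actual export schema.
    """
    data = json.loads(exported_json)
    return Counter(ann.get('type', 'unknown') for ann in data.get('annotations', []))

# Hypothetical sample mimicking an exported evaluated item:
sample = json.dumps({
    'filename': '/conversation.json',
    'annotations': [
        {'type': 'text', 'label': 'comment'},
        {'type': 'text', 'label': 'comment'},
        {'type': 'rating', 'label': 'score'},
    ],
})
print(count_annotation_types(sample))  # Counter({'text': 2, 'rating': 1})
```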