Organize Your Data
- Updated On 17 Feb 2025
Efficient data organization is critical for streamlining dataset management, enhancing searchability, and improving collaboration. Dataloop’s Data Browser provides a structured way to manage and access data using Folders, Collections, Metadata, and ML Subsets.
Folders
The Dataset Browser displays dataset items in the Folders view by default. You can organize items using either the Items based or the Folders based view. When you apply a filter, it is applied within the scope you choose: the entire dataset or a specific folder.

When you click the folder icon, a tabbed view appears, presenting file items in a structured, file-system-like format:
- Items based: The default view. It shows all items regardless of their folder structure, so filters can be applied across every item under the Root Folder (Dataloop). When you select a folder, it shows items only and does not show sub-folders, even if the folder contains them.
- Folders based: Shows items according to the folder or sub-folder you select. When you select the Root Folder (Dataloop), it shows the items and folders in the root folder, if any.

You can perform the following actions:
- Click Folders based to view items and sub-folders in a folder.
- Create folders: Select the root folder and click the plus icon, or hover over Root Folder (Dataloop) and click the New Folder icon.
- Create sub-folders: Select the folder and click the New Folder icon, or hover over the selected folder and click the New Folder icon.

Move items between folders:
- Select one or more items from the current page.
- Right-click and select File Actions > Move to Folder.
- Select the folder from the list.
- Click Move.
To modify a folder name, hover over the folder and click the folder edit icon.

- Select a folder and right-click to:
- Rename: Rename the selected folder.
- Move: Move the items in the selected folder to another folder.
- Copy item path: Copy the complete item path.
- Create Trigger: Create a trigger function for the items in the selected folder.
- Delete: Delete the selected folder and the items in it.
Collections
The Collections feature in Dataloop's Data Browser helps organize data by allowing you to group specific sets of items based on task needs (e.g., annotation, review, training) into a collection folder. You can create up to 10 collection folders.
Key Features of Collections in Data Browser
- Selective Grouping: Choose specific items from a dataset to move into a collection based on criteria like image type, labeling status, or annotation requirements.
- Easy Access:
- Collections provide a convenient way to quickly access and manage the items relevant to a particular task, without sifting through the entire dataset.
- Collections allow you to identify the list of Unassigned Items (items that are not yet part of any collection).
- Enhanced Collaboration: Collections can be shared with team members, allowing specific data subsets to be easily shared and collaboratively worked on without impacting the primary dataset.
- Task-Specific Organization: Create collections based on different stages of your workflow, like "Pending Annotation," "Quality Review," or "Model Training Set," which helps keep the data organized according to your project’s progress.
- Filter Collection Items Using Smart Search: Use the Items field in the smart search to apply a collection filter query to find specific items within a collection. For instance, the query
metadata.system.collections.c0 = true
filters items that are part of the first collection (c0 is the ID of the first collection folder created, and c9 is the ID of the last collection folder created).
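To illustrate what the collection filter query matches, here is a minimal sketch in plain Python. It does not call the Dataloop SDK: the items are hypothetical dictionaries whose nesting simply mirrors the metadata.system.collections field path from the query, and the helper names in_collection and is_unassigned are made up for this example.

```python
# Hypothetical in-memory items. The nesting mirrors the field path
# "metadata.system.collections.<id>" used in the smart search query.
items = [
    {"name": "a.jpg", "metadata": {"system": {"collections": {"c0": True}}}},
    {"name": "b.jpg", "metadata": {"system": {"collections": {"c0": True, "c3": True}}}},
    {"name": "c.jpg", "metadata": {"system": {}}},
]


def _collections(item: dict) -> dict:
    """Return the collection flags of an item, or an empty dict if none exist."""
    return item.get("metadata", {}).get("system", {}).get("collections", {})


def in_collection(item: dict, collection_id: str) -> bool:
    """Equivalent of the query: metadata.system.collections.<collection_id> = true."""
    return _collections(item).get(collection_id) is True


def is_unassigned(item: dict) -> bool:
    """An item is unassigned when no collection flag is set to True."""
    return not any(flag is True for flag in _collections(item).values())


first_collection = [i["name"] for i in items if in_collection(i, "c0")]
unassigned = [i["name"] for i in items if is_unassigned(i)]

print(first_collection)
print(unassigned)
```

The same membership test works for any ID from c0 to c9; only the last segment of the field path changes.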
Access Collections
To access the Collections in the Data Browser:
- Open the Data Browser: Log in to your Dataloop account and navigate to the Data Browser.

- Click on the Collections Icon: In the left-side panel, click the Collections icon, situated below the Folder icon. A tabbed view appears and displays all your existing collections, where you can create new collections or manage existing ones.

Create Collections
Collections can be customized to match the requirements of your specific task, such as grouping items by type, project phase, or other relevant attributes.
Limitations:
- You can create up to 10 collection folders.
- Each item can be tagged in a maximum of 10 collections at once.
To create a collection:
- Open the Data Browser.
- In the left-side panel, click on the Collections icon located below the Folder icon.

- Click on Create a Collection.
- Type your desired collection's name, and press the Enter key. The new collection will now be created and displayed in Collections.
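The limit of 10 collection folders and the c0 to c9 IDs described earlier can be pictured with a small in-memory model. This is only a sketch under stated assumptions: CollectionRegistry is a hypothetical class, not part of the Dataloop SDK, and it assumes IDs are handed out in creation order as the smart search description suggests. Deleting collections is not modelled.

```python
MAX_COLLECTIONS = 10  # limit stated in this section


class CollectionRegistry:
    """Hypothetical model of a dataset's collection folders (not the SDK)."""

    def __init__(self) -> None:
        self._names_by_id: dict[str, str] = {}

    def create(self, name: str) -> str:
        """Register a collection and return its ID (c0 for the first, c9 for the tenth)."""
        if len(self._names_by_id) >= MAX_COLLECTIONS:
            raise ValueError(f"a dataset can hold at most {MAX_COLLECTIONS} collections")
        collection_id = f"c{len(self._names_by_id)}"
        self._names_by_id[collection_id] = name
        return collection_id

    def query_for(self, name: str) -> str:
        """Build the smart search query that filters items of the named collection."""
        for collection_id, existing in self._names_by_id.items():
            if existing == name:
                return f"metadata.system.collections.{collection_id} = true"
        raise KeyError(name)


registry = CollectionRegistry()
ids = [registry.create(f"collection-{n}") for n in range(MAX_COLLECTIONS)]
print(ids[0], ids[-1])
print(registry.query_for("collection-0"))
```

Creating an eleventh collection in this model raises ValueError, which mirrors the limit above.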
Add Items to a Collection
You can add items to a collection by selecting them in your dataset and assigning them to the designated collection.
- Open the Data Browser.
- Select the items you want to add to a collection.
- Right-click on the selected items.
- Select Collections and choose your desired collection. The selected items will now be added to the chosen collection.
Find Collections Using Smart Search
- Open the Data Browser.
- Click on the Items search field.
- Enter the query, for example
metadata.system.collections.c0 = true
where c0 is the collection ID. The available collections are listed in a dropdown.
Clone a Collection
- Open the Data Browser.
- In the left-side panel, click on the Collections icon located below the Folder icon.
- Hover over the collection you want to clone.
- Click on the three dots and select Clone from the list.
- Click to confirm the cloning process. The cloned collection is created and named original_name-clone-1.
Rename a Collection
- Open the Data Browser.
- In the left-side panel, click on the Collections icon located below the Folder icon.
- Hover over the collection you want to rename.
- Click on the three dots and select Rename from the list.
- Make the changes and press the Enter key.
Remove Items from a Collection
- Open the Data Browser.
- In the left-side panel, click on the Collections icon located below the Folder icon.
- Click on the collection containing the items you want to remove.
- Select the items, then right-click on them.
- Select Collections -> Remove From Collections option from the list.
- Select the specific collection from which you want to remove the items (if they belong to multiple collections).
- Click to confirm the removal. A success message is displayed.
Remove Collections from Items
- Open the Data Browser.
- Select Item(s) from the browser.
- Right-click and select Collections -> Remove from Collections.
- Select the Collection(s) that are to be removed.
- Click to confirm the removal. A confirmation message is displayed.
Delete a Collection
- Open the Data Browser.
- In the left-side panel, click on the Collections icon located below the Folder icon.
- Hover over the collection you want to delete.
- Click on the three dots and select Delete from the list.
- Click to confirm the deletion process.
SDK Code to Manage Collections
Leverage the Dataloop SDK to create, update, delete, and manage collections at both the dataset and item levels.
Metadata
Item metadata consists of descriptive information and attributes associated with individual items in a dataset. Dataloop enables users to organize and categorize data items using metadata tags, making it easier to filter and analyze datasets efficiently.
With metadata-based filtering, users can define Filter Queries to refine searches based on specific attributes, such as field names or annotation labels. For example, a query can be created to filter items based on a particular metadata field or assigned annotations.
You can update item metadata using either the UI or the SDK, which allows efficient querying and retrieval of relevant data when needed.
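To make metadata-based filtering concrete, the sketch below matches items against several field conditions at once. It is plain Python over hypothetical item dictionaries, not the Dataloop SDK or its query syntax; the field names metadata.user.weather and metadata.user.reviewed and the helpers get_field and filter_items are assumptions made up for this illustration.

```python
def get_field(item: dict, dotted_path: str):
    """Resolve a dotted field path such as 'metadata.user.weather'; None if missing."""
    value = item
    for part in dotted_path.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value


def filter_items(items: list[dict], conditions: dict) -> list[dict]:
    """Keep the items for which every field equals its required value (logical AND)."""
    return [
        item
        for item in items
        if all(get_field(item, field) == wanted for field, wanted in conditions.items())
    ]


# Hypothetical items with user-defined metadata fields.
items = [
    {"name": "a.jpg", "metadata": {"user": {"weather": "rainy", "reviewed": True}}},
    {"name": "b.jpg", "metadata": {"user": {"weather": "rainy", "reviewed": False}}},
    {"name": "c.jpg", "metadata": {}},
]

matches = filter_items(
    items,
    {"metadata.user.weather": "rainy", "metadata.user.reviewed": True},
)
print([item["name"] for item in matches])
```

Each added condition narrows the result, which is the same effect as combining several fields in one filter query.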

Add Custom Metadata
Adding custom metadata involves attaching additional information or tags to various types of data items. Custom metadata can be user-defined and is not limited to the predefined categories or attributes provided by the Dataloop platform.
To attach metadata to any entity, such as Datasets, you can utilize the SDK's 'Update' function. To learn how to upload items with metadata, read here.
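# Example (assumes `dataset` is a Dataset entity already retrieved with the SDK)
dataset.metadata["MyBoolean"] = True
dataset.metadata["Mycontext"] = "Blue"  # string values must be quoted
dataset.update()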
Display Custom Metadata
The datasets page provides a list of all the datasets present within the project. The table contains default columns, including dataset name, the count of items, the percentage of annotated items, and additional information.
To include and display columns with your custom context (metadata fields):
- From the Project Overview, click on Settings.
- Select Configuration.
- Select the Dataset Columns from the left-side menu.
- Click .
- Click .
- Enter the required information as follows.
- Name: A general name for this column (not visible outside the project settings).
- Label: The column header displayed on the Datasets page.
- Field: The Metadata field to map to this column.
- Configure the desired feature settings as needed:
- Link: If the field value is a URL and should open in a new tab, select this option.
- Resizable: Check this option if the column needs to be resizable, useful for displaying long values.
- Sortable: Enable this option to allow sorting the table by clicking the column header.
- Click . A success message is displayed.
After completing the above steps, the Datasets table on the Datasets page will display the custom column and the data you've populated there.
- To ensure that any new data added via SDK is reflected, refresh the page.
- You can use the search box to search for datasets that match your search term, provided that the search term is included in any of the custom columns you've added to the table. This allows you to filter datasets based on the custom metadata you've defined.
ML Subsets
The ML Subsets View in Dataloop's Data Browser is a dedicated feature designed to enhance machine learning workflows by organizing and managing your dataset effectively. It allows you to classify and filter dataset items based on their ML Subset assignments, such as train, validation, and test, which are commonly used in the ML lifecycle for model development and evaluation.
Filtering by ML Subset Assignment:
Easily locate items in your dataset based on their subset classification:
- Unassigned items: Items that have not been allocated to any specific subset.
- Train: Items designated for training the ML model.
- Validation: Items used to tune hyperparameters and evaluate the model during training.
- Test: Items reserved for final evaluation of the trained model's performance.

Use Cases:
- Training Pipeline: Quickly access the train subset to prepare data for model training.
- Model Validation: Focus on the validation subset to monitor the model’s performance during tuning.
- Final Testing: Access the test subset to evaluate the final accuracy, precision, or other metrics.
Select any item from the folders and right-click to perform the available actions.
ML Data Split Chart
The ML Data Split chart displays the distribution of items across the machine learning subsets (Test, Train, and Validation), showing both the total numbers and the percentages.
- The chart is generated once an item is assigned to one of the ML subsets.
- Hovering over a subset highlights it in the chart.
- Clicking a subset grays it out and removes it from the chart.
- Clicking it again restores it to the chart.
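The totals and percentages shown in the chart can be reproduced with a few lines of Python. This is a sketch, not platform code: the assignments mapping is hypothetical, and it assumes that unassigned items (None here) are left out of the percentages.

```python
from collections import Counter


def subset_distribution(assignments: dict) -> dict:
    """Return {subset: (count, percentage)} for items assigned to an ML subset.

    Items mapped to None are treated as unassigned and excluded (an assumption).
    """
    counts = Counter(subset for subset in assignments.values() if subset is not None)
    total = sum(counts.values())
    return {
        subset: (count, round(100 * count / total, 1))
        for subset, count in counts.items()
    }


# Hypothetical item-to-subset assignments.
assignments = {
    "a.jpg": "train",
    "b.jpg": "train",
    "c.jpg": "train",
    "d.jpg": "test",
    "e.jpg": None,
}

distribution = subset_distribution(assignments)
print(distribution)
```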

Why Use the ML Subsets View?
- Simplifies dataset management by clearly segregating data for training, validation, and testing, reducing errors in ML workflows.
- Enhances collaboration within teams by providing a consistent structure for dataset organization.
- Saves time by offering intuitive filtering and search options for specific subsets.
View Items by ML Subsets
- Go to the Data Browser.
- Select the Model icon from the left-side panel. The available items are displayed grouped by folder.

If there are no items added to ML Subsets, click Split Into Subsets.
Split Items Into Subsets
The Split Into Subsets feature divides a dataset into multiple subsets, such as train, validation, and test, according to a specified distribution. Splitting helps ensure that the dataset is well prepared for machine learning or data analysis tasks.
By default, the items are divided as follows; you can customize this distribution:
- Train set: 80% of the data, which is used to train the machine learning model.
- Validation set: 10% of the data, which is used during training to fine-tune model hyperparameters and prevent overfitting.
- Test set: 10% of the data, which is used to evaluate the final model performance after training.
To split items into subsets:
- In the Dataset Browser, select one or more items.
- Click .
- Select Models -> Split Into Subsets. The ML Data Split pop-up is displayed.
- Customize the distribution by moving the slider. By default, the items are divided as mentioned above.
- Click . A confirmation message is displayed, and the selected items are divided into respective subsets.
- Click on the ML Data Split section in the right-side panel to view the items' distribution.
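The default 80/10/10 distribution can be sketched in plain Python as follows. This is an illustration only, not how the platform performs the split: the function name split_into_subsets, the seeded shuffle, and the choice to give any rounding remainder to the test subset are all assumptions.

```python
import random


def split_into_subsets(item_ids, train=80, validation=10, test=10, seed=0):
    """Divide item IDs into train/validation/test lists by percentage.

    The percentages must add up to 100. Items are shuffled with a fixed seed so
    the result is reproducible; any rounding remainder lands in the test subset.
    """
    if train + validation + test != 100:
        raise ValueError("percentages must add up to 100")
    shuffled = list(item_ids)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * train // 100
    n_validation = len(shuffled) * validation // 100
    return {
        "train": shuffled[:n_train],
        "validation": shuffled[n_train:n_train + n_validation],
        "test": shuffled[n_train + n_validation:],
    }


# Hypothetical item IDs.
item_ids = [f"item-{n}" for n in range(50)]
subsets = split_into_subsets(item_ids)
print({name: len(ids) for name, ids in subsets.items()})
```

Passing different percentages, for example train=70, validation=20, test=10, plays the role of moving the slider in the ML Data Split pop-up.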
SDK Code to Manage ML Subsets
This SDK code demonstrates how to filter dataset items, split them into ML subsets, assign specific items to a subset, remove an item from a subset, and retrieve items missing an ML subset in Dataloop.