Schema Based Search
  • 09 May 2024

Overview

Filters are integral components of the Dataset Browser, providing users with the capability to refine and narrow down the displayed items based on specific criteria. These filters offer a powerful tool for managing and exploring large datasets efficiently.


Filtering Criteria

The Data Browser lets you search for items by their data. The following sections describe how to search and filter the items in your dataset.

You can filter by specifying criteria on the item's own data, on its annotations, and on the tasks associated with it.

  • Item Data: By default, the Items filter is enabled. It includes details like creation date, creator, or any other relevant information associated with the item. The Items filter is permanent for all the search queries.
  • Annotation Data (Optional): Annotation data is information related to any annotations applied to the dataset items. Annotations could include labels, classifications, or any additional data that has been added to enhance the understanding or categorization of items within the dataset. You can deselect the Annotation filter, if required.
  • Tasks (Optional): You can filter the items in the dataset by using the Task's ID and Name. If necessary, you can remove the Tasks filter. When you activate the tasks filter, it will turn off the Folders based view option in the left-side panel.

Also, click on Add Filters to access additional search filter applications.


Search Query Variables

| Filter Data Type | Filter Variable | Description | Conditions | Data Types |
|---|---|---|---|---|
| Items | annotated | Filter items based on whether they are annotated or not. | =, != | Boolean values (true or false) |
| Items | annotationsCount | Filter items based on the number of annotations. | =, !=, >, >=, <, <=, IN, NOT-IN, EXIST, DOESNT-EXIST | String |
| Items | createdAt | Filter items based on the date and time of their creation. | =, !=, >=, <=, <, > | dd/mm/yyyy |
| Items | creator | Filter items based on the creator of the item. | =, !=, IN, NOT-IN, EXIST, DOESNT-EXIST | String |
| Items | datasetId | Filter items based on the dataset ID. | =, !=, IN, NOT-IN, EXIST, DOESNT-EXIST | String |
| Items | described | Filter items based on the presence or absence of a description. | =, != | Boolean values (true or false) |
| Items | description | Filter items by searching for those that contain a specific part of the description text. | =, !=, IN, NOT-IN, EXIST, DOESNT-EXIST | String |
| Items | dir | Filter items based on their folder location within the dataset. | =, !=, IN, NOT-IN | String |
| Items | FilePath | Filter items by the file location. | =, !=, IN, NOT-IN | String |
| Items | hidden | Filter out items that are marked as hidden. | =, !=, IN, NOT-IN | Boolean values (true or false) |
| Items | ItemHeight | Filter items based on the height value of each item. | =, !=, IN, NOT-IN | String |
| Items | ItemID | The unique ID of the item. | =, !=, IN, NOT-IN | String |
| Items | ItemWidth | Filter items according to the width value of each item. | =, !=, IN, NOT-IN | String |
| Items | MediaType | Filter items based on their media types, such as video or image. | =, !=, IN, NOT-IN | String |
| Items | metadata | Filter items based on metadata information (for example, metadata.system and metadata.description) contained in the items' JSON file. | =, !=, >=, <=, <, >, IN, NOT-IN | String |
| Items | ModelTestSet | Filter items that are designated for testing the model. | =, != | Boolean values (true or false) |
| Items | ModelTrainSet | Filter items that are designated for training the model. | =, != | Boolean values (true or false) |
| Items | ModelValidationSet | Filter items that are designated for validating the model. | =, != | Boolean values (true or false) |
| Items | name | Filter items based on their name. | =, !=, IN, NOT-IN | String |
| Items | type | Filter items based on their types, such as a file or folder. | =, !=, IN, NOT-IN | String |
| Items | updatedAt | Filter items based on the item's last updated date. | =, !=, >, >=, <, <=, EXIST, DOESNT-EXIST | String |
| Items | updatedBy | Filter items based on the email ID of the user who last updated the item. | =, !=, IN, NOT-IN, EXIST, DOESNT-EXIST | String |
| Annotations | Confidence | Filter annotations based on their confidence level. Confidence is the measure of certainty or accuracy in the labels assigned to data, typically expressed as a percentage. Higher confidence scores indicate greater reliability of the annotations. | =, !=, >, >=, <, <=, IN, NOT-IN, EXIST, DOESNT-EXIST | String |
| Annotations | AnnotationId | Filter annotations based on the annotation ID. | =, !=, IN, NOT-IN, EXIST, DOESNT-EXIST | String |
| Annotations | createdAt | Filter annotations based on the date and time of their creation. | =, !=, >=, <=, <, > | dd/mm/yyyy |
| Annotations | creator | Filter annotations by the email ID of the user who created the annotation. | =, !=, IN, NOT-IN | String |
| Annotations | datasetId | Filter annotations based on the dataset ID. | =, !=, IN, NOT-IN, EXIST, DOESNT-EXIST | String |
| Annotations | id | Filter annotations based on the annotation ID. | =, !=, IN, NOT-IN, EXIST, DOESNT-EXIST | String |
| Annotations | itemId | Filter annotations based on the item ID. | =, !=, IN, NOT-IN | String |
| Annotations | label | Filter annotations based on the labels. | =, !=, IN, NOT-IN | String |
| Annotations | metadata | Filter items based on annotation metadata information, such as metadata.system.attributes (the annotation's attributes data) and metadata.system.status (the annotation's status). For more information, see the annotation metadata. | =, !=, IN, NOT-IN | String |
| Annotations | modelName | Filter annotations based on the model names. | =, !=, IN, NOT-IN, EXIST, DOESNT-EXIST | String |
| Annotations | parentId | Filter the data using the ID of the parent annotation. | =, !=, IN, NOT-IN | String |
| Annotations | source | Filter the data by where the annotation was created: UI or SDK. | =, !=, IN, NOT-IN | String |
| Annotations | type | Filter the data using the types of annotations. | =, !=, IN, NOT-IN | String |
| Annotations | updatedBy | Filter annotations based on the email ID of the user who last updated the annotation. | =, !=, IN, NOT-IN, EXIST, DOESNT-EXIST | String |
| Tasks | TaskID | Filter the data using the task's ID. | =, !=, IN, NOT-IN, EXIST, DOESNT-EXIST | String |
| Tasks | TaskName | Filter the data using the task's name. | =, !=, IN, NOT-IN, EXIST, DOESNT-EXIST | String |
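
To make the conditions listed in the table more concrete, the following Python sketch shows one way each condition could be evaluated against a single item record. It is only an illustration and not Dataloop's implementation: the matches function, the OPERATORS mapping, and the sample item are hypothetical, and the handling of missing fields is an assumption.

```python
from typing import Any, Callable, Dict

# Hypothetical mapping from the table's condition names to Python checks.
# Each check receives the item's value for the variable and the query value.
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "=": lambda field, value: field == value,
    "!=": lambda field, value: field != value,
    ">": lambda field, value: field > value,
    ">=": lambda field, value: field >= value,
    "<": lambda field, value: field < value,
    "<=": lambda field, value: field <= value,
    "IN": lambda field, values: field in values,
    "NOT-IN": lambda field, values: field not in values,
}


def matches(item: Dict[str, Any], variable: str, condition: str, value: Any = None) -> bool:
    """Return True if the item satisfies one `variable condition value` clause."""
    if condition == "EXIST":
        return variable in item
    if condition == "DOESNT-EXIST":
        return variable not in item
    if variable not in item:
        # Assumption: a missing field never satisfies a comparison.
        return False
    return OPERATORS[condition](item[variable], value)


# Hypothetical item record used only for this illustration.
item = {"name": "cat.jpg", "annotated": True, "annotationsCount": 5}

print(matches(item, "annotated", "=", True))                 # True
print(matches(item, "annotationsCount", ">=", 3))            # True
print(matches(item, "name", "IN", ["dog.jpg", "bird.jpg"]))  # False
print(matches(item, "description", "DOESNT-EXIST"))          # True
```

EXIST and DOESNT-EXIST take no value, which is why the value parameter is optional in this sketch.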

Search Items by Item's Data

  1. In the Data Browser, click on the Items field.
  2. Select or enter the required search query.
  3. Click Search to view the search result.

How to Search Items by Annotation Status (Annotated or Not)?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
annotated = true 

or

annotated = false 
  3. Click Search to view the search result.

How to Search Items by Annotation's Count?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
annotationsCount = 5
  3. Click Search to view the search result.

How to Search Items by Creation Date?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
createdAt = (03/12/2023)
  3. Click Search to view the search result.
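
Because the date format is dd/mm/yyyy, the day comes first: (03/12/2023) means 3 December 2023, not 12 March 2023. The short Python sketch below shows that reading. The parse_query_date helper is hypothetical and is not part of the Dataloop SDK.

```python
from datetime import datetime


def parse_query_date(text: str) -> datetime:
    """Parse a dd/mm/yyyy date, with or without the surrounding parentheses."""
    return datetime.strptime(text.strip("() "), "%d/%m/%Y")


created_at = parse_query_date("(03/12/2023)")
print(created_at.day, created_at.month, created_at.year)  # 3 12 2023
```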

How to Search Items by Creator?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
creator = 'your@email-ID'
  3. Click Search to view the search result.

How to Search Items by File Name?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
name = '63dbdd10be95cbe35df0a78b.jpg' 
  3. Click Search to view the search result.

How to Search Items by Dataset ID?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
datasetId = '65950d1d5c356a5e51f6727e'
  3. Click Search to view the search result.

How to Search Items by Whether a Description Is Available?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
described = true 

or

described = false  
  3. Click Search to view the search result.

How to Search Items by Description's Text?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
description = 'description-text'
Whole description

Make sure to enter the full description to get a result.

  3. Click Search to view the search result.

How to Search Items by Folder Directory?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
dir = '/sub-folder name' 

or

dir IN '/sub-folder name/sub-folder name'  
  3. Click Search to view the search result.

How to Search Items by File Path?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
FilePath = '/folder name/sub-folder name/fileName.jpg' 
  3. Click Search to view the search result.

How to Search Hidden or Not Hidden Items in a Dataset?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
hidden = true

or

hidden = false
  3. Click Search to view the search result.

How to Search Items by Height?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
ItemHeight = 234

Use the height value (for example, "height": 234) from the item metadata.

  3. Click Search to view the search result.

How to Search Items by Item ID?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
ItemID = '65798f8f81b02fbafe34fcad'
  3. Click Search to view the search result.

How to Search Discarded Items (Item's Status) in the Dataset?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
ItemStatus = 'Discard' 
  3. Click Search to view the search result.

How to Search Completed Items (Item's Status) in the Dataset?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
ItemStatus = 'Complete' 
  3. Click Search to view the search result.

How to Search Approved Items (Item's Status) in the Dataset?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
ItemStatus = 'Approve' 
  3. Click Search to view the search result.

How to Search Items by Width?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
ItemWidth = 234

Use the width value (for example, "width": 234) from the item metadata.

  3. Click Search to view the search result.

How to Search JSON (Media Type) Files in the Dataset?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
MediaType = 'application/json' 
  3. Click Search to view the search result. It displays all the JSON files.

How to Search PCD (Media Type) Files in the Dataset?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
MediaType = 'application/pcd' 
  3. Click Search to view the search result. It displays all the PCD files.

How to Search Audio Files in the Dataset?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
MediaType = 'audio/*'
  3. Click Search to view the search result. It displays all the audio files.

How to Search Image Files in the Dataset?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
MediaType = 'image/*'
  3. Click Search to view the search result. It displays all the image files.

How to Search Text Files in the Dataset?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
MediaType = 'text/*'
  3. Click Search to view the search result. It displays all the text files.

How to Search Video Files in the Dataset?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
MediaType = 'video/*'
  3. Click Search to view the search result. It displays all the video files.
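
The MediaType queries above use a trailing wildcard such as 'image/*' to cover every subtype of a media type. The Python sketch below illustrates that idea with glob-style matching from the standard library. It assumes the wildcard behaves like a glob pattern, which may not reflect how the platform evaluates it, and the media_type_matches helper and sample list are hypothetical.

```python
from fnmatch import fnmatchcase


def media_type_matches(mimetype: str, pattern: str) -> bool:
    """Return True if the mimetype matches a glob-style pattern such as 'image/*'."""
    return fnmatchcase(mimetype, pattern)


# Hypothetical mimetypes, as they might appear in item metadata.
mimetypes = ["image/jpeg", "image/png", "video/mp4", "application/json"]

images = [m for m in mimetypes if media_type_matches(m, "image/*")]
print(images)  # ['image/jpeg', 'image/png']
```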

How to Search Video Files by Frames Per Second (Metadata - fps) in the Dataset?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:

Use the fps value (for example, "fps": 25) from the item metadata.

  3. Click Search to view the search result.

How to Search Video Files by Start Time (Metadata) in the Dataset?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:

Use the startTime value (for example, "startTime": 5) from the item metadata.

  3. Click Search to view the search result.

How to Search Items by System Metadata's Mimetype in the Dataset?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
metadata.system.mimetype = 'image/jpeg' 

Use the mimetype value (for example, "mimetype": "image/jpeg", "mimetype": "text/html", or "mimetype": "audio/mp3" ) from the item metadata.

  3. Click Search to view the search result.

How to Search Items by Annotation Status from the System Metadata?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
metadata.system.annotationStatus = 'completed' 

Use the annotationStatus value (for example, "completed") from the item metadata.

  3. Click Search to view the search result.

How to Search Items by File Size from the Item Metadata?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
metadata.system.size = 855

Use the size value (for example, "size": 855) from the item metadata.

  3. Click Search to view the search result.

How to Search Items by Frames Per Second (fps) from the Item Metadata?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
metadata.system.fps = 15

Use the fps value (for example, "fps": 15) from the item metadata system.

  3. Click Search to view the search result.

How to Search Items by the Duration from the Item Metadata?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
metadata.system.duration = 30.8

Use the duration value (for example, "duration": 30.8) from the item metadata system.

  3. Click Search to view the search result.

How to Search Items by the Width from the Item Metadata?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
metadata.system.width = 320

Use the Width value (for example, "width": 320) from the item metadata system.

  3. Click Search to view the search result.

How to Search Items by the Height from the Item Metadata?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
metadata.system.height = 240

Use the Height value (for example, "height": 240) from the item metadata system.

  3. Click Search to view the search result.

How to Search Items by the Dataset Tags (Train, Validation, or Test) from the Item Metadata?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
metadata.system.tags.train = true 

Use the tags value from the item metadata system. For example:

    "tags":
        "train": true

Or

    "tags":
        "validation": true

Or

    "tags":
        "test": true

  3. Click Search to view the search result.
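
Metadata queries such as metadata.system.tags.train use a dotted path, where each segment names one nesting level in the item's JSON. The Python sketch below shows how such a path can be resolved against a nested dictionary. The get_by_path helper and the sample item are hypothetical and only mirror the structure described above.

```python
from typing import Any, Dict


def get_by_path(record: Dict[str, Any], path: str, default: Any = None) -> Any:
    """Follow a dotted path through nested dictionaries, one key per segment."""
    current: Any = record
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Hypothetical item JSON fragment.
item = {
    "metadata": {
        "system": {
            "mimetype": "image/jpeg",
            "tags": {"train": True},
        }
    }
}

print(get_by_path(item, "metadata.system.tags.train"))       # True
print(get_by_path(item, "metadata.system.mimetype"))         # image/jpeg
print(get_by_path(item, "metadata.system.tags.validation"))  # None
```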

How to Search Items by the Reference ID from the Item Metadata?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
metadata.system.refs.id = '649c421b9084a344b862289b'

Use the refs ID value (for example, "id": "649c421b9084a344b862289b") from the item metadata system.

  3. Click Search to view the search result.

How to Search Items by the Reference Type from the Item Metadata?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
metadata.system.refs.type = 'assignment' 

Use the Type value (for example, "type": "assignment") from the item metadata system.

  3. Click Search to view the search result.

How to Search Items by the Encoding from the Item Metadata?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
metadata.system.encoding = '7bit'

Use the Encoding value (for example, "encoding": "7bit") from the item metadata system.

  3. Click Search to view the search result.

How to Search Items by the Original Name from the Item Metadata?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
metadata.system.originalname = 'video-tutorial-v1.mp4'

Use the originalname value (for example, "originalname": "video-tutorial-v1.mp4") from the item metadata system.

  3. Click Search to view the search result.

How to Search Items Designated for a Model Test Set?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
ModelTestSet = true 

or

ModelTestSet = false 
  3. Click Search to view the search result.

How to Search Items Designated for a Model Train Set?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
ModelTrainSet = true 

or

ModelTrainSet = false 
  3. Click Search to view the search result.

How to Search Items Designated for a Model Validation Set?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
ModelValidationSet = true 

or

ModelValidationSet = false 
  3. Click Search to view the search result.

How to Search Items by Updated Date?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
updatedAt = (10/04/2024) 
  3. Click Search to view the search result.

How to Search Items by the User Who Updated?

  1. In the Data Browser, click on the Items field.
  2. Select or enter the query as follows:
updatedBy = 'name@dataloop.ai'
  3. Click Search to view the search result.

Search Items by Annotation's Data

  1. In the Data Browser, click on the Annotations field.
  2. Select or enter the required search query.
  3. Click Search to view the search result.

How to Search Annotations by Annotation ID?

  1. In the Data Browser, click on the Annotations field.
  2. Select or enter the query as follows:
annotationId = '6638e5f4fb9b79bc00d04288'

To copy the annotationId value, open a labeled item, select an annotation in the right-side panel, click the 'i' (info) icon, and copy the annotation ID.

  3. Click Search to view the search result.

How to Search Annotations by Label's Confidence Value?

  1. In the Data Browser, click on the Annotations field.
  2. Select or enter the query as follows:
Confidence = '402.33'
  3. Click Search to view the search result.

How to Search Annotations by Creation Date?

  1. In the Data Browser, click on the Annotations field.
  2. Select or enter the query as follows:
createdAt = (03/12/2023)
  3. Click Search to view the search result.

How to Search Annotations by Annotation Creator?

  1. In the Data Browser, click on the Annotations field.
  2. Select or enter the query as follows:
creator = 'creator-email-ID' 
  3. Click Search to view the search result.

How to Search Annotations by Dataset ID?

  1. In the Data Browser, click on the Annotations field.
  2. Select or enter the query as follows:
datasetId = '65950d1d5c356a5e51f6727e'
  3. Click Search to view the search result.

How to Search Annotations by the Item ID?

  1. In the Data Browser, click on the Annotations field.
  2. Select or enter the query as follows:
ItemID = '65798f8f81b02fbafe34fcad'
  3. Click Search to view the search result.

How to Search Annotations by the Label Name?

  1. In the Data Browser, click on the Annotations field.
  2. Select or enter the query as follows:
label = 'Cat' 
  3. Click Search to view the search result.

How to Search Annotations by the Model Name?

  1. In the Data Browser, click on the Annotations field.
  2. Select or enter the query as follows:
ModelName = 'yolov3' 

You can copy the model name value from the JSON file of the item.

  3. Click Search to view the search result.

How to Search Annotations by the ID of the Parent Annotation (Parent ID)?

  1. In the Data Browser, click on the Annotations field.
  2. Select or enter the query as follows:
ParentId = '1234'
  3. Click Search to view the search result.

How to Search Annotations by the Source (Where the Annotation Was Created)?

  1. In the Data Browser, click on the Annotations field.
  2. Select or enter the query as follows:
source = 'ui' 

or

source = 'sdk' 
  3. Click Search to view the search result.

How to Search Annotations by the Label's Attribute?

  1. In the Data Browser, click on the Annotations field.
  2. Select or enter the query as follows:
metadata.system.attributes.1 = 'Yes' 
Attributes

The attributes are defined according to the customizations made in the recipe.

  3. Click Search to view the search result.

How to Search Annotations by the Annotation Type?

  1. In the Data Browser, click on the Annotations field.
  2. Select or enter the query as follows:
type = 'cube' 
  3. Click Search to view the search result.

How to Search Annotations by the Update Date?

  1. In the Data Browser, click on the Annotations field.
  2. Select or enter the query as follows:
updatedAt = (03/12/2023)
  3. Click Search to view the search result.

Search Items by Task's Data

  1. In the Data Browser, add the Tasks filter.
  2. In the Tasks field, select the TaskID or TaskName criterion from the list.
  3. Enter the value and click Search to view the search result.

How to Search Items by Task's ID?

  1. In the Data Browser, add the Tasks filter.
  2. In the Tasks' field, select the TaskID from the list.
  3. Enter the value as follows.
TaskID = 'task-ID' 
  4. Click Search to view the search result.

How to Search Items by Task's Name?

  1. In the Data Browser, add the Tasks filter.
  2. In the Tasks' field, select the TaskName from the list.
  3. Select or enter the query as follows:
TaskName = 'task-name' 
  4. Click Search to view the search result.

Filter Actions

The Dataloop platform allows you to customize and save search queries, which streamlines repeated searches over large datasets.

Save a Search Query

The Dataloop platform lets you save specific search filter criteria, allowing for efficient and consistent future searches.

  1. In the Dataset Browser, create a search query.
  2. Click Filter Actions.
  3. Select Save. A dialogue window is displayed.
  4. Enter a name for the new filter query.
  5. Click Save. A confirmation message is displayed.

Use a Saved Search Query

The Dataloop platform lets you reuse saved search filter criteria.

  1. In the Dataset Browser, click Filter Actions.
  2. Select Saved Filters and choose the saved filter from the list. Clicking the saved filter runs the search query and displays the result.

Delete a Saved Search Query

The Dataloop platform lets you delete saved search queries that you no longer need.

  1. In the Dataset Browser, click Filter Actions.
  2. Select Saved Filters.
  3. Hover over the query to be deleted and click the Delete icon.
  4. Click Delete Query. A confirmation message is displayed.

Use the DQL Query Editor to Search

In addition to searching through the basic UI, you can filter data using the Dataloop Query Language (DQL).

  1. In the Dataset Browser, click Filter Actions.
  2. Select Query Editor. The DQL Search window is displayed.
  3. Edit the query as required, or select a saved query from the list.
  4. Click Search to run the search query, or click Save As to save it.

Delete DQL Search Query

Click Delete Query to delete the query.

Copy the Search Query in DQL Format

  1. In the Dataset Browser, click Filter Actions.
  2. Select Copy Filter. A confirmation message is displayed, and the query is copied in DQL format.

Clear the Search Query

In the Dataset Browser, click Clear Filter to clear the current search query and results.


Filter Operands

The following operators are applied within and between filters, unless otherwise specified when manually modifying a DQL query.

Cross Filter Operand

A relationship between multiple filters in a single query is based on the AND operand.
For example, filtering by status Annotated AND the user john@doe.ai AND the annotation type: Box, will return items that are annotated, have Box annotation, and have john@doe.ai as an annotator in one or more of the annotations.

Inter-Filter Operand

Multiple values within a single filter are combined with the OR operator.
For example, filtering by labels with the values Person and Dog returns all items that have annotations with either of these labels, not necessarily both at the same time.
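
The two rules can be sketched in a few lines of Python: AND between different filters, OR between the values of one filter. This is a simplified illustration, not the platform's implementation. The item_matches and value_matches functions and the sample items are hypothetical, and each item's labels are flattened into a list instead of being modeled as separate annotations.

```python
from typing import Any, Dict, List


def value_matches(field_value: Any, wanted: Any) -> bool:
    """Match a single wanted value against a scalar field or a list of values."""
    if isinstance(field_value, (list, set, tuple)):
        return wanted in field_value
    return field_value == wanted


def item_matches(item: Dict[str, Any], filters: Dict[str, List[Any]]) -> bool:
    """AND across filters (all), OR across the values of each filter (any)."""
    return all(
        any(value_matches(item.get(name), wanted) for wanted in values)
        for name, values in filters.items()
    )


# Hypothetical items with their annotation labels flattened into a list.
items = [
    {"name": "a.jpg", "annotated": True, "labels": ["Person"]},
    {"name": "b.jpg", "annotated": True, "labels": ["Cat"]},
    {"name": "c.jpg", "annotated": False, "labels": []},
    {"name": "d.jpg", "annotated": True, "labels": ["Dog", "Person"]},
]

# annotated = true AND (label = Person OR label = Dog)
filters = {"annotated": [True], "labels": ["Person", "Dog"]}

result = [item["name"] for item in items if item_matches(item, filters)]
print(result)  # ['a.jpg', 'd.jpg']
```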

Unsearchable Metadata

The Dataloop platform provides the unsearchablePaths setting on the dataset's schema. It lets you mark certain keys or prefixes as excluded from the dataset's searchable schema. As a result:

  • Data under unsearchablePaths can't be queried using Dataloop Query Language (DQL), but it remains available in the item's metadata.
  • Users can still access the metadata at the item level even though it is unsearchable.
  • unsearchablePaths act as prefixes: all keys under a prefix are unsearchable.
  • Removing an unsearchable path automatically makes the existing paths under it searchable in the schema again.

Marking paths as unsearchable helps you work around the regular metadata limitations (for example, 1024 characters on keys or values and a maximum of 100 keys; refer to the Specifications for more information).
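
The prefix behavior can be pictured with a small Python sketch. It is an illustration under stated assumptions and not the platform's implementation: the is_searchable helper is hypothetical, and it assumes a prefix applies to whole key segments, so metadata.key1 covers metadata.key1.nested but not metadata.key10.

```python
from typing import Iterable


def is_searchable(key_path: str, unsearchable_paths: Iterable[str]) -> bool:
    """Return False if the key equals an unsearchable path or sits under it."""
    for prefix in unsearchable_paths:
        # Assumption: prefixes match on whole dotted segments.
        if key_path == prefix or key_path.startswith(prefix + "."):
            return False
    return True


unsearchable = ["metadata.key1"]

print(is_searchable("metadata.key1", unsearchable))         # False
print(is_searchable("metadata.key1.nested", unsearchable))  # False
print(is_searchable("metadata.key10", unsearchable))        # True (under the segment assumption)
print(is_searchable("metadata.key2", unsearchable))         # True
```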

Important

When you add or remove a path that's not searchable in the dataset, it triggers an indexing process. During this time, you won't be able to make changes to the dataset schema or add new metadata values to dataset items until the indexing completes.

Make the Metadata Key Unsearchable

This section describes how to retrieve the dataset schema, mark specific keys as unsearchable, and revert them to searchable.
Doing so helps you resolve metadata errors received while uploading items.

Step 1: Get the Dataset Schema

A dataset's schema is the structure of what kind of information it holds, such as the names of different data fields (keys), and the type of data each field contains (e.g., text, number, date, etc.). It helps you understand how the data within the dataset is organized and what kind of data you can expect in each field.

First, you need to fetch the current schema of a dataset, which details the keys and their paths within the dataset structure.

import dtlpy as dl

dataset = dl.datasets.get(dataset_id='datasetId')
# Store the schema under a name that does not shadow Python's json module
schema = dataset.schema.get()

This code retrieves the schema of the dataset identified by 'datasetId'. The schema, returned as a dictionary, shows keys and their paths within the dataset.

Step 2: Add Keys (Path) to Unsearchable Paths

If certain metadata keys should not be searchable due to privacy concerns or irrelevance to search queries, you can add these keys to the list of unsearchable paths.

dataset = dl.datasets.get(dataset_id='datasetId')
success = dataset.schema.unsearchable_paths.add(paths=['metadata.key1', 'metadata.key2'])

Here, the add() method is used to make the paths metadata.key1 and metadata.key2 unsearchable. The method returns True if the paths are successfully added.

Step 3: Remove Keys (Path) from Unsearchable Paths

If you decide that certain metadata keys should be searchable again, you can remove them from the list of unsearchable paths.

dataset = dl.datasets.get(dataset_id='datasetId')
success = dataset.schema.unsearchable_paths.remove(paths=['metadata.key1', 'metadata.key2'])

This step uses the remove() method to delete metadata.key1 and metadata.key2 from the list of unsearchable paths, allowing them to be searchable again. The method returns True if the paths are successfully removed.