Models

Overview

AI models are data-trained programs that perform tasks requiring human-like intelligence, such as predictions, language understanding, or image recognition.

In Dataloop, the Models tab within the Marketplace displays foundational model architectures available at both the organization and project levels. These serve as starting points for managing, installing, versioning, and training models with your own datasets.

Users can select a public model architecture to create an initial model version, which can then be trained to support pre-annotations and accelerate labeling workflows.

In the card view, the following details about each model are presented:

Model Image
Model Name
Model Category (for example, Provider, Media Type, License)
Trainability (Trainable or Non-Trainable)
Last Updated Timestamp

When you select a model card from the list, a detailed information panel appears on the right side, showing the following:

Auto Update option
Available Actions
The Model Description and a GitHub location link.
Details, such as:
- Status
- Privacy: Public, Project, or Organization
- Created, Installed, and Updated Date
- Email ID of the user who installed it.
Content information, such as:
- Component Name
- Installation Status
- Model Names
- A link to view the models page.

Search and Filter Models

To search for a model, type its name in the Search a Model field.

By Category

The Marketplace allows you to filter models by category. In the left-side panel, select one or more of the following categories to refine the list of models. To remove a filter, click the selected category again.

Provider: Filter models based on the organizations or companies that develop, train, host, and distribute AI technologies. Examples include Dataloop, Google, Meta, AWS, etc.
Deployed By: Narrow down models by the organizations and companies that have deployed their applications or models within the Dataloop Marketplace. This helps you identify whether a model runs on Dataloop Compute (Managed Compute) or on an external compute provider via API (NVIDIA, Microsoft, etc.).
Media Type: Select models based on the type of data they handle, such as Audio, Image, Text, Video, etc. Each media type corresponds to specific AI models and techniques designed to process, generate, or enhance that form of data.
Computer Vision: Focus on models that specialize in analyzing visual data like images and videos to extract meaningful information. Examples include Object Detection, Classification, Semantic Segmentation, etc.
NLP (Natural Language Processing): Filter models that are designed to understand, interpret, and generate human language. Examples include Text Classification, Token Classification, Translation, etc.
Audio: Choose models based on their ability to process and analyze audio data. This includes models for tasks like Text to Speech, Audio Classification, Automatic Speech Recognition, and Embeddings.
Gen AI: Filter models based on their classification as Large Language Models (LLM) or Large Multimodal Models (LMM) within the broader category of Generative AI. This helps identify models with advanced capabilities in language and multimodal tasks.
License: Select models according to their licensing agreements, ensuring that the chosen model aligns with your legal, project, and ethical requirements. Examples include End User License Agreement (EULA), MIT License, Apache License 2.0, etc.
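The category filters above can be pictured as a simple matching rule. The sketch below is a minimal illustration in plain Python, not Dataloop code: the `ModelCard` class, the `filter_cards` function, and the sample model names are hypothetical. It also assumes that selecting several values within one category matches any of them, while different categories must all match; the page does not state how the Marketplace combines filters.

```python
from dataclasses import dataclass, field


@dataclass
class ModelCard:
    """Hypothetical stand-in for a Marketplace model card."""
    name: str
    # Category name -> set of values, e.g. {"Provider": {"Meta"}}
    categories: dict = field(default_factory=dict)


def filter_cards(cards, selected):
    """Return the cards that satisfy every selected category.

    Assumed semantics: within one category, matching any selected value
    is enough; across categories, all selected categories must match.
    Categories with no selected values are ignored.
    """
    result = []
    for card in cards:
        matches_all = all(
            card.categories.get(category, set()) & values
            for category, values in selected.items()
            if values
        )
        if matches_all:
            result.append(card)
    return result


cards = [
    ModelCard("model-a", {"Provider": {"Meta"}, "Media Type": {"Image"}}),
    ModelCard("model-b", {"Provider": {"Google"}, "Media Type": {"Text"}}),
    ModelCard("model-c", {"Provider": {"Google"}, "Media Type": {"Image"}}),
]
selected = {"Provider": {"Meta", "Google"}, "Media Type": {"Image"}}
print([card.name for card in filter_cards(cards, selected)])
```

Clearing a category (an empty set of values) leaves that category unfiltered, which mirrors clicking a selected category again to remove it.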

By Trainability Status

The Marketplace allows you to filter models by whether they are Trainable or Non-Trainable. Use the trainability status filter at the top right of the page.

Trainable: These models can be updated or fine-tuned with new data, allowing them to improve or adapt over time.
Non-Trainable: These models cannot be further trained. They are used as-is, suitable for situations where the model's behavior should remain consistent over time.

By Model Type

In the Marketplace, you can filter models by Model Type to quickly find those that match your use case. The available model types are:

Predictive: Primarily used to forecast outcomes or classify data based on historical or input data. These models learn from existing data to make informed predictions about new, unseen data. Common use cases include Classification, Regression, and Anomaly Detection.
Generative (Gen AI): Generative models, also known as Generative AI, are designed to create new data instances that resemble a given dataset. These models learn the underlying distribution of the data and generate new samples that follow this learned distribution. Common use cases include Text Generation, Image Generation, Data Augmentation, and Content Creation.
Embeddings: Embedding models convert complex data, such as text or images, into fixed-size numerical vectors (embeddings) that capture the essential features of the data. These embeddings can then be used for various downstream tasks. Common use cases include Similarity Search, Clustering, Data Curation, and Retrieval-Augmented Generation (RAG).
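To make the Similarity Search use case concrete, the sketch below ranks stored embeddings by cosine similarity to a query vector. It is an illustration in plain Python under stated assumptions: the three-dimensional vectors and item names are made up, and the functions `cosine_similarity` and `most_similar` are hypothetical helpers, not part of any Dataloop API. Real embeddings typically have hundreds of dimensions and come from an embedding model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def most_similar(query, index, top_k=1):
    """Return the ids of the top_k stored vectors closest to the query."""
    ranked = sorted(
        index.items(),
        key=lambda pair: cosine_similarity(query, pair[1]),
        reverse=True,
    )
    return [item_id for item_id, _ in ranked[:top_k]]


# Made-up embeddings for illustration only.
index = {
    "cat": [1.0, 0.0, 0.1],
    "dog": [0.9, 0.1, 0.2],
    "car": [0.0, 1.0, 0.0],
}
print(most_similar([1.0, 0.0, 0.0], index, top_k=2))
```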

Create Your Own Model

Dataloop enables you to develop and deploy your own models using the SDK. These models can be configured to run on Dataloop's built-in compute infrastructure or on external compute environments.

The process includes defining a model adapter and publishing it as an application, making it readily accessible and usable within your Dataloop projects.

Create a model adapter

A model adapter acts as the bridge between your custom model and Dataloop's platform.
It ensures that your model's API endpoints and functionality are compatible with Dataloop's framework.
The adapter handles tasks like data input/output conversion, prediction processing, and integration with Dataloop pipelines.
Learn more.
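The adapter's bridging role can be illustrated with a small, self-contained sketch. It deliberately does not use the Dataloop SDK: the classes `KeywordSentimentModel` and `ModelAdapterSketch`, their method names, and the item and annotation dictionaries are all hypothetical, chosen only to show input conversion, prediction, and output conversion in one place. An actual adapter is written against the SDK's own interfaces.

```python
class KeywordSentimentModel:
    """Hypothetical custom model with its own input and output format."""

    POSITIVE_WORDS = {"good", "great", "excellent"}

    def score(self, text):
        """Return the fraction of words that are positive keywords."""
        words = text.lower().split()
        if not words:
            return 0.0
        return sum(word in self.POSITIVE_WORDS for word in words) / len(words)


class ModelAdapterSketch:
    """Hypothetical adapter: converts platform items to model input and
    model output to annotation-like records."""

    def __init__(self):
        self.model = None

    def load(self):
        # A real adapter would load weights or artifacts here.
        self.model = KeywordSentimentModel()

    def predict(self, items):
        if self.model is None:
            raise RuntimeError("call load() before predict()")
        annotations = []
        for item in items:
            score = self.model.score(item["text"])  # input conversion
            annotations.append(                      # output conversion
                {
                    "item_id": item["id"],
                    "label": "positive" if score > 0 else "neutral",
                    "confidence": score,
                }
            )
        return annotations


adapter = ModelAdapterSketch()
adapter.load()
results = adapter.predict(
    [{"id": "1", "text": "great product"}, {"id": "2", "text": "a box"}]
)
print(results)
```

Keeping the conversion logic in the adapter means the model itself needs no knowledge of the platform's data structures.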

Publish the application in the Dataloop marketplace

Once your model adapter is ready, package it as an App entity in Dataloop.
Publish the app in the Dataloop Marketplace, making it available for installation.
Learn more.

Install the application into your project

After publishing, go to the Marketplace and install the app into your desired project.
The app installation links your custom model to the project, allowing it to be used in pipelines, tasks, or manual workflows.
Learn more.

Start using your model

Once installed, the model:

Is available for use within the Dataloop platform.
Can be used for tasks such as running predictions, active learning workflows, or custom pipelines.
Is accessible to annotators and developers, who can leverage its capabilities.

Model Creation Flow

Automated Status Assignment

Automated Status Assignment is a mechanism that applies a status label, such as Created, Pre-trained, Trained, or Deployed, to a model depending on its current state. This status reflects the model's readiness for use or further development, based on the presence of certain artifacts or milestones achieved in the model's lifecycle.

Model Creation

When you create a new model, the platform automatically assigns a status based on the presence of artifacts.

If artifacts exist, indicating that the model has undergone some form of pre-training, it will be assigned the status Pre-trained.
If no artifacts are present, the model will be given the status Created.
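The creation rule above can be expressed as a small function. This is a minimal sketch in plain Python, not Dataloop SDK code: the function name `initial_status`, the representation of artifacts as a list, and the sample file name are assumptions made for illustration.

```python
def initial_status(artifacts):
    """Return the status assigned to a newly created model.

    A model with at least one artifact is treated as Pre-trained;
    a model with no artifacts is treated as Created.
    """
    return "Pre-trained" if artifacts else "Created"


print(initial_status(["weights.pt"]))  # hypothetical artifact name
print(initial_status([]))
```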

Model Cloning

Created: Cloning a model with the status Created will result in a new model also with the status Created. This indicates that the model is in its initial stage without any training.
Trained: When a model marked as Trained is cloned, the new model receives the status Pre-trained. This reflects that the source model has undergone training and that the clone carries learned weights, making it ready for further fine-tuning or deployment.
Deployed: Cloning a model with the status Deployed similarly results in a new model with the status Pre-trained. The source model has been both trained and successfully deployed, so the clone starts from trained weights.
Pre-trained: Cloning a model already marked as Pre-trained keeps the status Pre-trained for the new model, since the pre-trained state carries over to the clone.
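The cloning rules above amount to a lookup from the source model's status to the clone's status. The sketch below encodes that table in plain Python; the names `CLONE_STATUS` and `cloned_status` are hypothetical and this is not Dataloop SDK code. Statuses not covered by the rules above are rejected rather than guessed.

```python
# Source status -> status of the cloned model, as described above.
CLONE_STATUS = {
    "Created": "Created",
    "Trained": "Pre-trained",
    "Deployed": "Pre-trained",
    "Pre-trained": "Pre-trained",
}


def cloned_status(source_status):
    """Return the status a clone receives, given the source model's status."""
    try:
        return CLONE_STATUS[source_status]
    except KeyError:
        raise ValueError(f"no cloning rule for status {source_status!r}") from None


for status in CLONE_STATUS:
    print(status, "->", cloned_status(status))
```

Only a model in the Created state produces a clone in the Created state; every other listed status yields a Pre-trained clone.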

Model Installations

Dataloop lets you install AI/ML models that are hosted and executed on one of the following:

Dataloop's Managed Compute (internal infrastructure): The models run on Dataloop's compute.
External Compute Providers (e.g., OpenAI, Azure, GCP, IBM, NVIDIA) via API Service Integration: The models run on the external provider's compute.

Install Models Running on Dataloop Computes

When models run on Dataloop’s internal compute:

DPKs (Dataloop Processing Kits) are deployed directly by Dataloop, requiring no external authentication or cloud service integration.
The model is fully hosted, managed, and executed within Dataloop’s secure and scalable environment.
This setup is ideal for customers seeking a hassle-free solution without the need to configure or maintain external cloud resources.
All necessary compute resources (e.g., CPU, GPU) are provisioned and managed by Dataloop, with billing handled directly through the Dataloop platform.

Filter - Deployed By

Use the Deployed By filter in the left-side panel to narrow down models by the organizations and companies that have deployed their models within the Dataloop Marketplace. This helps you identify whether a model runs on Dataloop Compute (Managed Compute) or on an external compute provider via API (NVIDIA, Microsoft, etc.).

To install, follow these steps:

Open the Marketplace from the left-side menu.
In the Models tab, use the Deployed By filter in the left-side panel to narrow the list of models.
Select the model that you want to install.
Click Install from the right-side panel. The Install Model pop-up window is displayed. It lists all the models available in the DPK with a description and indicates whether each model is trainable.
Select the model variations from the list and click Install Model. A confirmation message is displayed. Click View Model to view the model under the Model Management → Versions tab.

Once you install a model, a green tick icon is displayed on it in the Marketplace.

Install Models Running on External Computes

Dataloop allows customers to run models on third-party infrastructure (e.g., OpenAI, Microsoft Azure, Google Cloud, IBM) through an API Service integration. Instead of running on Dataloop's internal compute, these models operate on the customer's own cloud account.

DPKs are deployed by the external provider, and compute tasks (inference, training) run outside Dataloop.
A lightweight API wrapper within Dataloop bridges the platform to the external model endpoint.
Customers provide access credentials, enabling secure, seamless integration.
Compute costs are billed through the customer's cloud provider account, not through Dataloop. This uses existing billing accounts and avoids new payment setups.
Customers can bring their own models from platforms like Google, IBM, or custom-hosted APIs.
You can complete the installation by selecting the Set Up Later option. In that case, the models remain inactive and cannot be used until the necessary integrations are fully configured.
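The effect of the Set Up Later option can be sketched as a guard that blocks use of a model until credentials are attached. The example below is illustrative plain Python, not Dataloop SDK code: the `ExternalModelInstall` class, its fields, the `run_prediction` function, and the sample names are all hypothetical, and the returned dictionary is a placeholder for a call to the external provider's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExternalModelInstall:
    """Hypothetical record of a model installed on external compute."""
    name: str
    secret_id: Optional[str] = None  # None models the Set Up Later choice

    @property
    def is_active(self):
        """The model is usable only once a secret or integration is set."""
        return self.secret_id is not None


def run_prediction(install, payload):
    """Refuse to run until the integration is configured."""
    if not install.is_active:
        raise RuntimeError(
            f"{install.name} is inactive: configure its integration first"
        )
    # Placeholder for the request to the external provider's endpoint.
    return {"model": install.name, "input": payload}


install = ExternalModelInstall("example-llm")  # installed with Set Up Later
print(install.is_active)
install.secret_id = "my-secret"  # credentials added afterwards
print(run_prediction(install, "hello"))
```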

Filter - Deployed By

Use the Deployed By filter in the left-side panel to narrow down models by the organizations and companies that have deployed their applications or models within the Dataloop Marketplace. This helps you identify whether a model runs on Dataloop Compute (Managed Compute) or on an external compute provider via API (NVIDIA, Microsoft, etc.).

To install, follow these steps:

Open the Marketplace from the left-side menu.
In the Models tab, use the Deployed By filter in the left-side panel to narrow the list of models.
Select the model that runs on an external compute.
Click Install from the right-side panel. The Install Model Application pop-up window is displayed. It lists all the models available in the DPK with a description and indicates whether each model is trainable.
Click Proceed. The Select secrets pop-up window is displayed.
Select a secret or an integration, as required. If none is available, do one of the following:
1. To create a secret, click Add New Secret and follow the steps.
2. To set up the integration later, click Set Up Later.
Select the model variations from the list and click Install Model. A confirmation message is displayed. Click View Model to view the model under the Model Management → Versions tab.

Once you install the model, a green tick icon is displayed on the installed application in the Marketplace.

Edit Integration Credentials

To set or change the access credentials of an installed model that runs on an external compute provider:

Open the Marketplace from the left-side menu.
In the Models tab, select the installed model.
Click Actions from the right-side panel and select Edit Access Credentials. The Edit Model pop-up window is displayed.
Select the secret or integration, as required.
Click Save Changes.

Add Model Versions

After installing a model DPK, you can add more models from the same DPK, based on its model components.

Open the Marketplace from the left-side menu.
In the Models tab, select the model to which you want to add a new version.
Click Add Model from the right-side panel. The Add Model(s) pop-up window is displayed. It lists all the models available in the DPK with a description and indicates whether each model is trainable.
Select the model variations from the list and click Add Model. A confirmation message is displayed. Click View Model to view the model under the Model Management → Versions tab.

Once you add a model, a green tick icon is displayed on the models that are installed.

For more actions, refer to the Manage Marketplace article.

Actions Based on Installation Status and Usage Scope

The actions available for each model depend on its scope and installation status. To learn more, refer to the Actions's Availability Based on Use Cases section.