Quick Start
- Updated On 06 Jan 2025
Overview
Welcome to Dataloop! This page helps you quickly learn the functions available in the Dataloop platform. Follow the steps below to start using the platform efficiently.
- Create or Join an Organization
- Add Members to an Organization
- Create a Project
- Join a Project
- Upload or Sync (cloud storage) Your Data into Dataloop
- Manage Your Data
- Create Datasets
- Create Cloud Storage Drivers
- Create Cloud Integrations
- Setup Taxonomy - Labels & Attributes
- Create an Annotation Task
- Annotation Studios
- Annotation Tools
- Create a QA/QC Task
- Create an Application Service
- Install an Application Service
- Install a Model
- Create a Pipeline
- Support
Getting Started
The Getting Started page helps you explore everything Dataloop has to offer across various areas. Discover how Dataloop can assist you with Data Management, MLOps, Pipelines, the Annotation Platform, and more.
The page provides an easy way to navigate Dataloop with built-in pipelines, annotated datasets, and pre-trained models available for you to experience and discover the platform's capabilities.
To access the Getting Started page, click Getting Started in the left-side menu. Users with the Annotator project role do not have access to this page.
Here's a breakdown of the sections and options available on this page:
Data Management
Use Dataloop Example
Start your data management journey in seconds with Dataloop Example. Gain hands-on experience with file management, data and feature visualization, and execute sub-second queries on millions of items and annotations. Effortlessly build and manage training sets. Experience the power of Dataloop firsthand.
- Click Go to Dataset to view the selected dataset in the Dataset Browser.
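The same kind of query can be run from the Python SDK. The following is a minimal sketch, assuming the `dtlpy` package is installed and that the project and dataset names passed in (placeholders chosen for illustration) already exist in your account. It counts the annotated items in a dataset.

```python
def count_annotated_items(project_name, dataset_name):
    """Return the number of annotated items in a dataset."""
    # Imported lazily so this file loads even where dtlpy is not installed.
    import dtlpy as dl

    # Log in only when there is no valid token.
    if dl.token_expired():
        dl.login()

    project = dl.projects.get(project_name=project_name)
    dataset = project.datasets.get(dataset_name=dataset_name)

    # Build an item filter that keeps only items that have annotations.
    filters = dl.Filters()
    filters.add(field="annotated", values=True)

    # The listing is paged; items_count is the total across all pages.
    pages = dataset.items.list(filters=filters)
    return pages.items_count
```

For example, `count_annotated_items("my-project", "my-dataset")` would return the count for a dataset with those hypothetical names.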
Connect Your Data
Easily integrate your data for streamlined management. Upload files or link your cloud storage (AWS, GCS, Azure) for smooth curation and handling. Effortlessly organize, visualize, and query your data using Dataloop’s robust tools, designed to meet all your data management requirements.
Clicking on Connect opens the Data Management Resource Creation page, where you can link your external dataset by setting up integrations and creating storage drivers for your preferred cloud provider.
Programmatically Connect Your Data
Seamlessly integrate your data using our Python SDK. Explore a comprehensive, step-by-step notebook to connect your data sources with Dataloop via our Python SDK. Discover how to create Datasets, integrate, and organize your data within Dataloop, optimizing your data management workflow.
Clicking Open Notebook opens the Data Management page alongside a Python Jupyter Notebook, whose code you can modify as required to connect your data.
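As a rough illustration of what such a notebook might contain, here is a minimal sketch that creates a dataset and uploads local images with the `dtlpy` package. The helper names, the list of image suffixes, and the project and dataset names are assumptions made for this example, not part of the Dataloop notebook itself.

```python
from pathlib import Path

# Assumed set of file types to upload; adjust to your data.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def find_images(folder):
    """Return the sorted image files found under `folder` (recursive)."""
    return sorted(
        p for p in Path(folder).rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def upload_images(project_name, dataset_name, folder):
    """Create a dataset in an existing project and upload local images."""
    # Imported lazily so find_images works without dtlpy installed.
    import dtlpy as dl

    if dl.token_expired():
        dl.login()

    project = dl.projects.get(project_name=project_name)
    dataset = project.datasets.create(dataset_name=dataset_name)

    # Upload files one at a time; each path is a local file.
    for path in find_images(folder):
        dataset.items.upload(local_path=str(path))
    return dataset
```

Calling `upload_images("my-project", "my-dataset", "./images")` with your own names would create the dataset and upload every matching file. Data that lives in cloud storage is connected through integrations and storage drivers instead of a local upload.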
MLOps
In the MLOps section of the Getting Started page, you'll find resources that guide you through different aspects of managing the machine learning lifecycle. Tabs such as Train & Evaluate a Model provide structured steps to help you build, train, and validate machine learning models using Dataloop’s integrated tools.
The system directs users who identify as ML/DS Developer to the MLOps section by default when they navigate from the onboarding process.
Train & Evaluate a Model
Dataloop's MLOps platform enables you to efficiently train, fine-tune, deploy, and manage machine learning models, making it easier to automate workflows and optimize the entire model lifecycle.
Clicking on Start Now opens the Model Management page, allowing you to select an installed model version that fits your needs.
Programmatically Train Your Model
Streamline the process of training, evaluating, and deploying your models using Dataloop’s Python SDK. Open a notebook with ready-to-use code that covers everything from model creation to deployment, enabling you to focus on optimizing and executing your MLOps workflows in just a few simple steps.
Clicking Open Notebook opens the Model Management page with a Python Jupyter Notebook on the right side, whose code you can modify as required to train your model.
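The training and deployment steps can be sketched with the `dtlpy` package as follows. This is a minimal sketch under stated assumptions: the model is already installed in the project, and the function names and the project and model names are placeholders for illustration. Training runs remotely, so deployment is kept as a separate step to call once training has finished.

```python
def _get_model(project_name, model_name):
    """Fetch an installed model entity by name."""
    # Imported lazily so this file loads even where dtlpy is not installed.
    import dtlpy as dl

    if dl.token_expired():
        dl.login()
    project = dl.projects.get(project_name=project_name)
    return project.models.get(model_name=model_name)


def start_training(project_name, model_name):
    """Request a training run for the model and return the result of the call."""
    model = _get_model(project_name, model_name)
    return model.train()


def deploy_trained_model(project_name, model_name):
    """Deploy a model whose training has already completed."""
    model = _get_model(project_name, model_name)
    return model.deploy()
```

A typical flow would call `start_training(...)`, wait until the model is reported as trained on the Model Management page, and then call `deploy_trained_model(...)`.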
Annotation Platform
The Dataloop annotation platform (also referred to as annotation-studio and data-application) is designed to provide annotators with a smooth and streamlined experience, and to ensure pixel-accurate work at high velocity. The Annotation Platform section on the Getting Started page helps you quickly get acquainted with Dataloop’s annotation tools using a sample dataset and by selecting from a variety of available annotation workflows.
Users designated as Annotation Managers are directed to this tab by default when they navigate from the onboarding process.
Work With Annotation Tools
Explore Dataloop’s annotation studios with a sample dataset and try a variety of annotation solutions, including image, video, audio, NLP, and LiDAR. Optimize your workflow with intuitive features designed to enhance accuracy and streamline the annotation process.
Clicking on Select Tool opens the Select Annotation Tool window. Select a demo dataset for the annotation tool and click Start Annotating. The corresponding annotation studio then opens, ready for annotation. For more information, see Annotation Studio - Basics.
Install Annotation Workflows
Select an annotation workflow from various available options. Optimize your annotation workflow by automating tasks with Dataloop’s pipeline automation tool. Assign and manage tasks automatically, manage your data, and ensure quality control. With Dataloop’s pipeline-based templates, your data annotation process becomes faster and more organized with just a few clicks.
Clicking on Select Workflow opens the Select Annotation Workflow window. Select an annotation workflow from the list and click Open Workflow. The annotation workflow is installed, and the Pipeline page is displayed. For more information, see Create and Manage Pipelines or Composing Pipelines.
Programmatically Create an Annotation Workflow
Leverage Dataloop's Annotation Platform to create and automate custom annotation workflows using the Dataloop Python SDK. You can programmatically configure every aspect of the workflow, from task assignment and pre-annotation to data processing and management, granting you full control over the entire process.
Clicking Create Workflow opens the Pipelines page with a Python Jupyter Notebook on the right side. Follow the steps in the notebook to create the pipeline as required.
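Task assignment is one part of such a workflow that is easy to show in code. The following is a minimal sketch, assuming the `dtlpy` package is installed and the dataset already exists. The function names, the seven-day deadline, and the example names and emails are assumptions for illustration.

```python
import time

SECONDS_PER_DAY = 24 * 60 * 60


def due_in_days(days, now=None):
    """Return a due date as a Unix timestamp in seconds, `days` from `now`."""
    now = time.time() if now is None else now
    return int(now + days * SECONDS_PER_DAY)


def create_annotation_task(project_name, dataset_name, task_name, assignees, labels):
    """Add labels to the dataset and create an annotation task for the assignees."""
    # Imported lazily so due_in_days works without dtlpy installed.
    import dtlpy as dl

    if dl.token_expired():
        dl.login()

    project = dl.projects.get(project_name=project_name)
    dataset = project.datasets.get(dataset_name=dataset_name)

    # Make the labels available to annotators working on this dataset.
    for label in labels:
        dataset.add_label(label_name=label)

    # Assumed deadline of one week; assignees are identified by email.
    return dataset.tasks.create(
        task_name=task_name,
        due_date=due_in_days(7),
        assignee_ids=list(assignees),
    )
```

For example, `create_annotation_task("my-project", "my-dataset", "first-task", ["annotator@example.com"], ["car", "person"])` would create one task with two labels, using hypothetical names throughout.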
GenAI Solutions
The GenAI Solutions section under the Getting Started page in Dataloop provides an overview of how the platform can help users leverage Generative AI technologies, such as large language models (LLMs) and other advanced AI techniques. One of the key tabs in this section is RAG-Based LLM.
RAG-Based LLM
The RAG-Based (Retrieval-Augmented Generation) LLM solution in Dataloop enhances large language models by combining data retrieval with text generation, ensuring more accurate and context-specific outputs. By integrating your own data into the RAG pipeline, the model retrieves relevant information before generating responses, making the outputs tailored to your unique needs. Dataloop’s pipeline tool simplifies the integration process, allowing you to build a customized model that consistently provides precise, contextually relevant results.
Clicking Install Solution for the RAG-Based LLM opens the RAG GenAI solutions pipeline template page, where you can customize the nodes and their configurations.
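To make the retrieve-then-generate idea concrete, here is a small, self-contained sketch in plain Python. It is not Dataloop's implementation: it assumes a simple word-count cosine similarity in place of a real embedding model, and it stops at building the prompt that would be sent to an LLM. All function names are chosen for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents, k=2):
    """Place the retrieved context ahead of the question."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

Given a handful of your own text snippets as `documents`, `build_prompt(question, documents)` returns a prompt whose context section holds the best-matching snippets, which is the step that grounds the generated answer in your data.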
Pipelines
Pipelines provide the capability to create automated models that seamlessly integrate human and machine elements for data processing within a structured pipeline framework. For more information, see the Pipeline page.
Start From a Pipeline Template
Quickly set up automated ML workflows with templates. Automate your entire ML pipeline - from data ingestion, preprocessing, and pre-annotation to labeling, model training, and evaluation. Our templates offer a quick start for automating end-to-end ML workflows, making it easier to scale your projects.
Clicking on Select Pipeline opens the Pipeline Templates page, allowing you to select a template that fits your needs. Clicking Create Pipeline generates a new pipeline in your project using the selected template.
Create Your Own Pipeline
Automate and customize ML pipelines with pre-built nodes. Easily build and automate your ML workflows using our pre-built pipeline nodes, or integrate your custom code. Customize each stage, from data ingestion and preprocessing to pre-annotation, labeling, and model training, to create a fully automated pipeline.
Clicking on Create Pipeline opens the Pipeline page in edit mode, where you can select available nodes from the side panel to build your own pipeline. By default, the pipeline is named Getting Started.
For more information, see Create and Manage Pipelines.
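Conceptually, a pipeline is a set of nodes plus the connections between them, and an item flows through the nodes in dependency order. The sketch below models that idea in plain Python. It is a simplified illustration, not Dataloop's pipeline engine: the node names are hypothetical, and every node receives the item produced by the node that ran before it.

```python
from graphlib import TopologicalSorter


def run_pipeline(nodes, connections, item):
    """Run `item` through the nodes in an order that respects the connections.

    nodes: dict mapping node name -> function(item) -> item
    connections: list of (source_name, target_name) pairs
    """
    # Map each node to the set of nodes that must run before it.
    graph = {name: set() for name in nodes}
    for source, target in connections:
        graph[target].add(source)

    # static_order() yields every node after all of its predecessors.
    for name in TopologicalSorter(graph).static_order():
        item = nodes[name](item)
    return item


def stage(name):
    """Build a node that records its own name in the item's history."""
    def run(item):
        return {**item, "history": item.get("history", []) + [name]}
    return run


nodes = {n: stage(n) for n in ("label", "ingest", "preprocess")}
connections = [("ingest", "preprocess"), ("preprocess", "label")]
result = run_pipeline(nodes, connections, {"id": "item-1"})
print(result["history"])  # ['ingest', 'preprocess', 'label']
```

The order comes from the connections rather than from the order in which the nodes were defined, which mirrors how nodes are wired together on the pipeline canvas.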
Programmatically Create a Pipeline
Dataloop's Python SDK empowers you to create custom machine learning pipelines from scratch, automating the entire process. By opening a notebook, you can develop tailored pipelines that suit your specific ML workflows.
Clicking Create Pipeline opens the Pipelines page alongside a Python Jupyter Notebook, whose code you can modify as required to create your pipeline.
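As a starting point, a pipeline can be created from the SDK before any nodes are added. This is a minimal sketch, assuming the `dtlpy` package is installed and the project exists. The function name and the project and pipeline names are placeholders, and node composition is left to the notebook or the pipeline editor.

```python
def create_empty_pipeline(project_name, pipeline_name):
    """Create a new, empty pipeline in an existing project."""
    # Imported lazily so this file loads even where dtlpy is not installed.
    import dtlpy as dl

    if dl.token_expired():
        dl.login()

    project = dl.projects.get(project_name=project_name)
    # The pipeline starts with no nodes; add and connect them afterwards.
    return project.pipelines.create(name=pipeline_name)
```

For example, `create_empty_pipeline("my-project", "my-first-pipeline")` would return the new pipeline entity, which then appears on the Pipelines page of that project.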