Create Pipelines
  • 08 May 2025

Overview

Dataloop enables you to create pipelines either from scratch or using a template. You can choose from your organization's templates, Dataloop’s marketplace templates, or build one entirely on your own, providing both flexibility and efficiency in developing and deploying data processing and machine learning workflows.

This article guides you through creating pipelines on the Dataloop platform.


Create Pipeline Using Templates

The Marketplace is a comprehensive repository that contains a wide variety of pre-defined pipeline templates. These templates are designed to cover common use cases and scenarios across different industries and applications, providing a quick and efficient way to get started with pipeline creation. Here’s how it works:

  1. Open the Pipelines page from the left-side menu.

  2. Click Create Pipeline and select Use a Template from the list. The Select Pipeline Template popup window opens.

  3. Select a template from the list.

  4. Click Create Pipeline. The selected pipeline template will be displayed, and you can configure the available nodes as needed.


Create Your Own Pipeline

Creating pipelines from scratch is an approach suited for users with specific requirements that cannot be fully met by existing templates, or for those who prefer to have complete control over every aspect of their pipeline design. Here’s how it works:

Creating and activating a pipeline in the Dataloop platform involves the following steps:

  1. Create a Pipeline
  2. Place the Nodes
  3. Verify the Starting Node
  4. Configure the Nodes
  5. Start the Pipeline
  6. Trigger the Pipeline

1. Create a Pipeline

  1. Open the Pipelines page from the left-side menu.

  2. Click Create Pipeline and select Start from Scratch from the list. The Set Pipeline Name popup window opens.

  3. Enter a name for the new pipeline.

  4. Click Create Pipeline. The Pipeline window is displayed, and you can configure the available nodes as needed.

2. Place the Nodes

To compose a pipeline, drag and drop nodes onto the canvas and connect them by dragging the output port of one node to the input port of the next node.
Clicking a node's output port and releasing it instantly connects it to the closest available input port.

Canvas Navigation
  • Left-click and hold any node to drag it around the canvas.
  • Right-click and hold the canvas to drag the entire canvas.

3. Verify the Pipeline Starting Node

The starting icon appears next to the first node you place on the canvas. Drag this icon onto any node to mark it as the starting point of the pipeline.

When triggering data into a pipeline (for example, from the dataset browser), the data enters the pipeline at the node marked as the starting point.

4. Configure the Nodes

Configure each node as needed: set node inputs such as fixed values or variables, set functions, create labeling tasks, set triggering data in nodes, and so on.

  1. Refer to the following node category list and configure the nodes according to the type:
    1. Data Nodes
    2. Labeling Nodes
    3. Automation Nodes
    4. Model Nodes
    5. Image Utils Nodes
  2. Learn more about the Node Inputs
  3. Learn more about the Triggering Data

5. Start the Pipeline

To install and activate your pipeline, click Start Pipeline in the pipeline editor screen or the play button on the project's Pipelines page. If you cannot click Start Pipeline, or the installation fails, the cause may be configuration issues in your pipeline nodes or errors in the pipeline composition:

  1. To find node configuration issues, hover over the warning/error icons on the nodes to see what needs to be resolved. Once resolved, the warning/error icon disappears.
  2. To find installation errors, click the Error tab in the pipeline’s information panel on the right and check the error messages.

6. Trigger the Pipeline

Once the pipeline is started, you can trigger it by sending data into it automatically, manually, or via SDK invocation.

Learn more
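The SDK invocation mentioned above can be sketched as follows. This is a minimal sketch, assuming the `dtlpy` Python SDK is installed; the project name, pipeline name, and item ID are placeholders, and the starting node is assumed to accept an `item` input.

```python
# Sketch: triggering a started pipeline via the SDK (placeholder names).
try:
    import dtlpy as dl  # Dataloop Python SDK: pip install dtlpy
except ImportError:
    dl = None  # SDK not installed; the sketch still illustrates the flow


def build_execution_input(item_id):
    """Map the starting node's input name ('item' here, by assumption)
    to the ID of the item to process."""
    return {"item": item_id}


def trigger_pipeline(project_name, pipeline_name, item_id):
    """Manually push one item into a started pipeline."""
    if dl is None:
        raise RuntimeError("dtlpy is required: pip install dtlpy")
    if dl.token_expired():
        dl.login()
    project = dl.projects.get(project_name=project_name)
    pipeline = project.pipelines.get(pipeline_name=pipeline_name)
    # Data enters the pipeline at the node marked as the starting point
    return pipeline.execute(execution_input=build_execution_input(item_id))
```

Automatic invocation is configured as triggering data on the node itself; the sketch above covers only the manual/SDK path.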


Create Custom Application Using Code Node

Dataloop allows you to create custom node applications using your own external Docker images.

1. Create a Pipeline

  1. Open the Pipelines page from the left-side menu.

  2. Click Create Pipeline and select Start from Scratch from the list. The Set Pipeline Name popup window opens.

  3. Enter a name for the new pipeline.

  4. Click Create Pipeline. The Pipeline window is displayed, and you can configure the available nodes as needed.

2. Place the Nodes

  1. Select the starting nodes from the Node Library and drag them to the Canvas.
  2. Add a Code node in the canvas and make the required updates.
  3. Complete the connections.

Learn more

3. Customize the Code Node Using Your Docker Image

Prerequisites

To integrate your customized private container registry, you must create container registry integrations and secrets.

  1. Select the Code node and click Actions.
  2. Select Edit Service Settings from the list.
  3. Add Secrets and Integrations if your Docker image is private; skip this step if it is public.
  4. Click Edit Configuration.
  5. In the Docker Image field, enter your Docker image URL.
  6. Complete the remaining steps, if required.
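The same service setting can also be changed from the SDK. This is a hedged sketch assuming `dtlpy`; the project name, service name, image URL, and the `runner_image` attribute on the service runtime are assumptions based on typical Dataloop service configuration, not verified against your deployment.

```python
# Sketch: pointing a Code node's service at a custom Docker image.
try:
    import dtlpy as dl  # Dataloop Python SDK: pip install dtlpy
except ImportError:
    dl = None  # SDK not installed; the sketch still illustrates the flow


def image_with_tag(image, default_tag="latest"):
    """Append a default tag when the image reference omits one."""
    name = image.rsplit("/", 1)[-1]  # part after the last '/'
    return image if ":" in name else f"{image}:{default_tag}"


def set_runner_image(project_name, service_name, image):
    """Update the service backing the Code node to run on a custom image."""
    if dl is None:
        raise RuntimeError("dtlpy is required: pip install dtlpy")
    project = dl.projects.get(project_name=project_name)
    service = project.services.get(service_name=service_name)
    # 'runner_image' is the assumed runtime attribute for the Docker image
    service.runtime.runner_image = image_with_tag(image)
    service.update()
    return service
```

For a private registry, the secrets and integrations created in the prerequisites above must already be attached to the service, exactly as in the UI flow.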

4. Start the Pipeline

To install and activate your pipeline, click Start Pipeline in the pipeline editor screen or the play button from the project's Pipeline page.

Create Model Application Node Pipelines

Dataloop supports installing AI/ML models that can be hosted and executed on:

  • Dataloop's Managed Compute (internal infrastructure): Models run on Dataloop's compute.

  • External Compute Providers (e.g., OpenAI, Azure, GCP, IBM, NVIDIA) via API Service Integration: Models run on the external provider's compute, which requires secret credentials to complete the installation.

1. Create a Pipeline

  1. Open the Pipelines page from the left-side menu.

  2. Click Create Pipeline and select Start from Scratch from the list. The Set Pipeline Name popup window opens.

  3. Enter a name for the new pipeline.

  4. Click Create Pipeline. The Pipeline window is displayed, and you can configure the available nodes as needed.

2. Place the Nodes

Dataloop allows you to install models that run on Dataloop Compute or on external compute providers (e.g., OpenAI, Azure, GCP, IBM, NVIDIA) via API Service Integration.

  1. Select the required nodes (except model application) from the Node Library and drag them to the Canvas.

  2. Click on the + icon next to the Node Library.

  3. Select a model from the Models tab and click Add Node.

  4. Select the model variation from the list and click Proceed. If the model runs on external compute, the Install Model popup window is displayed.

  5. Select an API Key, Secret, or an Integration, as required. If not available:

    1. If there is no secret, click Add New Secret and follow the steps.
    2. To set up the integration later, click Set Up Later.
  6. Click Install Model. Click View Model to view it under the Model Management → Versions tab.

  7. Go back to the Pipeline and view the newly added node under the Models' category in the left-side node library.

  8. Drag it to the canvas and make the required configurations.

  9. Complete the connections.

Learn more
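The distinction between the two hosting options can be captured in a small helper, and installed models can be inspected from the SDK. This is a sketch assuming `dtlpy`; the provider set mirrors the examples listed above, and the project name is a placeholder.

```python
# Sketch: checking whether a model's provider needs an integration,
# and listing a project's models via the SDK.
try:
    import dtlpy as dl  # Dataloop Python SDK: pip install dtlpy
except ImportError:
    dl = None  # SDK not installed; the sketch still illustrates the flow

# External providers named in this article; models hosted there require
# an API key/secret integration before installation completes.
EXTERNAL_PROVIDERS = {"openai", "azure", "gcp", "ibm", "nvidia"}


def needs_integration(provider):
    """True when the model runs on an external provider's compute."""
    return provider.lower() in EXTERNAL_PROVIDERS


def list_project_models(project_name):
    """Print the models installed in the project (name and status)."""
    if dl is None:
        raise RuntimeError("dtlpy is required: pip install dtlpy")
    project = dl.projects.get(project_name=project_name)
    for model in project.models.list().all():
        print(model.name, model.status)
```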

3. Start the Pipeline

To install and activate your pipeline, click Start Pipeline in the pipeline editor screen or the play button from the project's Pipeline page.


Create Pipeline Using the SDK

To learn more about creating pipelines in the SDK, click here.
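As a hedged sketch of what the linked SDK flow looks like: the block below assumes the `dtlpy` package and its `DatasetNode`/`CodeNode` node classes; the project, dataset, and pipeline names are placeholders, and the node positions simply lay the nodes out left to right on the canvas.

```python
# Sketch: creating a two-node pipeline (dataset -> code) via the SDK.
try:
    import dtlpy as dl  # Dataloop Python SDK: pip install dtlpy
except ImportError:
    dl = None  # SDK not installed; the sketch still illustrates the flow


def run(item):
    """Code-node body: receives an item and returns it unchanged
    (placeholder logic -- replace with your own processing)."""
    return item


def create_and_start_pipeline(project_name, dataset_name, pipeline_name):
    if dl is None:
        raise RuntimeError("dtlpy is required: pip install dtlpy")
    project = dl.projects.get(project_name=project_name)
    dataset = project.datasets.get(dataset_name=dataset_name)
    pipeline = project.pipelines.create(name=pipeline_name)

    dataset_node = dl.DatasetNode(
        name=dataset.name, project_id=project.id,
        dataset_id=dataset.id, position=(1, 1))
    code_node = dl.CodeNode(
        name="my-code-node", project_id=project.id,
        project_name=project.name, method=run, position=(2, 1))

    # The first node added is assumed to become the starting node.
    pipeline.nodes.add(node=dataset_node).connect(node=code_node)
    pipeline.update()
    pipeline.install()  # equivalent of clicking Start Pipeline
    return pipeline
```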


Undo/Redo Pipeline Editing Steps

While structuring the pipeline and making adjustments, you can use Undo/Redo to step backward and forward through your edits without having to manually revert the pipeline configuration. This includes adding or removing nodes, node connections, and node settings.

Undo/Redo does not track code changes in Code nodes; such manual code edits are currently not versioned.