Manage Pipelines
  • 23 May 2025

Article summary

This article provides a detailed guide to managing pipelines within the Dataloop platform. It covers essential tasks such as filtering data between nodes, resetting a pipeline, adding secrets, and rerunning cycles and executions.


Filtering Data Between Nodes

Hover over a connection between nodes and click the + sign to add a filter. Adding a filter means that only data assets (i.e., items or annotations) that comply with the filter condition will be passed onto the next pipeline node.

You can select a previously saved filter (saved in the Dataset Browser) or write one directly in the DQL editor.

For example, the following filter will only pass items whose "size" attribute, under "system" in the item "metadata", is less than 1 MB:

  • Note that, unlike the DQL editor in the Dataset Browser, the pipeline DQL editor does not require you to wrap the attributes you filter by in a "filter" property.
  • The DQL property JOIN is not supported in the pipeline DQL filter.
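
As a minimal sketch of the size filter described above, the snippet below writes the condition as a plain Python dictionary and serializes it to JSON for pasting into the pipeline DQL editor. The field path `metadata.system.size`, the `$lt` operator, and the assumption that size is stored in bytes (1 MB taken as 1,048,576 bytes) are inferred from the description rather than copied from the platform. The `get_path` and `matches` helpers are hypothetical and only illustrate the intended semantics locally.

```python
import json

ONE_MB = 1024 * 1024  # assumption: item size is stored in bytes

# Assumed pipeline-filter form: no enclosing "filter" property.
size_filter = {"metadata.system.size": {"$lt": ONE_MB}}


def get_path(doc, dotted):
    """Walk a dotted path such as 'metadata.system.size' through nested dicts."""
    for key in dotted.split("."):
        if not isinstance(doc, dict) or key not in doc:
            return None
        doc = doc[key]
    return doc


def matches(item, flt):
    """Hypothetical local check that supports only the '$lt' operator."""
    for path, cond in flt.items():
        value = get_path(item, path)
        if value is None or not value < cond["$lt"]:
            return False
    return True


small = {"metadata": {"system": {"size": 20_000}}}
large = {"metadata": {"system": {"size": 5 * ONE_MB}}}

print(json.dumps(size_filter))       # text to paste into the editor
print(matches(small, size_filter))   # the small item passes the filter
print(matches(large, size_filter))   # the large item is held back
```

With an empty dictionary the helper passes every item, which mirrors the idea that an empty JSON filter places no restriction on the connection.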


To remove a filter, hover over the connection and click the X icon. This severs the connection between the output and input points, so you will need to reconnect the nodes. Alternatively, you can remove the filter by setting it to an empty JSON object { }.


🔄 Reset Pipelines

Resetting a pipeline clears all counters, as well as any pending items and cycles, without deleting existing logs or executions.

  1. Go to Pipelines from the left-side navigation panel.
  2. Identify the pipeline and click its name. The pipeline page is displayed.
  3. Click Actions and select Reset pipeline from the list.
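
The effect of a reset can be pictured with a small in-memory model. This is only an illustration of the behavior described above: the `PipelineState` class and its fields are hypothetical and do not correspond to Dataloop SDK objects, and treating "cleared" counters as counters set back to zero is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class PipelineState:
    """Hypothetical stand-in for a pipeline's runtime state."""
    counters: dict = field(default_factory=dict)
    pending_items: list = field(default_factory=list)
    cycles: list = field(default_factory=list)
    logs: list = field(default_factory=list)
    executions: list = field(default_factory=list)

    def reset(self):
        # Clear counters, pending items, and cycles ...
        self.counters = {name: 0 for name in self.counters}
        self.pending_items.clear()
        self.cycles.clear()
        # ... but leave logs and executions untouched.


state = PipelineState(
    counters={"success": 12, "failed": 3},
    pending_items=["item-7"],
    cycles=["cycle-1", "cycle-2"],
    logs=["node A started"],
    executions=["exec-1", "exec-2"],
)
state.reset()
print(state.counters, state.pending_items, state.cycles)
print(state.logs, state.executions)
```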

➕ Add Secrets to a Pipeline

Users can configure storage integrations with popular cloud storage services and set up private-registry integrations for more secure and efficient container management. The available types are Key-Value, AWS S3, Google Cloud Storage, Azure Blob, and Google Container Registry.

  1. Go to Pipelines from the left-side navigation panel.
  2. Identify the pipeline and click the Action (three dots) icon.
  3. Click Manage Secrets from the list.
  4. Select existing secrets and integrations from the list, or click the + icon to create a new one. The New Integration popup is displayed.
    1. Enter a unique name for your secret.
    2. Enter the secret value that you created on the Data Governance > Secrets page.
    3. Click Create.
  5. Click Confirm.

For more information about Dataloop secrets, see the Secrets article.


Pipeline Cycles

A Pipeline cycle refers to all node executions performed on a single pipeline run (usually over a specific item); the executions are listed in the order in which they occurred. Since some items may be routed differently in the pipeline based on filters and user actions, each cycle may have a different number of executions.
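
One way to picture a cycle is as a per-item record of node executions kept in the order they occurred. The sketch below is illustrative only: the `Execution` and `Cycle` classes, the node names, and the item IDs are hypothetical and are not Dataloop SDK types.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Execution:
    node: str             # name of the node that ran
    started_at: datetime  # when this node execution started


@dataclass
class Cycle:
    item_id: str
    executions: list = field(default_factory=list)

    def record(self, node, started_at):
        self.executions.append(Execution(node, started_at))

    def in_order(self):
        """Executions sorted by the time they occurred."""
        return sorted(self.executions, key=lambda e: e.started_at)


t0 = datetime(2025, 5, 23, 9, 0, 0)

# Item A passes every filter and visits three nodes.
cycle_a = Cycle("item-a")
cycle_a.record("export", t0 + timedelta(seconds=20))
cycle_a.record("ingest", t0)
cycle_a.record("predict", t0 + timedelta(seconds=10))

# Item B is stopped by a filter after the first node.
cycle_b = Cycle("item-b")
cycle_b.record("ingest", t0)

print([e.node for e in cycle_a.in_order()])
print(len(cycle_a.executions), len(cycle_b.executions))
```

The two cycles end up with different execution counts because the items took different routes through the pipeline.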

Cycle Status

You can filter the list of pipeline cycles by status. The available statuses are:

View Cycle Details

Select a cycle from the list to see its details, including first (node) execution time, last (node) execution time, number of executions, etc.

Clicking the play button will show the item's progress in the pipeline, highlighting the nodes involved in processing the item.

Rerun Pipeline Cycles

Users can rerun pipeline cycles, either by selecting executions as rerun starting points or by selecting entire cycles, so that the cycles do not have to be regenerated after a failure. Rerunning a cycle removes the item's status from a task, so the item is reinserted into an assignment.

  1. Select Pipelines from the left-side menu.

  2. Open the pipeline from the list.

  3. Click outside the pipeline (clicking on a node does not display the Pipeline Cycles).

  4. Click Rerun Cycles.

  5. Select the cycles from the list. Use the cycle status filter, if required.

  6. Click Rerun # Cycles. A Rerun cycle popup is displayed. Before confirming, choose from the following execution cycle types:

    • Resume Cycle: Continues the selected executions from the current node onward.
    • Restart Cycle: Restarts the entire execution cycle from the beginning.

  7. Click Confirm.
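
The difference between the two rerun types can be sketched as a choice of starting point. The function below is a simplified illustration that assumes a linear cycle (real pipelines may branch); `nodes_to_rerun`, its parameters, and the node names are hypothetical.

```python
def nodes_to_rerun(cycle_nodes, current_node, mode):
    """Return the nodes that would run again.

    cycle_nodes: node names in the order they ran in the cycle.
    current_node: the node of the selected execution.
    mode: "resume" or "restart".
    """
    if mode == "restart":
        # Restart Cycle: run the whole cycle again from the beginning.
        return list(cycle_nodes)
    if mode == "resume":
        # Resume Cycle: continue from the current node onward.
        start = cycle_nodes.index(current_node)
        return list(cycle_nodes[start:])
    raise ValueError(f"unknown mode: {mode!r}")


nodes = ["ingest", "predict", "export"]
print(nodes_to_rerun(nodes, "predict", "resume"))
print(nodes_to_rerun(nodes, "predict", "restart"))
```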

View Executions in a Cycle

In the Pipeline Cycle list, click the number in the executions column to drill in and see each execution. This allows you to browse the executions, see the highlighted node on the Pipeline canvas (which enables you to monitor the item’s progress in the pipeline), and see the execution details (input, output, item with item link, execution time, and duration).

  • Use the keyboard Up ⬆️ and Down ⬇️ arrows to browse between the executions and trace the item’s progress over the canvas.

  • Select an execution from the list to see its details, including the function used in the execution, the input, and the output.

Rerun a Failed Execution in a Pipeline Cycle

You can rerun only failed executions.

To rerun the failed execution, click on the play icon next to the execution. A confirmation message is displayed, and the execution status will change to Rerun.

Pausing a Pipeline with Running Cycles

When you pause a pipeline that has pending or running cycles, the cycles' status is updated to "Paused" and the cycles stop running. When you resume the pipeline, a dialog opens with two options:

  1. Resume all available cycles (pending/in-progress)

  2. Abort all available cycles (pending/in-progress); the cycles get the "Terminated" status

    Aborted Cycles

    At the moment, aborted cycles are automatically filtered out of the cycles list in the side panel (they can be displayed by filtering for cycles with the "Terminated" status) and are also excluded from the pipeline "statistics bar" counters.

If the pipeline was modified while paused and you choose to resume it, the resumed cycles will continue to run according to the new pipeline composition.
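
The pause, resume, and abort behavior described above can be sketched as a few status transitions. This is an illustrative model only: the dictionary layout, the lowercase status strings, and the `pause`, `resume`, and `counted` helpers are hypothetical, and restoring a resumed cycle to its previous status is an assumption.

```python
ACTIVE = {"pending", "in-progress"}


def pause(cycles):
    """Mark active cycles as paused, remembering their previous status."""
    for c in cycles:
        if c["status"] in ACTIVE:
            c["before_pause"] = c["status"]
            c["status"] = "paused"


def resume(cycles, abort=False):
    """Resume paused cycles, or abort them when abort=True."""
    for c in cycles:
        if c["status"] == "paused":
            previous = c.pop("before_pause")
            c["status"] = "terminated" if abort else previous


def counted(cycles):
    """Cycles listed by default and included in the statistics counters."""
    return [c for c in cycles if c["status"] != "terminated"]


cycles = [
    {"id": 1, "status": "pending"},
    {"id": 2, "status": "in-progress"},
    {"id": 3, "status": "success"},
]
pause(cycles)
print([c["status"] for c in cycles])
resume(cycles, abort=True)
print([c["status"] for c in cycles])
print([c["id"] for c in counted(cycles)])
```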

Pause Action Limitations

The "pause" action may not immediately halt all pipeline activity.
Node executions that have already started running are not affected and are paused only once the current execution completes. Additionally, cycles that are waiting in a node queue at the time of the pause may still be executed on that node before being paused.


Node Executions

In a Dataloop pipeline, each task (called a node) performs a specific function—like running a model, transforming data, or exporting results. A node execution refers to one complete run of that node during a pipeline cycle.

For example, if your pipeline runs daily, then each node in that pipeline will have a new execution each day.

🔍 Viewing Node Execution History

You can track the status of each node's executions to monitor progress or debug issues.

  • Click on any node in the pipeline diagram.
  • A right-hand panel will open showing a list of executions related to that node.
  • Executions are listed by their status: success, failed, pending, etc.

Node Executions Status

You can filter the list of executions by status. The available statuses are:

📄 Viewing Node Execution Details

When you click on a specific execution, you’ll see detailed information, including:

  • Info – Basic metadata like timestamps and execution ID.
  • Input – The data that was sent into the node.
  • Output – The result produced by the node.
  • Error – Any error messages, if the execution failed.

If needed, you can rerun executions to improve results or address issues.

Rerun Node Executions

If something went wrong, or you want to improve the output, you can rerun that specific execution directly from the panel. This is helpful when fixing configuration errors or updating models.

Rerun a Failed Execution

You can rerun a particular failed execution from the list. Click on the Rerun Execution icon next to the failed execution. After clicking, the Failed status will be changed to Rerun.

Rerun All Executions

  1. Select Pipelines from the left-side menu.

  2. Open the pipeline from the list.

  3. Select a node from the pipeline. A right-side panel will be displayed.

  4. Click Rerun Executions.

  5. Select the executions from the list. Use the execution status filter, if required.

  6. Click Rerun # Executions. A Rerun Cycles of the Selected Executions popup is displayed. Before confirming, choose from the following execution cycle types:

    • Resume Cycle: Continues the selected executions from the current node onward.
    • Restart Cycle: Restarts the entire execution cycle from the beginning.

  7. Click Confirm.

⏸️ Pausing Pipelines with Active Event Triggers

You can keep the pipeline's event triggers active while the pipeline is paused, so you won't lose events while editing the pipeline.


❗ Errors When Scaling Up a Service

Errors that occur in the service restart loop can affect Code/FaaS nodes and may prevent the node service from scaling up. Possible causes include problems with the defined requirements or with the specified Docker image. When such errors occur, an indicator is displayed on the corresponding pipeline node, and details about the affected service appear on the Services page in CloudOps.

  • Pipeline service error indication: An error message appears on the pipeline nodes page during the execution of a pipeline if its associated service has failed.
  • Pipeline node error when executing a pipeline with an inactive service: An error message appears during the execution of a pipeline if its associated service is inactive.

❗ Overcoming Execution Errors

After resolving the root cause of any problem that caused item executions to fail (e.g., code problems in packages or insufficient compute resources), you can rerun the execution of the failed items:

  • From the side panel: select the node, switch to the Executions tab, select the Failed filter option, hover over an item, and click the Rerun button.
  • From the CloudOps > Executions page: in the search field, filter by pipeline and by execution.status: failed, and select Rerun All.
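
The selection logic behind "filter by pipeline and failed status, then rerun all" can be sketched over a plain list of records. The record layout, the IDs, and the `rerun_failed` helper are hypothetical; the sketch only mirrors the rule that failed executions are the ones rerun and that their status then changes to Rerun.

```python
def rerun_failed(executions, pipeline_id):
    """Mark the failed executions of one pipeline as 'rerun'; return their IDs."""
    rerun_ids = []
    for e in executions:
        if e["pipeline_id"] == pipeline_id and e["status"] == "failed":
            e["status"] = "rerun"
            rerun_ids.append(e["id"])
    return rerun_ids


executions = [
    {"id": "e1", "pipeline_id": "p1", "status": "failed"},
    {"id": "e2", "pipeline_id": "p1", "status": "success"},
    {"id": "e3", "pipeline_id": "p2", "status": "failed"},
]
print(rerun_failed(executions, "p1"))
print([e["status"] for e in executions])
```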

Node Actions

After you select a node, clicking Actions in the right-side panel lets you carry out tasks that are specific to the selected node. The following table lists the actions available on pipeline nodes.

Node Specific Actions

The actions displayed for each node depend on the node's type.


➕ Add a New Model Node

Dataloop lets you install AI/ML models that are hosted and executed on:

  • Dataloop's Managed Compute (internal infrastructure): The models run on Dataloop's compute.

  • External Compute Providers (e.g., OpenAI, Azure, GCP, IBM, NVIDIA) via API Service Integration: The models run on the external provider's compute, which requires secret credentials to complete the installation.

  1. Open the Pipelines page.

  2. Click on the pipeline from the list. The pipeline page is displayed.

  3. In the Node Library on the left-side panel, click on the ➕ icon. The Add Pipeline Nodes page of the Marketplace is displayed.

  4. In the Models tab, select the required node from the list. By default, the model nodes are sorted by Installation Status.

  5. Click Add Node from the right-side panel. The Install Model Application popup is displayed.

  6. Select the required model version from the list and click Proceed. A popup is displayed to select the API key to complete the integration process if needed.

  7. Select an API Key, Secret or an Integration, as required. If not available:

    1. If there is no secret, click Add New Secret and follow the steps.
    2. To set the integration later, click Set Up Later.
  8. Click Install Model. A confirmation message is displayed, and the model node will be added to the Node Library.

    1. To view application details, click on the View Application link. It navigates you to the CloudOps page.
    2. To view the model, go to the Models page.

➕ Add an Application Node

Dataloop lets you install applications that are hosted and executed on:

  • Dataloop's Managed Compute (internal infrastructure): The applications run on Dataloop's compute.

  • External Compute Providers (e.g., OpenAI, Azure, GCP, IBM, NVIDIA) via API Service Integration: The applications run on the external provider's compute, which requires secret credentials to complete the installation.

  1. Open the Pipelines page.

  2. Click on the pipeline from the list. The pipeline page is displayed.

  3. In the Node Library on the left-side panel, click on the ➕ icon. The Add Pipeline Nodes page of the Marketplace is displayed.

  4. Select the Applications tab.

  5. Select the required application node from the list. By default, the application nodes are sorted by Installation Status.

  6. Click Add Node from the right-side panel.

  7. Click Proceed. A popup is displayed to select the secret key, API key, or integration to complete the integration process, if needed.

    1. If there is no secret key, click Add New Secret and follow the steps.
    2. To set the integration later, click Set Up Later.
  8. Click Install Application. A confirmation message is displayed, and the application node will be added to the Node Library.

  9. To view application details, click on the View Application link. It navigates you to the CloudOps page.


Install a Trainable Model

If no trainable models are available in the Train Model node, follow these instructions to install one:

  1. Click Install Foundation Model. The Select a Foundation Model page is displayed.
  2. Select a trainable model and click Install. The Install Model Application popup is displayed.
  3. Select the required model variation from the list and click Install Model. A success message is displayed. Click View Model to view the newly installed model on the Model Management page. The newly installed model is also selected in the node configuration section.

Install a Model

If no models are available in the Evaluate Model node, follow these instructions to install one:

  1. Click Install Foundation Model. The Select a Foundation Model page is displayed.
  2. Select a model and click Install. The Install Model Application popup is displayed.
  3. Select the required model variation from the list and click Install Model. A success message is displayed, and the newly installed model is available on the Model Management page.

Once installed, the model is available for selection for the evaluation.


Set Pipeline Triggers

Dataloop allows you to set triggers on pipeline nodes.


🚮 Remove a Model Node from the Node Library

Important

The node will be removed only from the Node Library but will remain available in the Marketplace under the Models tab for re-adding.

  1. Open the Pipelines page.
  2. Click on the pipeline from the list. The pipeline page is displayed.
  3. In the Node Library on the left-side panel, identify the node from the list and hover over it.
  4. Click on the X icon. A confirmation message is displayed.
  5. Click Remove. The removal success confirmation message is displayed.

Re-add a Pipeline Node to the Node Library

Follow these steps to restore a removed node to the Node Library:

  1. Open the Pipelines page.
  2. Click on the pipeline from the list. The pipeline page is displayed.
  3. In the Node Library on the left-side panel, click the + icon (Add new nodes). The Models tab of the Marketplace is displayed.
  4. Identify the model that you want to re-add to the library and click Add Model -> Restore Pipeline Node. A confirmation message is displayed.

Re-adding pipeline node scenarios

  • For uninstalled applications: The button will display Add Node.
  • For installed applications:
    • If a node shortcut exists, the button will display Add Model.
    • If there is no node shortcut (removed), the button will display Add Model with a dropdown offering:
      • Add Model
      • Restore Pipeline Node
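
The scenarios above amount to a small decision table, sketched below. The function name and its two boolean parameters are hypothetical and only encode the rules as listed; they are not part of any Dataloop API.

```python
def add_button_options(installed, has_node_shortcut):
    """Return the button options shown for a model in the Marketplace.

    installed: whether the application is already installed.
    has_node_shortcut: whether its node shortcut still exists in the Node Library.
    """
    if not installed:
        return ["Add Node"]
    if has_node_shortcut:
        return ["Add Model"]
    # Installed, but the node shortcut was removed: the button offers a dropdown.
    return ["Add Model", "Restore Pipeline Node"]


print(add_button_options(installed=False, has_node_shortcut=False))
print(add_button_options(installed=True, has_node_shortcut=True))
print(add_button_options(installed=True, has_node_shortcut=False))
```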

📋 Copy Pipeline ID

  1. Go to Pipelines from the left-side navigation panel.
  2. Identify the pipeline and click the Action (three dots) icon.
  3. Click Copy Pipeline ID from the list. The pipeline ID will be copied to the clipboard.

✏️ Rename Pipelines

  1. Go to Pipelines from the left-side navigation panel.
  2. Identify the pipeline and click the Action (three dots) icon.
  3. Click Rename Pipeline from the list.
  4. Make the update and click Confirm. A confirmation message is displayed.

👁️ View Pipeline's Execution Data

  1. Go to Pipelines from the left-side navigation panel.
  2. Identify the pipeline and click the Action (three dots) icon.
  3. Click View Executions from the list. The CloudOps → Executions tab is displayed.

👁️ View Pipeline's Log Data

  1. Go to Pipelines from the left-side navigation panel.
  2. Identify the pipeline and click the Action (three dots) icon.
  3. Click View Logs from the list. The CloudOps → Logs tab is displayed.

👁️ View Pipeline's Audit Logs

  1. Go to Pipelines from the left-side navigation panel.
  2. Identify the pipeline and click the Action (three dots) icon.
  3. Click View Audit Logs from the list. The Audit Logs page is displayed.

🚮 Delete Pipelines

Warning

All the applications created by the Pipeline will be deleted.

  1. Go to Pipelines from the left-side navigation panel.
  2. Identify the pipeline and click the Action (three dots) icon.
  3. Click Delete Pipeline from the list. A confirmation message is displayed.
  4. Click Yes. A confirmation message is displayed.

