Manage Pipelines
  • 14 May 2025

Article summary

This article is a detailed guide to managing pipelines in the Dataloop platform. It covers tasks such as filtering data between nodes, resetting a pipeline, adding secrets, rerunning cycles, and managing pipeline nodes.


Filtering Data Between Nodes

Hover over a connection between nodes and click the + sign to add a filter. With a filter in place, only data assets (i.e., items or annotations) that satisfy the filter condition are passed on to the next pipeline node.

You can select a previously saved filter (saved in the Dataset browser) or write a filter directly in the DQL editor.

For example, a filter can pass only items whose "size" attribute, found under "system" in the item "metadata", is less than 1 MB.

  • Notice that unlike the DQL editor in the Dataset Browser, in the pipeline DQL editor you do not need to include the attributes you filter by within a "filter" property.
  • The DQL property JOIN is not supported in the pipeline DQL filter.
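
As a rough illustration of the size filter described above, the sketch below builds the filter body in Python and checks it against sample item records. It assumes that the item size is stored in bytes under metadata.system.size and that the pipeline DQL editor accepts the MongoDB-style "$lt" operator. The helper names (build_size_filter, item_passes) are hypothetical, and the matcher is only a toy stand-in for the platform's own evaluation, not Dataloop code.

```python
import json

ONE_MB = 1024 * 1024  # assumption: sizes are in bytes and 1 MB means 1024 * 1024


def build_size_filter(max_bytes):
    """Return a pipeline-style DQL filter body (no outer "filter" property)."""
    return {"metadata.system.size": {"$lt": max_bytes}}


def item_passes(item, dql):
    """Toy matcher: supports only dotted field paths combined with "$lt"."""
    for path, condition in dql.items():
        value = item
        for key in path.split("."):
            if not isinstance(value, dict) or key not in value:
                return False  # a missing field never satisfies the condition
            value = value[key]
        if not value < condition["$lt"]:
            return False
    return True


size_filter = build_size_filter(ONE_MB)
print(json.dumps(size_filter))

small_item = {"metadata": {"system": {"size": 200_000}}}
large_item = {"metadata": {"system": {"size": 5_000_000}}}
print(item_passes(small_item, size_filter))  # the small item satisfies the filter
print(item_passes(large_item, size_filter))  # the large item does not
```

In this sketch an empty filter ({}) matches every item, which mirrors the effect of clearing the filter on a connection.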


To remove a filter, hover over the connection and click the X icon. This severs the connection between the output and input points, so you will need to reconnect the nodes. Alternatively, you can remove the filter by setting it to an empty JSON { }.


Reset Pipelines

Resetting a pipeline clears all counters, as well as any pending items and cycles, without deleting existing logs or executions.

  1. Go to Pipelines from the left-side navigation panel.
  2. Locate the pipeline and click its name. The pipeline page is displayed.
  3. Click Actions and select Reset pipeline from the list.
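
The effect of a reset described above can be pictured with a small model. This is a minimal sketch, not the platform's implementation: the PipelineState class and its field names are hypothetical, and it assumes that "clearing" a counter means setting it back to zero.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PipelineState:
    """Hypothetical snapshot of the data a reset touches or preserves."""
    counters: Dict[str, int] = field(default_factory=dict)
    pending_cycles: List[str] = field(default_factory=list)
    logs: List[str] = field(default_factory=list)
    executions: List[str] = field(default_factory=list)


def reset_pipeline(state: PipelineState) -> PipelineState:
    """Clear counters and pending cycles; keep logs and executions."""
    return PipelineState(
        counters={name: 0 for name in state.counters},  # assumption: cleared means zeroed
        pending_cycles=[],
        logs=list(state.logs),
        executions=list(state.executions),
    )


before = PipelineState(
    counters={"pending": 3, "completed": 12},
    pending_cycles=["cycle-7", "cycle-8", "cycle-9"],
    logs=["node A started", "node A finished"],
    executions=["exec-1", "exec-2"],
)
after = reset_pipeline(before)
print(after.counters, after.pending_cycles)
print(after.logs == before.logs, after.executions == before.executions)
```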

Add Secrets to a Pipeline

Users can configure storage integrations with popular cloud storage services and set up private-registry integrations for more secure and efficient container management. The available types are Key-Value, AWS S3, Google Cloud Storage, Azure Blob, and Google Container Registry.

  1. Go to Pipelines from the left-side navigation panel.
  2. Locate the pipeline and click its Action (three dots) icon.
  3. Click Manage Secrets from the list.
  4. Select an existing secret or integration from the list, or create a new one:
    1. Click the + icon. The New Integration popup is displayed.
    2. Enter a unique name for your secret.
    3. Enter the secret value that you created on the Data Governance > Secrets page.
    4. Click Create.
  5. Click Confirm.

For more information about Dataloop secrets, see Secrets.


Pipeline Cycles

A Pipeline cycle refers to all node executions performed on a single pipeline run (usually over a specific item); the executions are listed in the order in which they occurred. Since some items may be routed differently in the pipeline based on filters and user actions, each cycle may have a different number of executions.

Select a cycle from the list to see its details, including the first (node) execution time and the last (node) execution time.

Select an execution from the list to see its details, including the function used in the execution, the input, and the output.
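
One way to picture the relationship between cycles and executions is as a small data structure. The sketch below is illustrative only: the class and field names are hypothetical and do not reflect the platform's schema. It shows a cycle holding its executions in the order they occurred and deriving the first and last execution times from them.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Execution:
    node_name: str
    function_name: str
    started_at: datetime


@dataclass
class Cycle:
    item_id: str
    executions: List[Execution] = field(default_factory=list)

    def record(self, execution: Execution) -> None:
        """Add an execution and keep the list ordered by start time."""
        self.executions.append(execution)
        self.executions.sort(key=lambda e: e.started_at)

    @property
    def first_execution_time(self) -> Optional[datetime]:
        return self.executions[0].started_at if self.executions else None

    @property
    def last_execution_time(self) -> Optional[datetime]:
        return self.executions[-1].started_at if self.executions else None


cycle = Cycle(item_id="item-1")
cycle.record(Execution("review", "run", datetime(2025, 5, 14, 10, 5)))
cycle.record(Execution("ingest", "run", datetime(2025, 5, 14, 10, 0)))
print([e.node_name for e in cycle.executions])
print(cycle.first_execution_time, cycle.last_execution_time)
```

Because each cycle owns its own list, two items routed differently through the same pipeline simply end up with lists of different lengths.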


Clicking the play button will show the item's progress in the pipeline, highlighting the nodes involved in processing the item.

In the Pipeline Cycle list, click the number in the Executions column to drill in and see each execution. This allows you to browse the executions, see the highlighted node on the Pipeline canvas (which lets you monitor the item’s progress in the pipeline), and see the execution details (input, output, item with item link, execution time, and duration).

Use the Up and Down arrows to browse between the executions and trace the item’s progress over the canvas.


Rerun Pipeline Cycles

Users can rerun pipeline cycles either from selected executions, which serve as the rerun starting points, or as entire cycles. This avoids having to regenerate the cycles after a failure. Rerunning a cycle removes the item's status from its task, so the item is reinserted into an assignment.

Rerun Execution Cycles

When a node is selected, the Executions tab displays detailed information about each item processed through that node. If needed, you can rerun executions to improve results or address issues.

Click Rerun Executions to initiate the rerun process. Before confirming, choose from the following execution cycle types:

  • Resume Cycle: Continues the selected executions from the current node onward.
  • Restart Cycle: Restarts the entire execution cycle from the beginning.
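
The difference between the two options can be sketched as follows. This is a simplified model that assumes a cycle follows a single linear path of nodes; the function name, the mode strings, and the node names are hypothetical.

```python
from typing import List


def nodes_to_rerun(cycle_path: List[str], selected_node: str, mode: str) -> List[str]:
    """Return the nodes that would run again for the chosen rerun mode."""
    if selected_node not in cycle_path:
        raise ValueError(f"{selected_node!r} is not part of this cycle")
    if mode == "resume":
        # Resume Cycle: continue from the selected node onward.
        return cycle_path[cycle_path.index(selected_node):]
    if mode == "restart":
        # Restart Cycle: run the whole cycle again from the first node.
        return list(cycle_path)
    raise ValueError(f"unknown mode: {mode!r}")


path = ["ingest", "preprocess", "predict", "review"]
print(nodes_to_rerun(path, "predict", "resume"))
print(nodes_to_rerun(path, "predict", "restart"))
```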

Pausing a Pipeline with Running Cycles

When you pause a pipeline that has pending or running cycles, the status of those cycles changes to "Paused" and they stop running. When you resume the pipeline, a dialog opens offering two options:

  1. Resume all available cycles (pending/in-progress)

  2. Abort all available cycles (pending/in-progress): the cycles will get the "Terminated" status.

    Aborted Cycles

    At the moment, aborted cycles are automatically filtered out of the cycles list in the side panel (to display them, filter the cycles by the "Terminated" status). They are also excluded from the counters in the pipeline statistics bar.

If the pipeline was modified while paused and you choose to resume it, the resumed cycles will continue to run according to the new pipeline composition.
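
The pause and resume behavior described in this section can be modeled as a few status transitions. The sketch below is a toy model under stated assumptions: the lowercase status strings and all names are hypothetical, and a resumed cycle is assumed to return to the status it had before the pause.

```python
from dataclasses import dataclass
from typing import List, Optional

ACTIVE_STATUSES = {"pending", "in-progress"}


@dataclass
class Cycle:
    cycle_id: str
    status: str
    status_before_pause: Optional[str] = None


def pause_pipeline(cycles: List[Cycle]) -> None:
    """Mark every pending or in-progress cycle as paused."""
    for cycle in cycles:
        if cycle.status in ACTIVE_STATUSES:
            cycle.status_before_pause = cycle.status
            cycle.status = "paused"


def resume_pipeline(cycles: List[Cycle], abort: bool) -> None:
    """Resume paused cycles, or terminate them when abort is chosen."""
    for cycle in cycles:
        if cycle.status == "paused":
            cycle.status = "terminated" if abort else cycle.status_before_pause
            cycle.status_before_pause = None


def visible_cycles(cycles: List[Cycle], show_terminated: bool = False) -> List[Cycle]:
    """Terminated cycles are hidden unless explicitly requested."""
    return [c for c in cycles if show_terminated or c.status != "terminated"]


cycles = [Cycle("c1", "pending"), Cycle("c2", "in-progress"), Cycle("c3", "success")]
pause_pipeline(cycles)
print([c.status for c in cycles])
resume_pipeline(cycles, abort=True)
print([c.cycle_id for c in visible_cycles(cycles)])
```

The model does not capture the timing caveats of a real pause, such as executions that are already running when the pause is requested.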

Pause Action Limitations

The "pause" action may not immediately halt all pipeline activity.
Node executions that have already started running are not interrupted; their cycles are paused only after the current execution completes. Additionally, cycles that are waiting in a node queue when the pipeline is paused may still be executed on that node before they are paused.


Pausing Pipelines with Active Event Triggers

You can keep the pipeline's event triggers active while the pipeline is paused, so you won't lose events while editing the pipeline.


Errors to Scale Up a Service

Errors in a service restart loop can affect Code/FaaS nodes and may prevent the node service from scaling up. Possible causes include problems with the defined requirements or with the specified Docker image. When such errors occur, an indicator is displayed on the corresponding pipeline node, and details about the affected service are available on the Services page in CloudOps.

  • Pipeline service error indication: An error message appears on the pipeline nodes page during the execution of a pipeline if its associated service has failed.
  • Pipeline node error when executing a pipeline with an inactive service: An error message appears during the execution of a pipeline if its associated service is inactive.

Overcoming Execution Errors

After resolving the root cause of a problem that caused item executions to fail (e.g., code problems in packages or insufficient compute resources), you can rerun the failed executions in either of the following ways:

  • From the side-panel – select the node, switch to the Executions tab, select the Failed filter option, hover over an item, and click the Rerun button.
  • From the CloudOps > Executions page – in the search field, filter by pipeline and by execution.status: failed, and select Rerun All.
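
The selection that both options perform (executions of one pipeline whose status is failed) can be sketched over plain records. The field names pipeline_id and status, and the sample values, are hypothetical; the sketch only illustrates the filtering logic, not how the platform stores executions.

```python
from typing import Dict, List


def select_failed(executions: List[Dict[str, str]], pipeline_id: str) -> List[Dict[str, str]]:
    """Return the failed executions that belong to the given pipeline."""
    return [
        e for e in executions
        if e.get("pipeline_id") == pipeline_id and e.get("status") == "failed"
    ]


executions = [
    {"id": "exec-1", "pipeline_id": "pipe-a", "status": "success"},
    {"id": "exec-2", "pipeline_id": "pipe-a", "status": "failed"},
    {"id": "exec-3", "pipeline_id": "pipe-b", "status": "failed"},
]
to_rerun = select_failed(executions, "pipe-a")
print([e["id"] for e in to_rerun])
```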

Node Actions

After you select a node, clicking Actions in the right-side panel lets you carry out tasks that are specific to the selected node's function. The following table lists the actions available on pipeline nodes.

Node Specific Actions

The actions displayed for each node depend on the node's type.


Add a New Model Node

Dataloop lets you install AI/ML models that are hosted and executed on either of the following:

  • Dataloop's Managed Compute (internal infrastructure): The models run on Dataloop's compute.

  • External Compute Providers (e.g., OpenAI, Azure, GCP, IBM, NVIDIA) via API Service Integration: The models run on the external provider's compute, which requires secret credentials to complete the installation.

To add a model node:

  1. Open the Pipelines page.

  2. Click on the pipeline from the list. The pipeline page is displayed.

  3. In the Node Library on the left-side panel, click on the ➕ icon. The Add Pipeline Nodes page of the Marketplace is displayed.

  4. In the Models tab, select the required node from the list. By default, the model nodes are sorted by Installation Status.

  5. Click Add Node from the right-side panel. The Install Model Application popup is displayed.

  6. Select the required model version from the list and click Proceed. A popup is displayed to select the API key to complete the integration process if needed.

  7. Select an API key, secret, or integration, as required. If none is available:

    1. If there is no secret, click Add New Secret and follow the steps.
    2. To set the integration later, click Set Up Later.
  8. Click Install Model. A confirmation message is displayed, and the model node will be added to the Node Library.

    1. To view application details, click the View Application link. The CloudOps page is displayed.
    2. To view the model, go to the Models page.

Add an Application Node

Dataloop lets you install applications that are hosted and executed on either of the following:

  • Dataloop's Managed Compute (internal infrastructure): The applications run on Dataloop's compute.

  • External Compute Providers (e.g., OpenAI, Azure, GCP, IBM, NVIDIA) via API Service Integration: The applications run on the external provider's compute, which requires secret credentials to complete the installation.

To add an application node:

  1. Open the Pipelines page.

  2. Click on the pipeline from the list. The pipeline page is displayed.

  3. In the Node Library on the left-side panel, click on the ➕ icon. The Add Pipeline Nodes page of the Marketplace is displayed.

  4. Select the Applications tab.

  5. Select the required application node from the list. By default, the application nodes are sorted by Installation Status.

  6. Click Add Node from the right-side panel.

  7. Click Proceed. A popup is displayed to select the secret key, API key, or integration to complete the integration process, if needed.

    1. If there is no secret key, click Add New Secret and follow the steps.
    2. To set the integration later, click Set Up Later.
  8. Click Install Application. A confirmation message is displayed, and the application node will be added to the Node Library.

  9. To view application details, click the View Application link. The CloudOps page is displayed.


Install a Trainable Model

If no trainable models are available in the Train Model node, follow these steps to install one:

  1. Click Install Foundation Model. The Select a Foundation Model page is displayed.
  2. Select a trainable model and click Install. The Install Model Application popup is displayed.
  3. Select the required model variation from the list and click Install Model. A success message is displayed. Click View Model to view the newly installed model on the Model Management page. The newly installed model is also selected in the node configuration section.

Install a Model

If no models are available in the Evaluate Model node, follow these steps to install one:

  1. Click Install Foundation Model. The Select a Foundation Model page is displayed.
  2. Select a model and click Install. The Install Model Application popup is displayed.
  3. Select the required model variation from the list and click Install Model. A success message is displayed, and the newly installed model is available on the Model Management page.

Once installed, the model is available for selection in the evaluation.


Set Pipeline Triggers

Dataloop allows you to set triggers on pipeline nodes.


Remove a Model Node from the Node Library

Important

The node will be removed only from the Node Library but will remain available in the Marketplace under the Models tab for re-adding.

  1. Open the Pipelines page.
  2. Click on the pipeline from the list. The pipeline page is displayed.
  3. In the Node Library on the left-side panel, locate the node in the list and hover over it.
  4. Click the X icon. A confirmation message is displayed.
  5. Click Remove. A message confirming the removal is displayed.

Re-add a Pipeline Node to the Node Library

Follow these steps to restore a removed node to the Node Library:

  1. Open the Pipelines page.
  2. Click on the pipeline from the list. The pipeline page is displayed.
  3. In the Node Library on the left-side panel, click the + icon (Add new nodes). The Models tab of the Marketplace is displayed.
  4. Locate the model that you want to re-add to the library and click Add Model → Restore Pipeline Node. A confirmation message is displayed.

Re-adding pipeline nodes scenarios
  • For uninstalled applications: The button will display Add Node.
  • For installed applications:
    • If a node shortcut exists, the button will display Add Model.
    • If there is no node shortcut (removed), the button will display Add Model with a dropdown offering:
      • Add Model
      • Restore Pipeline Node
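
The scenarios above amount to a small decision rule, sketched below. The function name and its two boolean inputs are hypothetical; the button labels and dropdown options are taken from the list above.

```python
from typing import List, Tuple


def marketplace_button(installed: bool, has_node_shortcut: bool) -> Tuple[str, List[str]]:
    """Return the button label and its dropdown options (empty if none)."""
    if not installed:
        return "Add Node", []
    if has_node_shortcut:
        return "Add Model", []
    # Installed, but the node shortcut was removed from the Node Library.
    return "Add Model", ["Add Model", "Restore Pipeline Node"]


print(marketplace_button(installed=False, has_node_shortcut=False))
print(marketplace_button(installed=True, has_node_shortcut=True))
print(marketplace_button(installed=True, has_node_shortcut=False))
```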

Copy Pipeline ID

  1. Go to Pipelines from the left-side navigation panel.
  2. Locate the pipeline and click its Action (three dots) icon.
  3. Click Copy Pipeline ID from the list. The pipeline ID is copied to the clipboard.

Rename Pipelines

  1. Go to Pipelines from the left-side navigation panel.
  2. Locate the pipeline and click its Action (three dots) icon.
  3. Click Rename Pipeline from the list.
  4. Enter the new name and click Confirm. A confirmation message is displayed.

View Pipeline's Execution Data

  1. Go to Pipelines from the left-side navigation panel.
  2. Locate the pipeline and click its Action (three dots) icon.
  3. Click View Executions from the list. The CloudOps → Executions tab is displayed.

View Pipeline's Log Data

  1. Go to Pipelines from the left-side navigation panel.
  2. Locate the pipeline and click its Action (three dots) icon.
  3. Click View Logs from the list. The CloudOps → Logs tab is displayed.

View Pipeline's Audit Logs

  1. Go to Pipelines from the left-side navigation panel.
  2. Locate the pipeline and click its Action (three dots) icon.
  3. Click View Audit Logs from the list. The Audit Logs page is displayed.

Delete Pipelines

Warning

All the applications created by the Pipeline will be deleted.

  1. Go to Pipelines from the left-side navigation panel.
  2. Locate the pipeline and click its Action (three dots) icon.
  3. Click Delete Pipeline from the list. A confirmation prompt is displayed.
  4. Click Yes. A message confirming the deletion is displayed.

