Data Nodes

Dataset Node

The Dataset node lets you either generate a new dataset or leverage an existing one as a filter or storage container within your data pipeline. You can incorporate it at the beginning, middle, or end of your pipeline, depending on your data processing workflow.

Use the Dataset node at the beginning of the pipeline: it filters the triggered items, ensuring that the pipeline works exclusively with items from the selected dataset (and folder, if specified).

Executing Pipeline over Existing Data

When using the Dataset node at the start of the pipeline, the items in the dataset won't be automatically triggered when the pipeline activates. To trigger these items, use an event trigger, manually execute the pipeline via the Dataset Browser (by filtering the relevant items), or use the Dataloop SDK.
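For example, here is a minimal sketch of the SDK option, using the Dataloop Python SDK (dtlpy). The project, dataset, pipeline, and folder names are placeholders to replace with your own:

    import dtlpy as dl

    # Placeholder names -- replace with your own project, dataset, and pipeline.
    project = dl.projects.get(project_name='My Project')
    dataset = project.datasets.get(dataset_name='My Dataset')
    pipeline = project.pipelines.get(pipeline_name='my-pipeline')

    # Select the existing items to feed into the pipeline (here: everything under /train).
    filters = dl.Filters(field='dir', values='/train')

    # Execute the pipeline once per matching item.
    for item in dataset.items.list(filters=filters).all():
        pipeline.execute(execution_input={'item': item.id})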

Use the Dataset node as an intermediate or end point in the pipeline: it clones the delivered items to the root folder of the specified dataset, or to the selected folder if one is chosen. If an item already exists at the target location, it is skipped.

Details

When you click on a Dataset node, its details, such as Configuration, Executions, Logs, and available Actions, are shown on the right-side panel.

For the actions available on each node in the right-side panel, see the Pipeline Node Actions.

The Dataset node details are presented in three tabs as follows:

Config Tab

  • Dataset: Select an existing dataset, or click Create Dataset to create a new dataset. A Dataset node can only have one output channel and one input channel.
  • Set Fixed Dataset or Set Variable: Allows you to set the selected dataset either as a fixed dataset or as a pipeline variable.
  • Folder (Optional): Select a folder within the selected dataset. This option will not be accessible if no dataset is selected.
  • Trigger Existing Dataset and Folder Data to the Pipeline:
    • Enable this option to automatically load existing data into the pipeline's Dataset node upon activation, based on the chosen dataset, folder, and any DQL filter in the trigger (a minimal filter sketch follows this list).
    • This option is only available when this node is the start node.
    • Note: This is a one-time action. It does not re-trigger after changes to the dataset, folder, or filters, or if the pipeline is paused and resumed.
  • Node Input: Input channels are of type item by default. Click Set Parameter to set an input parameter for the Dataset node. For more information, see the Node Inputs article.
  • Node Output: Output channels are of type item by default.
  • Trigger (Optional): An Event/Cron trigger can be set on this node, enabling you to initiate the pipeline run from this specific point.
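As referenced above, here is a minimal sketch of building such a DQL filter with the Dataloop Python SDK; the /train folder and the annotated-items condition are illustrative assumptions:

    import dtlpy as dl

    # Build a DQL filter like the one you would attach to the trigger:
    # only annotated items under the /train folder.
    filters = dl.Filters(resource=dl.FiltersResource.ITEM)
    filters.add(field='dir', values='/train')
    filters.add(field='annotated', values=True)

    # prepare() returns the raw DQL JSON, which you can inspect or adapt.
    print(filters.prepare())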

For information on the Executions and Logs tabs, see the Node Details article.

Update Variable

The Update Variable node allows you to manage pipeline variables and update their values dynamically during pipeline execution.

  • You can select the required variables from the dropdown list.
  • The node input/output will be updated automatically according to your selection.
  • When the Update Variable node executes, its delivered input is set as the new value of the variable.
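Pipeline variables can also be read and updated from the Dataloop Python SDK. The following is a minimal sketch, assuming the SDK exposes a pipeline.variables list of name/value objects (verify against your SDK version); the variable name target_dataset is hypothetical:

    import dtlpy as dl

    pipeline = dl.pipelines.get(pipeline_id='<pipeline-id>')  # placeholder ID

    # Inspect the current variables and their values.
    for variable in pipeline.variables:
        print(variable.name, variable.value)

    # Set a new value outside a run. The Update Variable node does the runtime
    # equivalent, using its delivered input as the variable's new value.
    for variable in pipeline.variables:
        if variable.name == 'target_dataset':  # hypothetical variable name
            variable.value = '<new-dataset-id>'
    pipeline.update()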

Details

When you click on an Update Variable node, its details, such as Configuration, Executions, Logs, Instances, and available Actions, are shown on the right-side panel.

For the actions available on each node in the right-side panel, see the Pipeline Node Actions.

The Update Variable node details are presented in four tabs as follows:

Config Tab

  • Node Name: By default, the node is named Update Variable. Edit it as needed.
  • Variables: Select the required variables from the dropdown list, or create a new one.
  • Node Input: Set automatically after selecting a variable. Click Set Parameter to set an input parameter for the Update Variable node. For more information, see the Node Inputs article.
  • Node Output: Set automatically after selecting a variable.
  • Trigger (Optional): An Event/Cron trigger can be set on this node, enabling you to initiate the pipeline run from this specific point.

For information on the Executions, Logs, and Instances tabs, see the Node Details article.

Data Split Node

The Data Split node is a powerful data processing tool that allows you to randomly split your data into multiple groups at runtime. Whether you need to sample items for QA tasks or allocate your ground truth into training, test, and validation sets, the Data Split node simplifies the process.

Simply define the groups, set their distribution, and optionally tag each item with its assigned group. The tag will be appended to the item's metadata under metadata.system.tags (list). Use the Data Split node at any point in the pipeline to tailor the data processing.

Group limitations

  • Minimum groups: 2
  • Maximum groups: 5
  • Distribution must sum to 100%

For instance, to sample 20% of the annotated data for review (QA Task), create two groups ("Sampled"/"Not_Sampled") and set the required distribution (20-80). Afterward, add a node connection from the "Sampled" group to the QA task, ensuring that only 20% of the data is directed for QA during runtime.
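Downstream, the assigned tag can be used to query items with the SDK. The following is a minimal sketch, assuming tags are stored as a list under metadata.system.tags as described above and a group named "Sampled"; verify the exact operator against the DQL documentation:

    import dtlpy as dl

    dataset = dl.datasets.get(dataset_id='<dataset-id>')  # placeholder ID

    # Match items whose system tag list contains the group name assigned by the split.
    filters = dl.Filters(
        field='metadata.system.tags',
        values=['Sampled'],
        operator=dl.FiltersOperations.IN,
    )

    for item in dataset.items.list(filters=filters).all():
        print(item.name)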

Node Actions Menu

For the actions available on each node in the right-side panel, see the Pipeline Node Actions.

The Data Split node details are presented in four tabs as follows:

Config Tab

  • Node Name: Display name on the canvas.
  • Groups and Distribution: Allows you to create groups and manage the data distribution (%). You must define at least 2 and at most 5 groups.
  • Distribute equally: Select this option to force equal distribution across the groups.
  • Group Name and Distribution fields: Enter the name for the groups and add distribution percentages.
  • Item Tags:
    • Tag items based on their assigned group name: Selected by default, this option adds a metadata tag to each item once it is assigned to a group. The tag is the group name and is added to the item's metadata field metadata.system.tags (list).
    • Override existing item tags: When you select this option, tags that already exist on an item are replaced with the newly assigned tag. This option is disabled if you clear the option above.

  • Node Input: The item that will be automatically assigned to a group (randomly, based on the required distribution). Click Set Parameter to set an input parameter for the Data Split node. For more information, see the Node Inputs article.
  • Node Output: The output is set automatically according to the defined groups.
  • Trigger (Optional): An Event/Cron trigger can be set on this node, enabling you to initiate the pipeline run from this specific point.

For information on the Executions, Logs, and Instances tabs, see the Node Details article.


