Create Labeling Tasks
  • 21 Apr 2025

A Labeling Task in Dataloop is a defined unit of work that assigns a specific set of data items—such as images, videos, audio, or documents—to one or more annotators for labeling.

Create a labeling task

To create a labeling task, follow the steps below:

Information

Once a step is completed, it appears in green with a checkmark next to it in the step list. If a step is incomplete, a red exclamation mark is displayed instead.

  1. Open Labeling from the left-side menu.
  2. Click Create Task. The task type selection popup is displayed.
  3. Select Labeling from the popup.
  4. Click Continue. Complete the following sections to finish creating the task.

1. General

In the General section, enter or select the required details:

  1. Task Name: Enter a name for your new task. By default, the project name followed by (total number of tasks + 1) is displayed.
    For example, if the project name is abc and you already have 5 tasks, the default name of the new task is abc-6.
  2. Owner: By default, the current user's email ID is displayed. Click on it to select a different owner from the list.
  3. Priority: Select a priority from the list. By default, Medium is selected.
  4. Completion Due Date (Optional): Select a task's due date from the calendar.
  5. Click Next: Data Source.
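
The default task name rule described above can be sketched in a few lines of Python. This is only an illustration of the naming rule, not Dataloop's implementation: the function name `default_task_name` is hypothetical, and the hyphen separator is inferred from the abc-6 example.

```python
def default_task_name(project_name: str, existing_task_count: int) -> str:
    """Return the pre-filled task name: project name plus (task count + 1).

    Hypothetical helper; the '-' separator is assumed from the 'abc-6' example.
    """
    if existing_task_count < 0:
        raise ValueError("existing_task_count must be non-negative")
    return f"{project_name}-{existing_task_count + 1}"


# Usage: a project named "abc" that already has 5 tasks.
print(default_task_name("abc", 5))  # abc-6
```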

2. Data source

In the Data Source section, enter or select the required details:

  1. Select Dataset: By default, the selected dataset name is displayed. Click on it to select a different dataset. The Dataset field is disabled if you have selected specific items in the dataset.

    Note:

    You cannot create a task from a dataset that contains 80,000 or more items. To use such a dataset, apply sampling or select another dataset. You can view the total number of items on the top-right side of the page.

  2. Filters (Optional): Refine data selection by selecting specific folders, using DQL filters, or sub-sampling (randomly and equally distributed). The Folder and DQL fields are active only if you have not selected any items in the dataset.

    1. Folders: Select a folder from the dataset, if available.
    2. Selected Filters / Saved DQL Query: Select a filter or saved DQL query from the list, if available.
    3. Data Sampling: Enter the Percentage or Number of Items for the task. Data sampling does not guarantee an exact number of items.
      1. Percentage: This option selects items randomly, so the resulting count is approximate. For example, in a dataset of four items, 100% corresponds to all four items and 75% corresponds to three items, but the actual selection can be 1/4, 3/4, or 4/4 of the items.
      2. Number of Items: This option selects items sequentially from the start of the dataset, not randomly.
    4. Collections: Choose a collection from the list to filter and display items within the selected collection.
  3. WebM Conversion (Optional): By default, Enforce WEBM conversion of video items for frame-accurate annotations is selected.

  4. Click Next: Instructions.
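
To make the difference between the two sampling modes and the item limit more concrete, here is a minimal Python sketch. It is not Dataloop's implementation: all names (`MAX_TASK_ITEMS`, `exceeds_task_limit`, `sample_by_percentage`, `sample_by_count`) are hypothetical, and the percentage mode is modeled as an independent random draw per item, which is one assumed way the inexact item count described above could arise.

```python
import random
from typing import List, Optional, Sequence

# Threshold taken from the note above: 80,000 items or more is not allowed.
MAX_TASK_ITEMS = 80_000


def exceeds_task_limit(total_items: int) -> bool:
    """True if the dataset is too large to be used for a task without sampling."""
    return total_items >= MAX_TASK_ITEMS


def sample_by_percentage(
    items: Sequence[str], percentage: float, seed: Optional[int] = None
) -> List[str]:
    """Keep each item independently with probability percentage/100.

    Assumed model: because every item is drawn independently, the number
    of selected items is only approximately percentage% of the dataset.
    """
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    rng = random.Random(seed)
    return [item for item in items if rng.random() < percentage / 100]


def sample_by_count(items: Sequence[str], count: int) -> List[str]:
    """Take the first `count` items in dataset order (sequential, not random)."""
    if count < 0:
        raise ValueError("count must be non-negative")
    return list(items[:count])


# Usage with a four-item dataset.
dataset = ["img_1", "img_2", "img_3", "img_4"]
print(exceeds_task_limit(len(dataset)))    # False
print(sample_by_count(dataset, 3))         # ['img_1', 'img_2', 'img_3']
print(sample_by_percentage(dataset, 75))   # a random subset; size varies per run
```

Under this model, a 75% sample of four items averages three items but can return any count from zero to four, which is why an exact count requires the Number of Items option.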

3. Instructions

In the Instructions section, enter or select the required details. The number of Labels and Attributes is displayed on the top-right side of the page:

  1. Recipe: By default, the default recipe is displayed. Select a recipe from the list, if needed.
  2. Labeling Instructions (.pdf): The labeling instruction document is displayed, if available. Go to the Recipe section to upload a PDF instruction. You can select the page range accordingly.
  3. Click Next: Statuses. The Statuses section is displayed.

4. Statuses

In the Statuses section, enter or select the required details:

  1. By default, the Completed status is selected. Click Add New Status to add a new status.
  2. Click Next: Assignments.

5. Assignments

In the Assignments section, enter or select the required details:

Impact on Quality Tasks and Assignees

Switching the allocation method from Distribution to Pulling will disable all Quality tasks (such as consensus, honeypot, and qualification). Additionally, any existing task assignees will be cleared. You'll be prompted with confirmation dialogs to review and approve these changes before they take effect.

  1. Allocation Method: Select one of the following allocation methods:
    1. Pulling: With the pulling allocation method, annotators pull one batch of items at a time, up to the maximum number of items allowed in an assignment. If required, you can change the following fields: Pulling batch size (items) and Max items in an assignment.
    2. Distribution: With the distribution allocation method, items are distributed in advance among users, either equally or based on a custom percentage. The Auto Distribution option distributes the task equally among the users. By default, it is checked.
  2. Available Users: Search for or select users from the list, and click the Forward arrow icon to add them to the Assigned users list.
  3. Assigned Users:
    1. Search for or view the assigned users from the list. The allocation percentage is equally distributed if you select Auto Distribution.
    2. Select users and click the Backward arrow icon to remove them from the Assigned users list.

Inactive users

Inactive users are grayed out and disabled for redistribution, and available for reassignment.

  4. Click Create Task. The newly created labeling task will be available in the tasks list.
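
The two allocation methods can be contrasted with a short Python sketch. It is a simplified model under stated assumptions, not the platform's implementation: `auto_distribute` and `pull_batch` are hypothetical names, equal distribution is modeled as a round-robin split, and pulling is modeled as taking items from a shared queue, limited by the batch size and by the maximum number of items in an assignment.

```python
from typing import Dict, List, Sequence


def auto_distribute(
    items: Sequence[str], assignees: Sequence[str]
) -> Dict[str, List[str]]:
    """Distribution: split all items up front, as evenly as possible.

    Round-robin is an assumed strategy; loads differ by at most one item.
    """
    if not assignees:
        raise ValueError("at least one assignee is required")
    allocation: Dict[str, List[str]] = {user: [] for user in assignees}
    for index, item in enumerate(items):
        allocation[assignees[index % len(assignees)]].append(item)
    return allocation


def pull_batch(
    queue: List[str], already_assigned: int, batch_size: int, max_items: int
) -> List[str]:
    """Pulling: remove and return the next batch from the shared queue.

    The batch is capped by the batch size, by the room left in the
    annotator's assignment, and by the number of items still queued.
    """
    room = max(0, max_items - already_assigned)
    take = min(batch_size, room, len(queue))
    batch = queue[:take]
    del queue[:take]
    return batch


# Usage: five items split between two annotators in advance.
print(auto_distribute(["a", "b", "c", "d", "e"], ["u1", "u2"]))
# {'u1': ['a', 'c', 'e'], 'u2': ['b', 'd']}

# Usage: one annotator pulling with batch size 5 and a maximum of 8 items.
queue = [f"item_{i}" for i in range(10)]
first = pull_batch(queue, already_assigned=0, batch_size=5, max_items=8)
second = pull_batch(queue, already_assigned=len(first), batch_size=5, max_items=8)
print(len(first), len(second), len(queue))  # 5 3 2
```

In the distribution model every item has an owner before work starts, whereas in the pulling model items stay in a shared queue until an annotator requests them.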

Use pipelines to automate a labeling task

Dataloop allows you to create a labeling task workflow from a pipeline template to streamline and manage the labeling process efficiently.

To create the workflow, follow these steps:

  1. Open Labeling from the left-side menu.
  2. Click Create Task. The task type selection popup is displayed.
  3. Select From Pipeline Template from the list, and click Continue.
  4. Select a Workflow widget and verify the details displayed on the right-side panel. A preview of the template with the available nodes is displayed.
  5. Click Create Workflow. A new pipeline workflow page is displayed, where you can start building your workflow by using various pipeline nodes.