Annotation and QA tasks
  • 18 Jul 2024

Annotation Task

An annotation task is a task for labeling content, such as text, audio, images, or video, to organize the dataset efficiently. Machine learning models use these annotations to learn and make predictions.

Create an Annotation Task

To create an annotation task, use one of the following options:

  • Labeling: Click Labeling in the left-side menu, and then click Create Task.
  • Data: Open Data from the left-side panel.
Information

When you create a task from the Dataset browser, it includes:

  • Specifically selected items (CTRL + left mouse button), if any are selected.
  • All items in the search query results. For example, querying on user metadata lets you create tasks in a project-specific context.
  • All items in the Dataset, if there is no active query and no items are selected.
  • Items from a specific folder.
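The scoping rules in the list above can be sketched as a small resolver. This is only an illustration: the function name, its parameters, and the order of precedence between a selection and a query are assumptions, not part of the Dataloop platform or SDK.

```python
from typing import List, Optional, Sequence


def resolve_task_items(
    all_items: Sequence[str],
    selected: Optional[Sequence[str]] = None,
    query_results: Optional[Sequence[str]] = None,
) -> List[str]:
    """Return the items a new task would include (hypothetical helper).

    Assumed precedence: explicit selection, then an active query
    (which may also express a folder filter), then the whole dataset.
    """
    if selected:
        # Specifically selected items take priority.
        return list(selected)
    if query_results is not None:
        # An active search query or folder filter narrows the dataset.
        return list(query_results)
    # No selection and no active query: use every item in the dataset.
    return list(all_items)


dataset = ["img1.jpg", "img2.jpg", "img3.jpg"]
print(resolve_task_items(dataset, selected=["img2.jpg"]))
print(resolve_task_items(dataset))
```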

Once a step is completed, it appears in green with a checkmark next to it in the step list. A red exclamation mark is displayed if the step is incomplete.

  1. Open Data from the left-side panel.
  2. Select and open the dataset for which you need to create an annotation task.
  3. Click Dataset Actions.
  4. Select Labeling Tasks > Create New Task. Enter the required data as described in the following sections. To jump to a particular section, select it from the left-side menu.

Section 1. General

  1. Enter or select the required details in the General section:
    1. Task Name: Enter a name for the new task. By default, the project name followed by (total number of tasks + 1) is displayed.
      For example, if the project name is abc and you already have 5 tasks, the default name of the new task is abc-6.
    2. Task Type: Select the Labeling type.
    3. Owner: By default, the current user's email ID is displayed. Click on it to select a different owner from the list.
    4. Status: By default, the To Do status is displayed, and it cannot be changed.
    5. Priority: Select a priority from the list. By default, Medium is selected.
    6. (Optional) Completion Due Date: Select a task's due date from the calendar.
  2. Click Next: Data Source.
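The default Task Name rule described above is simple arithmetic. A minimal sketch follows; the function name is hypothetical, and the hyphen separator is taken from the abc-6 example.

```python
def default_task_name(project_name: str, existing_task_count: int) -> str:
    """Build the default task name: project name + (total number of tasks + 1)."""
    return f"{project_name}-{existing_task_count + 1}"


print(default_task_name("abc", 5))  # prints "abc-6"
```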

Section 2. Data Source

  1. Enter or select the required details in the Data Source section.

    1. Select Dataset: By default, the selected dataset name is displayed. Click it to select a different dataset. The Dataset field is disabled if you selected specific items in the Dataset.

      Note:

      You cannot create a task from a dataset that contains 80,000 or more items. To use such a dataset, apply sampling or select a different dataset. The total number of items is displayed on the top-right side of the page.

    2. (Optional) Filters: Refine data selection by selecting specific folders, using DQL filters, or sub-sampling (randomly and equally distributed). The Folder and DQL fields are active only if you do not select any items in the Dataset.

      1. Folders: Select a folder from the dataset, if available.
      2. Selected Filters / Saved DQL Query: Select a filter or saved DQL query from the list, if available.
      3. Data Sampling: Enter the Percentage or Number of Items for the task. Data sampling does not give an exact number of items.
        1. Percentage: This option selects items randomly, so the resulting count is approximate. For example, for a dataset of four items, 100% corresponds to all four items and 75% corresponds to three items, but the actual selection can be 1/4, 3/4, or 4/4 of the selected dataset.
        2. Number of Items: This option selects the items sequentially from the start of the dataset, not randomly.
  2. (Optional) WebM Conversion: By default, Enforce WEBM conversion of video items for frame-accurate annotations is selected.

  3. Click Next: Instructions.
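The difference between the two sampling modes, and the 80,000-item limit from the note, can be illustrated with a short sketch. All names here are hypothetical, and the percentage mode is modeled as an independent random draw per item. That model is an assumption chosen because it shows why the resulting count is not exact; it is not a description of how the platform samples internally.

```python
import random
from typing import List, Sequence

# Limit taken from the note above: 80,000 items or more requires sampling.
MAX_TASK_ITEMS = 80_000


def needs_sampling(total_items: int) -> bool:
    """True when the dataset is too large to be used as-is for a task."""
    return total_items >= MAX_TASK_ITEMS


def sample_by_percentage(
    items: Sequence[str], percentage: float, rng: random.Random
) -> List[str]:
    """Keep each item independently with probability percentage / 100.

    Assumed model: the expected share matches the percentage,
    but the exact count varies from run to run.
    """
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    probability = percentage / 100.0
    return [item for item in items if rng.random() < probability]


def sample_first_n(items: Sequence[str], count: int) -> List[str]:
    """Take items sequentially from the start of the dataset, not randomly."""
    return list(items[:count])


items = ["a", "b", "c", "d"]
print(sample_by_percentage(items, 75, random.Random()))  # size varies per run
print(sample_first_n(items, 3))  # always the first three items
```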

Section 3. Instructions

  1. Enter or select the required details in the Instructions section. The number of Labels and Attributes is displayed on the top-right side of the page.

    1. Recipe: By default, the default recipe is displayed. Select a recipe from the list, if needed.
    2. Labeling Instructions (.pdf): The labeling instruction document is displayed, if available. Go to the Recipe section to upload a PDF instruction. You can select the page range accordingly.
  2. Click Next: Statuses. The Statuses section is displayed.

Section 4. Statuses

  1. By default, the Completed status is selected. Click Add New Status to add a new status.
  2. Click Next: Assignments.

Section 5. Assignments

  1. Enter or select the required details in the Assignments section.
    1. Allocation Method: Select one of the following allocation methods:
      1. Pulling: With the pulling allocation method, annotators pull one batch of items at a time, up to a maximum number of items per assignment. You can change the following fields, if required: Pulling batch size (items) and Max items in an assignment.
      2. Distribution: With the distribution allocation method, the items are distributed among users in advance, either equally or based on a custom percentage. The Auto Distribution option distributes the task equally among the users. By default, it is checked.
    2. Available Users: Search for or select users from the list, and click the Forward arrow icon to add them to the Assigned users list.
    3. Assigned Users:
      1. Search for or view the assigned users from the list. The allocation percentage is equally distributed if you select Auto Distribution.
      2. Select users and click the Backward arrow icon to remove them from the Assigned Users list.
Inactive users

Inactive users are grayed out and disabled for redistribution, but they remain available for reassignment.

  2. Click Next: Quality.
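The two allocation methods in this section can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the round-robin strategy for equal distribution, and the consecutive-slice strategy for custom percentages are all hypothetical and are not taken from the platform.

```python
from typing import Dict, List, Sequence


def distribute_equally(items: Sequence[str], users: Sequence[str]) -> Dict[str, List[str]]:
    """Auto Distribution sketch: round-robin, so counts differ by at most one."""
    assignments: Dict[str, List[str]] = {user: [] for user in users}
    for index, item in enumerate(items):
        assignments[users[index % len(users)]].append(item)
    return assignments


def distribute_by_percentage(
    items: Sequence[str], shares: Dict[str, float]
) -> Dict[str, List[str]]:
    """Custom distribution sketch: consecutive slices sized by each user's share."""
    if abs(sum(shares.values()) - 100) > 1e-9:
        raise ValueError("shares must add up to 100")
    assignments: Dict[str, List[str]] = {}
    start = 0
    cumulative = 0.0
    for user, share in shares.items():
        cumulative += share
        end = round(len(items) * cumulative / 100)
        assignments[user] = list(items[start:end])
        start = end
    return assignments


def pull_batch(
    queue: List[str], already_pulled: int, batch_size: int, max_items: int
) -> List[str]:
    """Pulling sketch: take the next batch, capped by the per-assignment maximum."""
    allowed = max(0, min(batch_size, max_items - already_pulled))
    batch = queue[:allowed]
    del queue[:allowed]  # pulled items leave the shared queue
    return batch


items = [f"item{i}" for i in range(1, 6)]
print(distribute_equally(items, ["ann", "bob"]))
```

In the pulling sketch, `batch_size` and `max_items` stand in for the Pulling batch size (items) and Max items in an assignment fields.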

Section 6. Quality (Optional)

Not available for the Pulling Allocation method

The Quality task section is not available for the Pulling Allocation method.

Enable advanced quality monitoring options to ensure data quality and review performance.

  1. Select the quality task type and customize its properties. By default, None is selected. Select one of the following types, as needed:

    1. Consensus: The Consensus task creates replicas of the items so that multiple annotators can work on them simultaneously, and generates majority-vote datasets. For more information, see Consensus.
    2. Qualification: The Qualification task is used to create a dataset from multiple annotators. For more information, see Qualification.
    3. Honeypot: The Honeypot task is used to create a dataset from multiple annotators. For more information, see Honeypot.

    Not available for Pipeline tasks

    Qualification and Honeypot are not available for Pipeline tasks.

  2. Click Create Task. A confirmation message is displayed.
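The majority-vote idea behind the Consensus option can be illustrated with a short sketch. The function name and the strict-majority rule (more than half of the votes) are assumptions made for illustration; the platform's actual consensus scoring may differ.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_label(votes: Sequence[str]) -> Optional[str]:
    """Return the label chosen by more than half of the annotators, else None."""
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    # Require a strict majority so that ties do not produce a winner.
    return label if count > len(votes) / 2 else None


print(majority_label(["cat", "cat", "dog"]))  # prints "cat"
print(majority_label(["cat", "dog"]))         # prints "None" (tie)
```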

Edit an Annotation Task

  1. Open the Labeling page from the left-side menu.
  2. Identify the annotation task to edit, and click the three-dots icon.
  3. Select Edit Task from the list. The Edit Task page is displayed, with a tag indicating whether the task originated from a Pipeline or a Workflow.
Workflow and Pipeline Tasks
  • You can modify tasks generated through Workflow (Labeling > Tasks) or Pipeline.
  • You can edit pipeline tasks while the pipeline is running; there is no need to pause the pipeline.
  • Editing a pipeline task applies the corresponding update within the pipeline.
  4. Select the required section and make the changes. For more information about each section, see Create an Annotation Task. You can modify only the following fields:
    1. General:
      1. Task Name
      2. Owner
      3. Priority
      4. Completion Due Date
    2. Instructions: Recipe
    3. Assignments:
      1. Edit the Item Workload
      2. Reassign
  5. Click Save Changes.

Add Items to an Existing Annotation Task

Before adding items to an existing task, you may select the items you wish to add by clicking on them (CTRL+click to select multiple items). If you do not select any items, you can choose to filter the items with a DQL query or add all items that are not already included in the task.

  1. Open the Data page.
  2. Identify and open the Dataset.
  3. Select the required items from the dataset.
  4. Click Dataset Actions.
  5. Select Labeling Tasks > Add to an existing task from the list.
  6. In the Select Task section, select the task to which you need to add items.
  7. Click Next: Data Source.
  8. (Optional) In the Data Source section, edit the Filters details.
  9. Click Next: Assignments.
  10. In the Assignments section, add contributors from the Available Contributors list to the Assigned Contributors list.
  11. Click Add Items.
Add items to pipeline task

Adding items to an existing pipeline task with the Add to an existing task option adds the items to the task, but it does not trigger the pipeline. To trigger the pipeline with new items, please read here.
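The rule for choosing which items are added to an existing task (selected items, a DQL query result, or all items not already in the task) can be sketched as a set difference. The function and parameter names are hypothetical, and skipping already-included items for explicit selections as well is an assumption.

```python
from typing import List, Optional, Sequence


def items_to_add(
    dataset_items: Sequence[str],
    task_items: Sequence[str],
    selected: Optional[Sequence[str]] = None,
    query_results: Optional[Sequence[str]] = None,
) -> List[str]:
    """Return candidate items that are not already part of the task."""
    if selected:
        candidates = selected
    elif query_results is not None:
        candidates = query_results
    else:
        candidates = dataset_items
    existing = set(task_items)
    # Preserve dataset order while skipping items the task already contains.
    return [item for item in candidates if item not in existing]


print(items_to_add(["a", "b", "c", "d"], task_items=["a", "b"]))  # prints "['c', 'd']"
```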


QA Task

The purpose of a QA task is to improve annotation quality by reviewing annotation work and flagging problematic annotations for correction by their original creator.

Annotation review is carried out as a QA task, in which the reviewer can flag an annotation as having an 'issue' and send it back to the original annotator for correction.
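The review loop described above, in which flagged annotations go back to whoever created them, can be sketched as a simple grouping step. The `Annotation` class and its fields are hypothetical stand-ins for illustration and do not mirror the platform's data model.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Annotation:
    annotation_id: str
    creator: str           # the original annotator
    has_issue: bool = False  # set by the QA reviewer


def corrections_by_annotator(annotations: Iterable[Annotation]) -> Dict[str, List[str]]:
    """Group annotations flagged with an issue by their original creator."""
    queue: Dict[str, List[str]] = defaultdict(list)
    for annotation in annotations:
        if annotation.has_issue:
            queue[annotation.creator].append(annotation.annotation_id)
    return dict(queue)


reviewed = [
    Annotation("a1", "ann", has_issue=True),
    Annotation("a2", "bob"),
    Annotation("a3", "ann", has_issue=True),
]
print(corrections_by_annotator(reviewed))  # prints "{'ann': ['a1', 'a3']}"
```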

You can create QA tasks based on the following two scenarios:

  • A QA Task from an Annotation task: To validate annotations created by assignees.
  • A Standalone QA Task: To validate annotations that are uploaded to the platform, for example, annotations created by your model.

On the tasks page, QA Tasks are linked to their respective annotation tasks. Click the "+" icon next to an annotation task to see all QA tasks related to it.

To learn about the QA process, see QA Process.

Create a QA Task from an Annotation Task

Information:

When you create a task from the Dataset browser, it includes:

  • Specifically selected items (CTRL + left mouse button), if any are selected.
  • All items in the search query results. For example, querying on user metadata lets you create tasks in a project-specific context.
  • All items in the Dataset, if there is no active query and no items are selected.
  • Items from a specific folder.

Once a step is completed, it appears in green with a checkmark next to it in the step list. A red exclamation mark is displayed if the step is incomplete.

To create a QA task, follow the instructions for each section:

  1. Open the Labeling page from the left-side menu.
  2. Click Create Task.

Section 1. General

  1. Enter or select the required details in the General section:
    1. Task Name: By default, your task name followed by - QA is displayed. Modify it, if needed.
    2. Task Type: Select QA as the task type.
    3. Owner: By default, the current user's email ID is displayed. Click on it to select a different owner from the list.
    4. Status: By default, the To Do status is displayed, and it cannot be changed.
    5. Priority: Select a priority from the list. By default, Medium is selected.
    6. (Optional) Completion Due Date: Select a task's due date from the calendar.
  2. Click Next: Data Source.

Section 2. Data Source

  1. Enter or select the required details in the Data Source section.

    1. Select Dataset: By default, the dataset used to create the task is displayed.
    2. (Optional) Filters: Refine data selection by selecting specific folders, using DQL filters, or sub-sampling (randomly and equally distributed). The Folder and DQL fields are active only if you do not select any items in the Dataset.
      1. Folders: Select a folder from the dataset.
      2. Selected Filters / Saved DQL Query: Select a filter or saved DQL query from the list.
      3. Data Sampling: Enter the Percentage or Number of Items for the task. Data sampling does not give an exact number of items.
        1. Percentage: This option selects items randomly, so the resulting count is approximate. For example, for a dataset of four items, 100% corresponds to all four items and 75% corresponds to three items, but the actual selection can be 1/4, 3/4, or 4/4 of the selected dataset.
        2. Number of Items: This option selects the items sequentially from the start of the dataset, not randomly.
  2. Click Next: Instructions.

Section 3. Instructions

  1. Enter or select the required details in the Instructions section. The number of Labels and Attributes is displayed on the top-right side of the page.

    1. Recipe: By default, the default recipe is displayed. Select a recipe from the list, if needed.
    2. QA Instructions (.pdf): The QA Instruction document is displayed, if available. Go to the Recipe section to upload a PDF instruction.
  2. Click Next: Statuses. The Statuses section is displayed.

Section 4. Statuses

  1. By default, the Approved status is selected. Click Add New Status to add a new status.
  2. Click Next: Assignments.

Section 5. Assignments

  1. Enter or select the required details in the Assignments section.

    1. Allocation Method: Select one of the following allocation methods:
      1. Pulling: With the pulling allocation method, annotators pull one batch of items at a time, up to a maximum number of items per assignment. You can change the following fields, if required: Pulling batch size (items) and Max items in an assignment.
      2. Distribution: With the distribution allocation method, the items are distributed among users in advance, either equally or based on a custom percentage.
        1. The Auto Distribution option distributes the task equally among the users. By default, it is checked.
        2. The Show only unassigned users to any labeling task option allows existing users to complete their task.
    2. Available Users: Search for or select users from the list, and click the Forward arrow icon to add them to the Assigned Users list.
    3. Assigned Users:
      1. Search for or view the assigned users from the list. The allocation percentage is equally distributed if you select Auto Distribution.
      2. Select users and click the Backward arrow icon to remove them from the Assigned Users list.
  2. Click Create Task.

Edit a QA Task

  1. Open the Labeling page from the left-side menu.
  2. Identify the QA task to edit, and click the three-dots icon.
  3. Select Edit Task from the list. The Edit Task page is displayed, with a tag indicating whether the task originated from a Pipeline or a Workflow.
Workflow and Pipeline Tasks
  • You can modify tasks generated through Workflow (Labeling > Tasks) or Pipeline.
  • You can edit pipeline tasks while the pipeline is running; there is no need to pause the pipeline.
  • Editing a pipeline task applies the corresponding update within the pipeline.
  4. Select the required section and make the changes. For more information, see the Create a QA task topic.
  5. Click Save.

Create a Labeling Task Workflow Using a Pipeline Template

Dataloop allows you to create a labeling task workflow from a pipeline template to streamline and manage the labeling process efficiently. To create the workflow, follow these steps:

  1. Click Labeling in the left-side menu.
  2. Click the dropdown arrow next to Create Task.
  3. Select Create Labeling Workflow from the list. The Select Labeling Workflow page is displayed.
  4. Select the Workflow widget and verify the details displayed in the right-side panel. A preview of the template with the available nodes is displayed.
  5. Click Create Workflow. A new pipeline workflow page is displayed, where you can start building your workflow by using various pipeline nodes.

Delete a Task

Pipeline Tasks Deletion

You cannot delete a task that is generated from the pipeline if it is currently in use within a respective pipeline. To proceed with this action, first remove the task node from the pipeline.

  1. Open the Labeling page from the left-side menu.
  2. Identify the Annotation or QA task to delete, and click the three-dots icon.
  3. Select Delete Task from the list. A confirmation message is displayed.
  4. Click Yes. A confirmation message is displayed.

Archive Tasks

Once you delete tasks or datasets:

Deleting Tasks: When you delete a task that has been part of analytics activities, the Dataloop platform automatically archives it instead of deleting it. All the usual steps of deletion take place, except that the task is stored in the archive rather than being permanently removed. The same applies to any assignments within the task that have analytics data: they are preserved in the archive as well.

Dealing with Datasets: When it comes to datasets, if any tasks within the dataset have been active or have analytics data, rest assured they won't be deleted. This is to ensure that no significant information or insights are inadvertently lost.

Archive Flag: If a task or assignment is archived because of associated activity when it is deleted, a flag indicating that it is archived (archive = true) is added to its metadata.system. You can use this flag to search specifically for archived tasks or assignments.
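The archive-instead-of-delete behavior and the archive flag can be sketched with plain dictionaries. Everything here is a hypothetical model: the `has_analytics` key, the `store` dictionary, and the function names are assumptions for illustration, and only the `metadata.system` location of the `archive` flag comes from the description above.

```python
from typing import Dict, List


def delete_task(task: dict, store: Dict[str, dict]) -> str:
    """Archive the task if it has analytics data; otherwise remove it."""
    if task.get("has_analytics"):
        # Mark the task as archived under metadata.system instead of removing it.
        task.setdefault("metadata", {}).setdefault("system", {})["archive"] = True
        return "archived"
    store.pop(task["id"], None)
    return "deleted"


def archived_only(tasks: List[dict]) -> List[dict]:
    """Filter tasks whose metadata.system carries archive = true."""
    return [
        task
        for task in tasks
        if task.get("metadata", {}).get("system", {}).get("archive") is True
    ]


store = {
    "t1": {"id": "t1", "has_analytics": True},
    "t2": {"id": "t2", "has_analytics": False},
}
for task in list(store.values()):
    print(task["id"], delete_task(task, store))
print([task["id"] for task in archived_only(list(store.values()))])  # prints "['t1']"
```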

Analytics Performance Tab: For a comprehensive view, the Analytics Performance Tab includes data from archived assignments. You can see details such as the assignment name and the annotator involved, even for tasks that have been archived instead of deleted. Ensure that you have the Annotation Manager role or above to view archived assignments.