- 03 Jun 2024
Consensus
Overview
Consensus is a quality control feature that lets you compare annotations made by different users on the same item and generate high-quality data based on majority vote.
How Consensus Works
When you enable consensus on an annotation task, you set the percentage of data to cover and the number of assignees. Dataloop then creates copies of the selected items and randomly assigns them to users.
- While browsing a task, you will see only the original items in the task, not the copies.
- While browsing an assignment, you will see the specific copies assigned to the user.
When all copies of an item have been assigned a status (for example, Completed), the system will merge all copies back onto the original item. Until that point, the original item will not show any of the annotations added to the copies.
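The copy-and-assign behavior described above can be illustrated with a minimal sketch. This is not Dataloop's implementation: the function names, the rounding of the percentage, and the use of `None` for "no status yet" are all assumptions made for illustration.

```python
import random


def assign_consensus_copies(item_ids, users, percentage, copies_per_item, seed=None):
    """Pick a share of the items and give each picked item to distinct random users.

    Hypothetical helper: the real platform performs this step internally.
    """
    if copies_per_item < 2:
        raise ValueError("consensus needs at least 2 copies per item")
    if copies_per_item > len(users):
        raise ValueError("not enough users to give every copy to a different user")
    rng = random.Random(seed)
    # Assumption: the number of consensus items is the rounded percentage.
    n_consensus = round(len(item_ids) * percentage / 100)
    consensus_items = rng.sample(item_ids, n_consensus)
    # Each copy of an item goes to a different user.
    return {item_id: rng.sample(users, copies_per_item) for item_id in consensus_items}


def ready_to_merge(copy_statuses):
    """True once every copy of an item has some status (None means no status yet)."""
    return bool(copy_statuses) and all(status is not None for status in copy_statuses)
```

For example, `assign_consensus_copies(list(range(10)), ["a", "b", "c", "d"], 50, 3)` returns a mapping for 5 of the 10 items, each with 3 different users.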
When downloading data for consensus items, the JSON includes all annotations created by the different users, along with their usernames. This allows you to run your own score calculation and decide which annotations are defined as having higher quality.
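As a minimal sketch of such a custom calculation, the snippet below runs a majority vote over classification labels. The JSON shape and the field names `creator` and `label` are assumptions used for illustration; check them against your actual export before reusing this.

```python
from collections import Counter

# Hypothetical excerpt of an exported consensus item (field names are assumed).
item_json = {
    "filename": "/images/cat_001.jpg",
    "annotations": [
        {"creator": "anna@example.com", "label": "cat"},
        {"creator": "ben@example.com", "label": "cat"},
        {"creator": "chen@example.com", "label": "dog"},
    ],
}


def majority_label(annotations, min_agreement=0.5):
    """Return (label, agreement) for the most common label.

    The label is None when no label is shared by more than `min_agreement`
    of the annotations. Assumes one classification annotation per user.
    """
    if not annotations:
        return None, 0.0
    votes = Counter(ann["label"] for ann in annotations)
    label, count = votes.most_common(1)[0]
    agreement = count / len(annotations)
    return (label if agreement > min_agreement else None), agreement


label, agreement = majority_label(item_json["annotations"])
print(label)  # cat
```

Here two of the three annotators chose `cat`, so the agreement is 2/3 and `cat` wins the vote. With a stricter `min_agreement` such as 0.7, the same item would return `None` and could be routed to review.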
Setting up a consensus task
To set up a task with consensus, follow these steps:
1. Start a new Annotation task and provide all the required information for Data and the other steps. See Create an Annotation Task for more information.
2. Switch to the Quality step and enable the consensus feature.
3. Set Data for consensus to the percentage of items in the task that should receive consensus copies (up to 100%). As you change the percentage, the exact number of consensus items is calculated and shown.
4. Set Users for consensus to the number of copies per item (minimum 2).
Because consensus items are annotated by more than one user, a consensus task contains more items to annotate than the same task without consensus. The calculator helps you understand the scope of the task:
1. Original items: The number of items you selected as Data for this task.
2. Consensus items: The number of additional copies included in the task. Since one copy is already considered part of the original data, this shows the additional payload per task.
3. Total number of items: The original items plus the consensus items. Use it to estimate the schedule and cost.
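The calculator's arithmetic can be sketched as follows. The function name and the rounding of the percentage to a whole number of items are assumptions; the numbers shown in the platform are authoritative.

```python
def consensus_task_size(original_items, consensus_percentage, users_per_item):
    """Estimate the item counts shown by the consensus calculator."""
    if not 0 <= consensus_percentage <= 100:
        raise ValueError("consensus_percentage must be between 0 and 100")
    if users_per_item < 2:
        raise ValueError("users_per_item must be at least 2")
    # Items selected for consensus (rounding is an assumption).
    items_with_consensus = round(original_items * consensus_percentage / 100)
    # One copy already counts as the original item, so only the
    # remaining (users_per_item - 1) copies are additional work.
    consensus_items = items_with_consensus * (users_per_item - 1)
    return {
        "original_items": original_items,
        "consensus_items": consensus_items,
        "total_items": original_items + consensus_items,
    }
```

For a task with 1000 items, 20% consensus data, and 3 users per item, this gives 200 items with consensus, 400 additional consensus copies, and 1400 items in total.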
Consensus QA
Consensus work is supported in the QA workflow. Reviewers can work on items containing annotations from multiple users and either flag issues on annotations or create note annotations to trigger correction work and improve annotation quality.
When annotators correct their work and apply the Complete status to the item, their previous annotations are removed from the master item, and the corrected ones are copied over in their place. Reviewers can then review the corrected work and set an Approve status on the item.
Consensus Score
Quality task scoring is available for items, annotations, and users. When you enable consensus while creating a task, a scoring function is enabled and scores all users, items, and annotations in the task. The Scores tab on the Analytics page shows the results in more detail.
The consensus score can be integrated as part of a FaaS in a pipeline and provide you with benefits such as:
- Calculate the consensus score based on your own method and threshold, for example, Intersection over Union (IoU).
- Move/clone items/annotations to a majority-vote dataset for use as train/test sets.
- Trigger annotations with a high score for further processing in the pipeline.
To learn more about the Dataloop Scoring Function, see the Scoring and metrics app README file.
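As an illustration of a custom IoU-based method, the sketch below scores each annotator's bounding box by its mean IoU against the boxes drawn by the other annotators on the same item. It is not the Dataloop Scoring Function; the box format `(left, top, right, bottom)` and the function names are assumptions.

```python
def box_iou(a, b):
    """Intersection over Union of two boxes given as (left, top, right, bottom)."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def annotation_scores(boxes_by_user):
    """Score each user's box as its mean IoU with every other user's box."""
    scores = {}
    for user, box in boxes_by_user.items():
        others = [b for u, b in boxes_by_user.items() if u != user]
        if not others:
            scores[user] = 0.0  # nothing to compare against
            continue
        scores[user] = sum(box_iou(box, other) for other in others) / len(others)
    return scores


scores = annotation_scores({
    "anna": (0, 0, 10, 10),
    "ben": (0, 0, 10, 10),
    "chen": (20, 20, 30, 30),
})
# Keep only annotations that clear a chosen threshold (0.5 here, as an example).
high_quality = [user for user, score in scores.items() if score >= 0.5]
print(high_quality)  # ['anna', 'ben']
```

The two identical boxes each score 0.5 (one perfect match, one miss), while the outlier scores 0.0 and falls below the threshold.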
Redistribute or Reassign Consensus Tasks
Dataloop consensus currently has the following limitations:
- To maintain consistency in the quality evaluation, consensus tasks cannot be redistributed.
- Consensus assignments can be reassigned only to users who do not have, and have never had, an assignment in the task (to prevent two consensus copies of the same item from being annotated by the same user). To reassign:
1. Open the Workflows > Tasks page.
2. Double-click on the consensus task.
3. Click the ellipsis icon (three dots) for the relevant assignment.
4. Select Reassign.
5. Select a new user from the list.
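The eligibility rule for reassignment can be expressed as a small check. This is only a sketch of the rule stated above; the function and parameter names are hypothetical, and the platform itself decides which users appear in the list.

```python
def eligible_for_reassignment(candidates, current_assignees, past_assignees):
    """Return the candidates who never held an assignment in this consensus task."""
    blocked = set(current_assignees) | set(past_assignees)
    # Preserve the order in which candidates were given.
    return [user for user in candidates if user not in blocked]
```

For example, with candidates `["anna", "ben", "chen", "dana"]`, current assignees `["anna", "ben"]`, and past assignee `["chen"]`, only `dana` remains eligible.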