Qualification & Honeypot

Qualification

How Qualification Works

Qualification tasks provide a tool to evaluate annotators' skills and performance by letting them work on a 'test' task whose ground-truth answers are hidden from them. Once an annotator completes their work (the assignment-completed event), scores can be calculated by comparing the annotations on the items in their assignment with the ground-truth annotations from the original data.

Unlike consensus tasks, the annotator's work is not merged back into the original item, so the original data remains a clean copy for qualification testing. By creating multiple qualification tests in the project context, managers can gain insight into annotators' performance with various data types (for example, image and video) and tools (box, polygon, etc.).
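As a rough illustration of the comparison step (not the Dataloop implementation itself), the sketch below scores an annotator's bounding boxes against ground-truth boxes using IoU; the box format and the 0.5 threshold are assumptions made only for this example.

```python
# Illustrative only: scoring one item's annotator boxes against ground truth by IoU.
# The real calculation lives in Dataloop's default score function (see below).

def iou(a, b):
    """Intersection-over-union of two boxes given as (left, top, right, bottom)."""
    left, top = max(a[0], b[0]), max(a[1], b[1])
    right, bottom = min(a[2], b[2]), min(a[3], b[3])
    if right <= left or bottom <= top:
        return 0.0
    inter = (right - left) * (bottom - top)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def item_score(annotator_boxes, ground_truth_boxes, threshold=0.5):
    """Fraction of ground-truth boxes matched by an annotator box with IoU >= threshold."""
    if not ground_truth_boxes:
        return 1.0 if not annotator_boxes else 0.0
    matched = sum(
        1 for gt in ground_truth_boxes
        if any(iou(gt, ann) >= threshold for ann in annotator_boxes)
    )
    return matched / len(ground_truth_boxes)
```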

Setting Up a Qualification Task

To set up a qualification task, follow these instructions:

  1. Start a new Annotation task and provide all the required information for Data and other steps. See Create an Annotation Task for more information.
  2. Switch to the Quality step and enable the Qualification feature (an SDK sketch of the task setup follows these steps).
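For reference, here is a minimal sketch of the task-creation part using the dtlpy Python SDK, assuming `dataset.tasks.create` accepts these arguments; the project, dataset, assignee, and due date are placeholders, and enabling the Qualification feature itself is done in the UI Quality step (no SDK flag for it is assumed here).

```python
import datetime
import dtlpy as dl

# Placeholders: use your own project, dataset, and annotator emails.
project = dl.projects.get(project_name='my-project')
dataset = project.datasets.get(dataset_name='qualification-data')

# Create the annotation task that will serve as the qualification test.
task = dataset.tasks.create(
    task_name='qualification-test',
    due_date=datetime.datetime(year=2030, month=1, day=1).timestamp(),
    assignee_ids=['annotator@example.com'],
)

# The Qualification feature is then enabled in the task's Quality step in the UI.
print(task.id)
```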

Continuous Qualification Task

Qualification tasks, by nature, never end. Any new user added to them as an assignee receives an assignment that includes all items. This allows a qualification test to be created once in a project and reused to test new annotators throughout the project's lifecycle.
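For example, adding a new annotator to an existing qualification task could look roughly like the sketch below with the dtlpy SDK (assuming the `add_to_assignees` method and placeholder names); the new assignee then receives an assignment covering all of the task's items.

```python
import dtlpy as dl

# Placeholders: use your own project and task names.
project = dl.projects.get(project_name='my-project')
task = project.tasks.get(task_name='qualification-test')

# Adding a new assignee to the never-ending qualification task creates an
# assignment for them that includes all of the task's items.
task.add_to_assignees(assignee_ids=['new.annotator@example.com'])
```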

Qualification Score

When the qualification score is enabled, a Pipeline is created with 2 nodes:

  1. Task node - the qualification task itself
  2. Application (FaaS) node - runs the Dataloop default score function.

The Dataloop default score function is available in our Git repository and includes documentation of how scores are calculated and saved for the different annotation types.

You can therefore fork our Git repository and customize the score function to implement your own logic, then add it as a new Application (FaaS) and place it in the Pipeline in place of the default Dataloop score function.
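As a rough illustration of what such a customization might look like (not the actual Dataloop scoring app), the skeleton below assumes a FaaS runner wired to the assignment-completed trigger that receives the assignment entity, and that `assignment.get_items()` returns a paged collection; the scoring body is a placeholder for your own logic.

```python
import dtlpy as dl


class ServiceRunner(dl.BaseServiceRunner):
    """Skeleton for a custom score function deployed as an Application (FaaS) node."""

    def calculate_scores(self, assignment: dl.Assignment):
        # Placeholder logic: walk the assignment's items, compare the annotator's
        # annotations with the hidden ground truth, and save a score in whatever
        # form your custom logic requires.
        for page in assignment.get_items():
            for item in page:
                annotations = item.annotations.list()
                # ... your custom comparison and score-saving logic goes here ...
```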

To learn more, contact the Dataloop team.

Honeypot

How Honeypot Works

While qualification tasks evaluate annotators' quality and performance in a one-time test, honeypot tasks provide a way to continuously monitor annotators' performance, quality, and accuracy as part of their ongoing work, by planting items with known ground truth in tasks containing raw data. The score is calculated over the honeypot items and provides an estimate of the accuracy and quality that can be expected on the raw items.
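As a small numerical illustration (not platform code): with a 5% honeypot rate, roughly 1 in every 20 items an annotator receives carries hidden ground truth, and the mean score over those items serves as the quality estimate for the rest; the item counts and per-item scores below are made-up example values.

```python
# Illustration only: estimating overall quality from honeypot items.
honeypot_rate = 0.05                    # 5% of the task's items are honeypot
assignment_size = 200                   # items in the annotator's assignment
expected_honeypot_items = int(assignment_size * honeypot_rate)  # ~10 items

# Suppose these are the per-item scores computed on the honeypot items:
honeypot_scores = [1.0, 0.9, 1.0, 0.8, 1.0, 0.7, 1.0, 0.9, 1.0, 0.95]
estimated_accuracy = sum(honeypot_scores) / len(honeypot_scores)
print(f'Estimated accuracy on raw items: {estimated_accuracy:.2f}')
```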

Setting Up a Honeypot Task

To set up a honeypot task, follow these instructions:

  1. Prepare ground-truth data in the /honeypot folder of the same dataset where the raw data resides (an SDK upload sketch follows these steps).
  2. Start a new Annotation task and provide all the required information for Data and other steps. See Create an Annotation Task for more information.
  3. Switch to the Quality step and enable the Honeypot feature.
  4. Set the percentage of honeypot items out of all items in the task. For example, setting it to 5% means that 5 out of every 100 items will be items with ground truth.

Note: If there are not enough honeypot items to cover the percentage requested, more raw items will be used.
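For step 1, the following is a minimal sketch of uploading ground-truth items together with their annotation files into the dataset's /honeypot folder using the dtlpy SDK; the project, dataset, and local paths are placeholders.

```python
import dtlpy as dl

# Placeholders: use your own project/dataset names and local paths.
project = dl.projects.get(project_name='my-project')
dataset = project.datasets.get(dataset_name='production-data')

# Upload the ground-truth items and their annotation files into /honeypot,
# alongside the raw data in the same dataset.
dataset.items.upload(
    local_path='/path/to/ground_truth/images',
    local_annotations_path='/path/to/ground_truth/annotations',
    remote_path='/honeypot',
)
```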

Honeypot Score

Honeypot scores can be calculated by a FaaS (invoked by an event trigger or as part of a pipeline) that produces the annotator's score. Dataloop provides a default score function, but any custom function that runs over the assignment items (taking only the relevant honeypot ones) can be used for that purpose.
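A sketch of such a custom function is shown below. It assumes `assignment.get_items()` returns a paged collection and that planted items can be recognized by their /honeypot origin in the filename; the `compare_with_ground_truth` helper is hypothetical and stands in for your own comparison logic.

```python
import dtlpy as dl


def compare_with_ground_truth(item: dl.Item, annotations) -> float:
    """Hypothetical helper: return a 0-1 score for one honeypot item (your logic here)."""
    raise NotImplementedError


def honeypot_score(assignment: dl.Assignment) -> float:
    """Average score over the honeypot items in an assignment (sketch only)."""
    scores = []
    for page in assignment.get_items():
        for item in page:
            # Assumption: planted items can be recognized by their /honeypot origin.
            if not item.filename.startswith('/honeypot'):
                continue
            scores.append(compare_with_ground_truth(item, item.annotations.list()))
    return sum(scores) / len(scores) if scores else 0.0
```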

To learn more about the Dataloop Scoring Function, see the Scoring and metrics app readme file.