How to set up and manage workflows with Kili

In this tutorial, we will learn how to set up basic Kili workflows:

  1. Managing reviews
    1. Placing a specific percentage of project assets in the review queue
    2. Placing specific assets in the review queue
    3. Sending an asset back to the labeling queue
  2. Setting up consensus
    1. Setting consensus for a specific percentage of project assets
    2. Setting consensus for specific assets to compute consensus KPIs
  3. Setting up honeypot
  4. Assigning labelers to assets
  5. Prioritizing assets in the labeling queue

To work with this notebook, you will have to install and instantiate Kili.

%pip install kili
from kili.client import Kili
kili = Kili(
    # api_endpoint="https://cloud.kili-technology.com/api/label/v2/graphql",
    # the line above can be uncommented and changed if you are working with an on-premise version of Kili
)

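For information on how to set up a Kili project, refer to the basic project setup tutorial. The examples below assume that project_id holds the ID of an existing project whose assets have external IDs such as 1.jpg, 2.jpg, and so on.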

Managing reviews

Placing a specific percentage of project assets in the review queue

You can set the percentage of assets (1-100%) that will automatically appear in the review queue.

kili.update_properties_in_project(project_id=project_id, review_coverage=50)
{'reviewCoverage': 50, 'id': 'clnwvhuu000cz088xcqxz1dig'}

Setting up consensus

Consensus works by having more than one labeler annotate the same asset. When the asset is labeled, a consensus score is calculated to measure the agreement level between the different annotations for a given asset. This is a key measure for controlling label production quality.
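To make the idea of an agreement score concrete, here is a minimal sketch for a classification job. It is an illustrative simplification, not Kili's actual metric (which depends on the job type): it averages the Jaccard overlap between the category sets chosen by each pair of labelers. The function name pairwise_agreement and the category OBJECT_A are assumptions made for this example.

```python
from itertools import combinations


def pairwise_agreement(categories_by_labeler: dict[str, set[str]]) -> float:
    """Mean Jaccard overlap between every pair of labelers' category sets."""
    pairs = list(combinations(categories_by_labeler.values(), 2))
    if not pairs:
        raise ValueError("at least two labelers are needed to compute agreement")
    scores = []
    for first, second in pairs:
        union = first | second
        # Two empty selections are treated as full agreement.
        scores.append(len(first & second) / len(union) if union else 1.0)
    return sum(scores) / len(scores)


# Hypothetical annotations of one asset by three labelers.
labels = {
    "example1@example.com": {"OBJECT_A"},
    "example2@example.com": {"OBJECT_A"},
    "example3@example.com": {"OBJECT_B"},
}
score = pairwise_agreement(labels)
print(f"agreement: {score:.2f}")
```

Two of the three labelers agree, so only one of the three pairs matches and the score is one third.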

To set up consensus, you will need to have at least two project members. For information on how to add users and assign them to your project, refer to the basic project setup tutorial.

Setting consensus for a specific percentage of project assets

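Let's set the percentage of project assets that will be annotated several times to enable consensus calculations. We will also set the minimum number of labelers who must label each of these assets. In the example below, 1% of the assets are selected for consensus, and each of them must be labeled by three labelers.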

kili.update_properties_in_project(
    project_id=project_id,
    consensus_tot_coverage=1,
    min_consensus_size=3,
)
{'consensusTotCoverage': 1,
 'minConsensusSize': 3,
 'id': 'clnwvhuu000cz088xcqxz1dig'}
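These two settings affect the labeling workload. The sketch below gives a rough estimate of the total number of labels to produce, assuming that every asset gets one label and that each consensus asset gets min_consensus_size labels instead. The helper name estimated_label_count and the dataset size of 1000 assets are assumptions made for illustration.

```python
def estimated_label_count(
    asset_count: int, consensus_tot_coverage: int, min_consensus_size: int
) -> int:
    """Rough number of labels needed for a project with consensus enabled."""
    if not 0 <= consensus_tot_coverage <= 100:
        raise ValueError("consensus_tot_coverage is a percentage between 0 and 100")
    if min_consensus_size < 1:
        raise ValueError("min_consensus_size must be at least 1")
    # Assets that are labeled several times.
    consensus_assets = round(asset_count * consensus_tot_coverage / 100)
    # Each consensus asset needs (min_consensus_size - 1) extra labels.
    return asset_count + consensus_assets * (min_consensus_size - 1)


# Hypothetical project with 1000 assets and the settings used above.
total = estimated_label_count(1000, consensus_tot_coverage=1, min_consensus_size=3)
print(total)
```

With these numbers, 10 assets are labeled three times, so the estimate is 1020 labels.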

Setting consensus for specific assets to compute consensus KPIs

You can manually select specific project assets to be used for computing consensus KPIs.

kili.update_properties_in_assets(
    project_id=project_id,
    external_ids=["1.jpg", "2.jpg", "3.jpg"],
    is_used_for_consensus_array=[True] * 3,
)
[{'id': 'clnwvhvo00000gsvzinsato00'},
 {'id': 'clnwvhvo00001gsvzsiqcx5dc'},
 {'id': 'clnwvhvo00002gsvzzbjtyuif'}]

For more information on consensus, refer to our documentation.

Setting up honeypot

Honeypot (or gold standard) is a tool for auditing the work of labelers by measuring the accuracy of their annotations. It works by interspersing assets that have a defined ground-truth label among the other assets in the annotation queue. This way, you can measure the agreement level between your ground truth and the annotations made by labelers.

First, we need to enable honeypot for our project:

kili.update_properties_in_project(project_id=project_id, use_honeypot=True)
{'useHoneyPot': True, 'id': 'clnwvhuu000cz088xcqxz1dig'}

You can now manually select specific project assets to be used as honeypots:

kili.create_honeypot(
    project_id=project_id,
    asset_external_id="1.jpg",
    json_response={"JOB_0": {"categories": [{"confidence": 100, "name": "OBJECT_B"}]}},
)
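To illustrate what comparing an annotation with the ground truth means for a classification job, here is a minimal sketch. It is a simplification for illustration only and not how Kili computes its honeypot score: it checks whether a labeler picked exactly the same categories as the ground truth. The helper names and the category OBJECT_A are assumptions.

```python
def category_names(json_response: dict, job: str) -> set[str]:
    """Extract the set of category names selected for a classification job."""
    categories = json_response.get(job, {}).get("categories", [])
    return {category["name"] for category in categories}


def honeypot_match(ground_truth: dict, labeler_response: dict, job: str = "JOB_0") -> bool:
    """Return True when the labeler chose exactly the ground-truth categories."""
    return category_names(ground_truth, job) == category_names(labeler_response, job)


# Ground truth taken from the create_honeypot call above.
ground_truth = {"JOB_0": {"categories": [{"confidence": 100, "name": "OBJECT_B"}]}}
# Hypothetical labeler answers.
correct = {"JOB_0": {"categories": [{"name": "OBJECT_B"}]}}
wrong = {"JOB_0": {"categories": [{"name": "OBJECT_A"}]}}

print(honeypot_match(ground_truth, correct))
print(honeypot_match(ground_truth, wrong))
```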

For more information on honeypot, refer to our documentation.

Assigning labelers to assets

You can assign specific labelers to specific assets in your project by mapping users' emails to the selected assets, identified here by their external IDs. Remember that you can assign more than one user to a specific asset.

kili.update_properties_in_assets(
    project_id=project_id,
    external_ids=["1.jpg", "2.jpg", "3.jpg"],
    to_be_labeled_by_array=[
        ["example1@example.com"],
        ["example2@example.com"],
        ["example3@example.com"],
    ],
)
[{'id': 'clnwvhvo00000gsvzinsato00'},
 {'id': 'clnwvhvo00001gsvzsiqcx5dc'},
 {'id': 'clnwvhvo00002gsvzzbjtyuif'}]

The to_be_labeled_by_array argument is a list of lists, with one sub-list per asset, in the same order as external_ids. Each sub-list can contain several emails. This way, you can assign several labelers to one asset.

For example:

to_be_labeled_by_array = [["example1@example.com"], ["example1@example.com", "example2@example.com"], ["example3@example.com"]]
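Because external_ids and to_be_labeled_by_array must stay aligned, it can be convenient to build both from a single mapping. The following is a minimal sketch; the helper build_assignment_args is hypothetical and not part of the Kili SDK.

```python
def build_assignment_args(assignments: dict[str, list[str]]) -> dict:
    """Turn {external_id: [emails]} into aligned keyword arguments."""
    external_ids = list(assignments)
    to_be_labeled_by_array = [list(assignments[external_id]) for external_id in external_ids]
    for external_id, emails in zip(external_ids, to_be_labeled_by_array):
        if not emails:
            raise ValueError(f"no labeler given for asset {external_id!r}")
    return {
        "external_ids": external_ids,
        "to_be_labeled_by_array": to_be_labeled_by_array,
    }


args = build_assignment_args(
    {
        "1.jpg": ["example1@example.com"],
        "2.jpg": ["example1@example.com", "example2@example.com"],
        "3.jpg": ["example3@example.com"],
    }
)
print(args)

# With a configured client and an existing project, the result could be passed on:
# kili.update_properties_in_assets(project_id=project_id, **args)
```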

For information on how to add users and assign them to your project, refer to the basic project setup tutorial. For information on assigning assets to users, refer to our documentation.

Prioritizing assets in the labeling queue

If you have certain assets that you need to have labeled earlier or later than the rest, you can use Kili's asset prioritization methods.

kili.update_properties_in_assets(
    project_id=project_id, external_ids=["1.jpg", "2.jpg", "3.jpg"], priorities=[1, 5, 10]
)
[{'id': 'clnwvhvo00000gsvzinsato00'},
 {'id': 'clnwvhvo00001gsvzsiqcx5dc'},
 {'id': 'clnwvhvo00002gsvzzbjtyuif'}]
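If you already know the order in which assets should be labeled, you can derive the priorities from that order. The sketch below assumes that assets with a higher priority value are served earlier; check the documentation for your Kili version before relying on this. The helper priorities_from_ranking is hypothetical.

```python
def priorities_from_ranking(ranked_external_ids: list[str]) -> dict:
    """Give the first asset in the ranking the highest priority value."""
    count = len(ranked_external_ids)
    return {
        "external_ids": list(ranked_external_ids),
        # Assumption: a higher value means the asset is labeled sooner.
        "priorities": [count - position for position in range(count)],
    }


# 3.jpg should be labeled first, then 1.jpg, then 2.jpg.
priority_args = priorities_from_ranking(["3.jpg", "1.jpg", "2.jpg"])
print(priority_args)

# With a configured client and an existing project, the result could be passed on:
# kili.update_properties_in_assets(project_id=project_id, **priority_args)
```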

For information on setting asset priorities, refer to our documentation.

Placing specific assets in the review queue

Once your basic workflow is set up, you can place specific labeled assets in the review queue. Because this requires the assets to be labeled, let's first simulate adding a label to one of our assets. The method returns the list of newly added label IDs.

kili.append_labels(
    project_id=project_id,
    asset_external_id_array=["4.jpg"],
    json_response_array=[{"JOB_0": {"categories": [{"confidence": 100, "name": "OBJECT_B"}]}}],
    label_type="DEFAULT",
)
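When simulating labels for several assets, writing each json_response by hand is error-prone. As a minimal sketch, the hypothetical helper below builds the same classification payload as the call above, along with aligned arrays for several assets. The external ID 5.jpg and the category OBJECT_A are assumptions made for illustration.

```python
def classification_response(job: str, category: str, confidence: int = 100) -> dict:
    """Build a single-category classification payload for one job."""
    return {job: {"categories": [{"confidence": confidence, "name": category}]}}


# Hypothetical mapping of asset external IDs to the category to assign.
simulated = {"4.jpg": "OBJECT_B", "5.jpg": "OBJECT_A"}

asset_external_id_array = list(simulated)
json_response_array = [
    classification_response("JOB_0", category) for category in simulated.values()
]
print(json_response_array[0])

# With a configured client and an existing project, these could be passed on:
# kili.append_labels(
#     project_id=project_id,
#     asset_external_id_array=asset_external_id_array,
#     json_response_array=json_response_array,
#     label_type="DEFAULT",
# )
```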

Now, let's place the labeled asset in the review queue. The method returns the project ID and the list of asset IDs placed in the review queue.

kili.add_to_review(project_id=project_id, external_ids=["4.jpg"])

For more information on asset statuses, refer to our documentation.

Sending an asset back to the labeling queue

You can also send specific labeled assets back to the labeling queue.

kili.send_back_to_queue(project_id=project_id, external_ids=["4.jpg"])
{'id': 'clm0sxbgb05hf082tcq813zrc', 'asset_ids': ['clm0sxcpf0003ojvzcrha754l']}

For more information on asset statuses, refer to our documentation.

Cleanup

We can remove the project that we created:

kili.delete_project(project_id)
'clm0sxbgb05hf082tcq813zrc'

Summary

Done!

We have learned how to handle the review workflow, set up consensus and honeypot in a project, assign specific labelers to specific assets, and prioritize assets in the labeling queue.