How to import assets and labels to a Kili project
In this tutorial, we will learn how to import assets, add metadata to them, and then import labels to your project.
Here are the steps that we will follow:
- Setting up a simple Kili project to work with
- Importing assets to Kili
- Adding metadata to assets
- Importing model-based pre-annotations into your project
- Importing pre-existing labels into your project
Setting up a simple Kili project to work with
Installing and instantiating Kili
First, let's install and import the required modules.
!pip install kili
from kili.client import Kili
import getpass
import os
Now, let's set up variables needed to create an instance of the Kili object.
We will need your API key and Kili's API endpoint.
If you are unsure how to look up your API key, refer to https://docs.kili-technology.com/docs/creating-an-api-key.
if "KILI_API_KEY" not in os.environ:
    KILI_API_KEY = getpass.getpass("Please enter your API key: ")
else:
    KILI_API_KEY = os.environ["KILI_API_KEY"]
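The same lookup can be wrapped in a small function, which makes it easier to reuse and to test without touching the real environment. This is only a sketch: `resolve_api_key` and its parameters are hypothetical names introduced here, not part of the Kili SDK.

```python
from typing import Callable, Mapping


def resolve_api_key(env: Mapping[str, str], prompt: Callable[[str], str]) -> str:
    """Return the API key from `env` if present, otherwise ask for it via `prompt`.

    Hypothetical helper: `env` is any mapping (for example os.environ) and
    `prompt` is any callable that asks the user for input (for example getpass.getpass).
    """
    key = env.get("KILI_API_KEY")
    if key:
        return key
    return prompt("Please enter your API key: ")
```

In the notebook it would be called as `resolve_api_key(os.environ, getpass.getpass)`.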
With variables set up, we can now create an instance of the Kili object.
kili = Kili(
    api_key=KILI_API_KEY,  # no need to pass the API_KEY if it is already in your environment variables
    # api_endpoint="https://cloud.kili-technology.com/api/label/v2/graphql",
    # the line above can be uncommented and changed if you are working with an on-premise version of Kili
)
Creating a basic Kili project
To create a Kili project, you must first set up its interface.
We will create a simple image project with just one classification job and two categories: OBJECT_A and OBJECT_B.
To learn more about Kili project interfaces, refer to https://docs.kili-technology.com/docs/customizing-project-interface.
interface = {
    "jobs": {
        "JOB_0": {
            "mlTask": "CLASSIFICATION",
            "required": 1,
            "content": {
                "categories": {"OBJECT_A": {"name": "Object A"}, "OBJECT_B": {"name": "Object B"}},
                "input": "radio",
            },
        }
    }
}
result = kili.create_project(
    title="Test Project",
    description="Project Description",
    input_type="IMAGE",
    json_interface=interface,
)
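If you need more categories, writing the interface dictionary by hand gets repetitive. Here is a minimal sketch of a builder that produces the same structure as the interface used in this tutorial. `build_classification_interface` is a hypothetical helper, not an SDK function, and it only covers the single radio-button classification job shown here.

```python
def build_classification_interface(categories, job_name="JOB_0"):
    """Build a single-job classification interface.

    `categories` maps each category key (e.g. "OBJECT_A") to its display name.
    Hypothetical helper mirroring the hand-written interface in this tutorial.
    """
    return {
        "jobs": {
            job_name: {
                "mlTask": "CLASSIFICATION",
                "required": 1,
                "content": {
                    "categories": {key: {"name": name} for key, name in categories.items()},
                    "input": "radio",
                },
            }
        }
    }


interface_sketch = build_classification_interface(
    {"OBJECT_A": "Object A", "OBJECT_B": "Object B"}
)
```

The resulting dictionary could be passed as `json_interface` in place of the literal one.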
For further processing, we will need to find out what our project ID is.
We can easily retrieve it from the project creation response message:
project_id = result["id"]
print("Project ID: ", project_id)
Project ID: cld90ffe80l8b0jn9d4vb44tn
Importing assets to Kili
Now, let's add some assets to be labeled.
We will use some free off-the-shelf examples from the Internet.
url1 = "https://storage.googleapis.com/label-public-staging/car/car_2.jpg"
url2 = "https://storage.googleapis.com/label-public-staging/car/car_1.jpg"
url3 = "https://storage.googleapis.com/label-public-staging/recipes/inference/black_car.jpg"
assets = kili.append_many_to_dataset(
    project_id=project_id,
    content_array=[url1, url2, url3],
    external_id_array=["image_1", "image_2", "image_3"],
)
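The rest of this tutorial addresses assets by their external IDs, so it can be worth checking the two arrays before uploading. The sketch below is a local sanity check only: `check_asset_batch` is a hypothetical helper, and the rules it enforces (equal lengths, no repeated IDs) are assumptions made for this tutorial rather than a description of what the SDK validates.

```python
from collections import Counter


def check_asset_batch(content_array, external_id_array):
    """Return (external_id, content) pairs, or raise ValueError on a malformed batch."""
    if len(content_array) != len(external_id_array):
        raise ValueError(
            f"{len(content_array)} contents but {len(external_id_array)} external IDs"
        )
    # Repeated IDs would make later lookups by external ID ambiguous.
    duplicates = sorted(k for k, n in Counter(external_id_array).items() if n > 1)
    if duplicates:
        raise ValueError(f"Duplicate external IDs: {duplicates}")
    return list(zip(external_id_array, content_array))
```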
If you prefer to add your own images, you can use a local file. The code would look similar to this:
# project_id is the ID of the project created above
assets = kili.append_many_to_dataset(
    project_id=project_id,
    content_array=["./image_1.jpeg"],  # path to local image
    external_id_array=["image_1"],
)
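When you have a whole folder of local images, the two arrays can be built from the file names. The following is a minimal sketch: `collect_local_images` and the suffix list are assumptions made for illustration, and using the file stem as the external ID is just one possible convention.

```python
from pathlib import Path

# Assumed set of image suffixes for this sketch; adjust to your own data.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def collect_local_images(folder):
    """Return (content_array, external_id_array) for the image files in `folder`."""
    paths = sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    content_array = [str(p) for p in paths]       # local file paths
    external_id_array = [p.stem for p in paths]   # file name without extension
    return content_array, external_id_array
```

Note that two files such as `car.jpg` and `car.png` would yield the same external ID with this convention.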
The procedure looks the same for most other data types, like PDFs or text. For more information on supported file formats, refer to our documentation.
Because videos and Rich Text assets may be more complex to import, we've created separate tutorials devoted to them:
- For information on importing video assets, refer to this tutorial.
- For information on importing Rich Text assets, see here.
For more information on importing assets, refer to our documentation.
Adding metadata to assets
In Kili, you can add extra information to an asset by using asset metadata. This can be the document language, custom quality metrics, agreement metrics, and so on. You can then use this information, for example, with Kili's advanced filters or for Optical Character Recognition.
Additionally, three specific metadata types can be presented to labelers in the Kili interface:
- imageUrl
- text
- url
First, let's set the data type for each of our metadata fields. The default data type is string, so you can skip this step, but setting some of your metadata as numbers can really help when you want to filter your assets later.
Note that we don't need to set data types for imageUrl, text, and url.
kili.update_properties_in_project(
    project_id=project_id,
    metadata_types={
        "customConsensus": "number",
        "sensitiveData": "string",
        "uploadedFromCloud": "string",
        "modelLabelErrorScore": "number",
    },
)
{'id': 'cld90ffe80l8b0jn9d4vb44tn',
'metadataTypes': {'sensitiveData': 'string',
'customConsensus': 'number',
'uploadedFromCloud': 'string',
'modelLabelErrorScore': 'number'}}
Now we can add metadata to our assets:
external_ids = ["image_1", "image_2"]
kili.update_properties_in_assets(
    project_id=project_id,
    external_ids=external_ids,
    json_metadatas=[
        {
            "customConsensus": 10,
            "sensitiveData": "yes",
            "uploadedFromCloud": "no",
            "modelLabelErrorScore": 50,
        },
        {
            "customConsensus": 40,
            "sensitiveData": "no",
            "uploadedFromCloud": "yes",
            "modelLabelErrorScore": 30,
        },
    ],
)
# Add metadata that will be visible to labelers in the labeling interface:
kili.update_properties_in_assets(
    project_id=project_id,
    external_ids=external_ids,
    json_metadatas=[
        {"imageUrl": "www.example.com/image.png", "text": "some text", "url": "www.example.com"},
        {"imageUrl": "www.example.com/image.png", "text": "some text", "url": "www.example.com"},
    ],
)
[{'id': 'cld90fmg100009nvzbjvf73jx'}, {'id': 'cld90fmg100019nvzirlsutxz'}]
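Since numeric metadata is only useful for filtering if the values really are numbers, a quick local check against the declared types can catch mistakes such as `"10"` instead of `10`. This is a sketch under stated assumptions: `check_metadata` is a hypothetical helper, and it only knows the "number" and "string" types used in this tutorial.

```python
from numbers import Number

# The three fields shown to labelers need no declared type (see above).
DISPLAY_FIELDS = {"imageUrl", "text", "url"}


def check_metadata(metadata, metadata_types):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    for key, value in metadata.items():
        if key in DISPLAY_FIELDS:
            continue
        expected = metadata_types.get(key, "string")  # string is the default type
        if expected == "number":
            ok = isinstance(value, Number) and not isinstance(value, bool)
        else:
            ok = isinstance(value, str)
        if not ok:
            problems.append(f"{key}: expected {expected}, got {type(value).__name__}")
    return problems


declared_types = {"customConsensus": "number", "sensitiveData": "string"}
```

For example, `check_metadata({"customConsensus": "10"}, declared_types)` reports one problem, while the metadata dictionaries used above pass cleanly.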
If you want to add metadata based on Optical Character Recognition, the process is slightly more complex. To help you with it, we've created a separate tutorial.
For more information on adding asset metadata, refer to our documentation.
Importing model-based pre-annotations into your project
When you import pre-annotations, you can use two types of labels: PREDICTION and INFERENCE.
PREDICTION-type annotations are displayed on the asset during the labeling process. Labelers can confirm what the model produced, edit it, and/or add new annotations.
Unlike PREDICTION-type annotations, INFERENCE-type annotations are not displayed on the asset during the labeling process.
INFERENCE-type labels are used for IoU (intersection over union) calculation when you want to benchmark your model predictions or labelers' work quality against ground truth.
For more information on Kili label types, refer to our documentation.
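To make the IoU metric mentioned above concrete, here is a minimal sketch that computes it for two axis-aligned bounding boxes. It only illustrates the general definition (area of overlap divided by area of union); it is not Kili's own implementation, and the `(x_min, y_min, x_max, y_max)` box format is an assumption made for this example.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlapping rectangle (zero if the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0
```

Two 2x2 boxes offset by one unit in each direction overlap on a 1x1 square, so their IoU is 1 / (4 + 4 - 1) = 1/7.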
First, let's prepare some fake predictions:
json_response_array = [
    {"JOB_0": {"categories": [{"confidence": 95, "name": "OBJECT_B"}]}},
    {"JOB_0": {"categories": [{"confidence": 79, "name": "OBJECT_A"}]}},
    {"JOB_0": {"categories": [{"confidence": 83, "name": "OBJECT_B"}]}},
]
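In practice, these responses would come from a model's output rather than being typed by hand. The sketch below converts `(category, confidence)` pairs into the same structure and checks each category against the project interface. Both helpers are hypothetical, and the 0 to 100 confidence range is an assumption based on the values used in this tutorial.

```python
def classification_response(category, confidence, job_name="JOB_0"):
    """Build one JSON response for a single-category classification job."""
    return {job_name: {"categories": [{"confidence": confidence, "name": category}]}}


def check_response(response, interface):
    """Raise ValueError if a category is unknown or a confidence is out of range."""
    for job_name, job in response.items():
        allowed = interface["jobs"][job_name]["content"]["categories"]
        for category in job["categories"]:
            if category["name"] not in allowed:
                raise ValueError(f"Unknown category: {category['name']}")
            if not 0 <= category["confidence"] <= 100:
                raise ValueError(f"Confidence out of range: {category['confidence']}")


# Trimmed-down copy of the tutorial interface, kept here so the sketch is self-contained.
sketch_interface = {
    "jobs": {"JOB_0": {"content": {"categories": {"OBJECT_A": {}, "OBJECT_B": {}}}}}
}
model_outputs = [("OBJECT_B", 95), ("OBJECT_A", 79), ("OBJECT_B", 83)]
responses = [classification_response(name, conf) for name, conf in model_outputs]
for response in responses:
    check_response(response, sketch_interface)
```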
Now, we'll upload them:
asset_external_id_array = ["image_1", "image_2", "image_3"]
kili.append_labels(
    json_response_array=json_response_array,
    model_name="MyModel",
    label_type="PREDICTION",
    project_id=project_id,
    asset_external_id_array=asset_external_id_array,
)
kili.append_labels(
    json_response_array=json_response_array,
    model_name="MyModel",
    label_type="INFERENCE",
    project_id=project_id,
    asset_external_id_array=asset_external_id_array,
)
[{'id': 'cld90g9tu0l690joi5hnbc3rj'},
{'id': 'cld90g9tu0l6a0joidtsk09q4'},
{'id': 'cld90g9tu0l6b0joi1qp69ipz'}]
You can add prediction-type labels directly, using the create_predictions method. The label_type will be assigned automatically as PREDICTION:
external_id_array = ["image_1", "image_2", "image_3"]
kili.create_predictions(
    project_id=project_id,
    external_id_array=external_id_array,
    json_response_array=json_response_array,
    model_name="MyModel",
)
{'id': 'cld90ffe80l8b0jn9d4vb44tn'}
Importing pre-existing labels into your project
You can also import labels that already exist. Here, we'll take the most recent label of each asset and add it back as a DEFAULT-type label, but only for assets whose external IDs are in our list.
external_id_array = ["image_1", "image_2", "image_3"]
assets_list = kili.assets(project_id=project_id)
for asset in assets_list:
    if asset["externalId"] in external_id_array:
        kili.append_labels(
            json_response_array=[asset["labels"][-1]["jsonResponse"]],
            label_type="DEFAULT",
            project_id=project_id,
            asset_external_id_array=[asset["externalId"]],
        )
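The loop above assumes that every matching asset already has at least one label. A slightly more defensive variant can first select the responses to import and skip assets without labels. This is a sketch only: `latest_responses` is a hypothetical helper that works on plain dictionaries shaped like the ones used in the loop above (`externalId`, `labels`, `jsonResponse`).

```python
def latest_responses(assets_list, external_ids):
    """Map external ID -> jsonResponse of the last label, for matching assets with labels."""
    wanted = set(external_ids)
    selected = {}
    for asset in assets_list:
        labels = asset.get("labels") or []
        if asset["externalId"] in wanted and labels:
            selected[asset["externalId"]] = labels[-1]["jsonResponse"]
    return selected


# Illustrative data only; in the tutorial this list would come from kili.assets(...).
sample_assets = [
    {"externalId": "image_1", "labels": [{"jsonResponse": {"JOB_0": "old"}},
                                         {"jsonResponse": {"JOB_0": "new"}}]},
    {"externalId": "image_2", "labels": []},
    {"externalId": "other", "labels": [{"jsonResponse": {"JOB_0": "x"}}]},
]
to_import = latest_responses(sample_assets, ["image_1", "image_2", "image_3"])
```

The keys and values of `to_import` could then be passed to `kili.append_labels` as `asset_external_id_array` and `json_response_array` in a single call.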
Cleanup
We can remove the project that we created:
kili.delete_project(project_id);
Summary
We've successfully set up a Kili project, imported assets to it, added metadata to our assets, and then imported various types of labels to our project. Well done!