How to import assets to a Kili project
In this tutorial, we will learn how to import assets to your project, and add metadata to those assets.
Here are the steps that we will follow:
- Setting up a simple Kili project to work with
- Importing assets to Kili
- Adding metadata to assets
Setting up a simple Kili project to work with
Installing and instantiating Kili
First, let's install and import the required modules.
%pip install kili
from kili.client import Kili
Now, let's set up variables needed to create an instance of the Kili object.
We will need your API key and Kili's API endpoint.
If you are unsure how to look up your API key, refer to https://docs.kili-technology.com/docs/creating-an-api-key.
kili = Kili(
    # api_endpoint="https://cloud.kili-technology.com/api/label/v2/graphql",
    # the line above can be uncommented and changed if you are working with an on-premise version of Kili
)
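If you would rather pass the credentials explicitly, the sketch below collects them from environment variables first. The helper name `kili_kwargs_from_env` is hypothetical, and the variable names `KILI_API_KEY` and `KILI_API_ENDPOINT` are assumptions chosen for this example; adjust them to match your setup.

```python
import os


def kili_kwargs_from_env(env=None):
    """Collect keyword arguments for the Kili client from environment variables.

    Hypothetical helper: the variable names below are assumptions for this sketch.
    """
    env = os.environ if env is None else env
    api_key = env.get("KILI_API_KEY")
    if not api_key:
        raise RuntimeError("KILI_API_KEY is not set")
    kwargs = {"api_key": api_key}
    # Only on-premise setups need a custom endpoint, so it stays optional.
    endpoint = env.get("KILI_API_ENDPOINT")
    if endpoint:
        kwargs["api_endpoint"] = endpoint
    return kwargs
```

The result can then be unpacked into the client, for example `kili = Kili(**kili_kwargs_from_env())`.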
Creating a basic Kili project
To create a Kili project, you must first set up its interface.
We will create a simple image project with just one simple classification job and two categories: OBJECT_A and OBJECT_B.
To learn more about Kili project interfaces, refer to https://docs.kili-technology.com/docs/customizing-project-interface.
interface = {
    "jobs": {
        "JOB_0": {
            "mlTask": "CLASSIFICATION",
            "required": 1,
            "isChild": False,
            "content": {
                "categories": {"OBJECT_A": {"name": "Object A"}, "OBJECT_B": {"name": "Object B"}},
                "input": "radio",
            },
        }
    }
}
project = kili.create_project(
    title="[Kili SDK Notebook]: Importing assets with metadata",
    description="Project Description",
    input_type="IMAGE",
    json_interface=interface,
)
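If you need the same kind of interface with a different set of categories, a small helper can generate it. This is a minimal sketch: `build_classification_interface` is a hypothetical name, and it only reproduces the single radio-button classification job used in this tutorial.

```python
def build_classification_interface(categories, job_name="JOB_0"):
    """Build a one-job classification interface like the one used in this tutorial.

    `categories` maps category keys (e.g. "OBJECT_A") to display names.
    Hypothetical helper, not part of the Kili SDK.
    """
    if not categories:
        raise ValueError("At least one category is required")
    return {
        "jobs": {
            job_name: {
                "mlTask": "CLASSIFICATION",
                "required": 1,
                "isChild": False,
                "content": {
                    "categories": {key: {"name": name} for key, name in categories.items()},
                    "input": "radio",
                },
            }
        }
    }


interface_sketch = build_classification_interface(
    {"OBJECT_A": "Object A", "OBJECT_B": "Object B"}
)
```

The resulting dictionary has the same shape as the `interface` defined above and could be passed as `json_interface` in the same way.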
For further processing, we will need to find out what our project ID is.
We can easily retrieve it from the project creation response message:
project_id = project["id"]
print("Project ID: ", project_id)
Project ID: cllamrwgl00670j393poh2t4j
Importing assets to Kili
Now, let's add some assets to be labeled.
We will use some free off-the-shelf examples from the Internet.
url1 = "https://storage.googleapis.com/label-public-staging/car/car_2.jpg"
url2 = "https://storage.googleapis.com/label-public-staging/car/car_1.jpg"
url3 = "https://storage.googleapis.com/label-public-staging/recipes/inference/black_car.jpg"
assets = kili.append_many_to_dataset(
    project_id=project_id,
    content_array=[url1, url2, url3],
    external_id_array=["image_1", "image_2", "image_3"],  # name to give to assets
)
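Because `content_array` and `external_id_array` are matched by position, it can help to check them before uploading. The sketch below is a local sanity check, assuming you want exactly one unique external ID per asset; `check_asset_batch` is a hypothetical helper, not an SDK function.

```python
from collections import Counter


def check_asset_batch(content_array, external_id_array):
    """Check that both lists line up and that external IDs are unique.

    Returns (external_id, content) pairs. Hypothetical helper for this sketch.
    """
    if len(content_array) != len(external_id_array):
        raise ValueError(
            f"{len(content_array)} contents but {len(external_id_array)} external IDs"
        )
    counts = Counter(external_id_array)
    duplicates = sorted(ext_id for ext_id, count in counts.items() if count > 1)
    if duplicates:
        raise ValueError(f"Duplicate external IDs: {duplicates}")
    return list(zip(external_id_array, content_array))


pairs = check_asset_batch(
    ["https://example.com/a.jpg", "https://example.com/b.jpg"],  # placeholder URLs
    ["image_a", "image_b"],
)
```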
At this point, you should be able to see your assets in your Kili project.
If you prefer to add your own images, you can use a local file. The code to do that would look similar to this:
project_id = 'project_id'  # replace with your own project ID
assets = kili.append_many_to_dataset(
    project_id=project_id,
    content_array=['./image_1.jpeg'],  # Path to local image
    external_id_array=['image_1']
)
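When you have a whole folder of local images, you can build both lists from the file names. The following is a sketch under a few assumptions: `collect_local_images` is a hypothetical helper, the list of suffixes is only an example, and each file stem is used as the external ID.

```python
from pathlib import Path

# Example suffixes only; extend this set to match the formats you use.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def collect_local_images(folder):
    """Return (content_array, external_id_array) for the images in `folder`.

    Files are sorted by path so that the order is reproducible.
    """
    paths = sorted(
        path
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )
    content_array = [str(path) for path in paths]
    external_id_array = [path.stem for path in paths]
    return content_array, external_id_array
```

The two returned lists can be passed as `content_array` and `external_id_array` in the call shown above. Note that files sharing a stem (for example `a.jpg` and `a.png`) would produce the same external ID.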
The procedure is the same for most other data types, such as PDFs or text. For more information on supported file formats, refer to our documentation.
Because videos and Rich Text assets may be more complex to import, we've created separate tutorials devoted to them:
- For information on importing video assets, refer to this tutorial.
- For information on importing Rich Text assets, see here.
For more information on importing assets, refer to our documentation.
Adding metadata to assets
In Kili, you can attach extra information to an asset by using asset metadata. Metadata can hold values such as the language a document was written in, custom quality metrics, or agreement metrics, and can be used with Kili's advanced filters. It can also contain text extracted from images or PDF documents using OCR (Optical Character Recognition), which is then shown in the Kili labeling interface.
Additionally, three specific metadata types can be used to present information to labelers in the Kili interface:
- imageUrl
- text
- url
Setting metadata properties
As an optional step, you can define properties for each type of your metadata. These properties allow you to control:
- The data type (string or number)
- Whether the metadata is filterable in the project queue
- Visibility of each metadata to labelers and reviewers
kili.update_properties_in_project(
    project_id=project_id,
    metadata_properties={
        "customConsensus": {
            "type": "number",
            "filterable": True,
            "visibleByLabeler": True,
            "visibleByReviewer": True,
        },
        "sensitiveData": {
            "type": "string",
            "filterable": True,
            "visibleByLabeler": False,  # Hide this from labelers
            "visibleByReviewer": True,
        },
        "uploadedFromCloud": {
            "type": "string",
            "filterable": True,
            "visibleByLabeler": True,
            "visibleByReviewer": True,
        },
        "modelLabelErrorScore": {
            "type": "number",
            "filterable": True,
            "visibleByLabeler": True,
            "visibleByReviewer": True,
        },
    },
)
Note: The previous metadata_types parameter is deprecated. Please use metadata_properties instead. If you use metadata_types, it will still work but will be converted to metadata_properties internally with default visibility and filterability settings.
If you don't specify all properties, default values will be used:
- filterable: true
- type: 'string'
- visibleByLabeler: true
- visibleByReviewer: true
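To see what a partially specified configuration amounts to, you can apply the defaults listed above yourself. This is only a local illustration of those documented defaults, not the SDK's internal logic; `with_default_properties` is a hypothetical helper.

```python
# Defaults as listed in this tutorial.
DEFAULT_PROPERTY = {
    "filterable": True,
    "type": "string",
    "visibleByLabeler": True,
    "visibleByReviewer": True,
}


def with_default_properties(metadata_properties):
    """Fill in missing settings for each metadata property with the defaults."""
    return {
        name: {**DEFAULT_PROPERTY, **settings}
        for name, settings in metadata_properties.items()
    }


expanded = with_default_properties(
    {
        "customConsensus": {"type": "number"},
        "sensitiveData": {"visibleByLabeler": False},
    }
)
```

Here `expanded["sensitiveData"]` keeps `visibleByLabeler` set to `False` and receives the default `"string"` type, while `expanded["customConsensus"]` keeps its `"number"` type and gets the default visibility and filterability.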
Now we can add metadata to our assets:
external_ids = ["image_1", "image_2"]
kili.update_properties_in_assets(
    project_id=project_id,
    external_ids=external_ids,
    json_metadatas=[
        {
            "customConsensus": 10,
            "sensitiveData": "yes",
            "uploadedFromCloud": "no",
            "modelLabelErrorScore": 50,
            "imageUrl": "https://placehold.co/600x400/EEE/31343C",
            "text": "Some text for asset 1",
            "url": "www.example-website.com",
        },
        {
            "customConsensus": 40,
            "sensitiveData": "no",
            "uploadedFromCloud": "yes",
            "modelLabelErrorScore": 30,
            "imageUrl": "https://placehold.co/600x400/EEE/31343C",
            "text": "Some text for asset 2",
            "url": "www.example-website.com",
        },
    ],
)
[{'id': 'cllams1oz0000jhvz4hyxy0en'}, {'id': 'cllams1p00001jhvz75twgx7d'}]
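Since `json_metadatas` is matched to `external_ids` by position, it is easy to misalign the two lists by hand. The sketch below builds the list from a dictionary keyed by external ID and flags values that do not match a declared type. Both helpers are hypothetical, and the type check is a local convenience that assumes undeclared keys default to `"string"`; it does not describe how Kili itself validates metadata.

```python
def build_json_metadatas(external_ids, metadata_by_id):
    """Return one metadata dict per external ID, in the same order."""
    missing = [ext_id for ext_id in external_ids if ext_id not in metadata_by_id]
    if missing:
        raise KeyError(f"No metadata provided for: {missing}")
    return [metadata_by_id[ext_id] for ext_id in external_ids]


def find_type_mismatches(json_metadatas, metadata_properties):
    """List (index, key) pairs whose value does not match the declared type."""
    mismatches = []
    for index, metadata in enumerate(json_metadatas):
        for key, value in metadata.items():
            declared = metadata_properties.get(key, {}).get("type", "string")
            # bool is a subclass of int in Python, so exclude it explicitly.
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if declared == "number" and not is_number:
                mismatches.append((index, key))
            elif declared == "string" and not isinstance(value, str):
                mismatches.append((index, key))
    return mismatches


ordered = build_json_metadatas(
    ["image_1", "image_2"],
    {
        "image_2": {"customConsensus": 40, "sensitiveData": "no"},
        "image_1": {"customConsensus": "10", "sensitiveData": "yes"},
    },
)
problems = find_type_mismatches(ordered, {"customConsensus": {"type": "number"}})
```

In this example `problems` contains `(0, "customConsensus")`, because the value for image_1 was given as the string `"10"` instead of a number.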
Note: alternatively, you can use the kili.set_metadata or kili.add_metadata methods.
In the labeling interface, we can see that the assets have some metadata (note that sensitiveData will be hidden from labelers based on our settings).
If you want to add metadata based on Optical Character Recognition, the process is slightly different. To help you with it, we've created a separate tutorial.
For more information on adding asset metadata, refer to our documentation.
Cleanup
We can remove the project that we created:
kili.delete_project(project_id)
Summary
We've successfully set up a Kili project, imported assets to it, and finally added some metadata to our assets with advanced property settings. Well done!