{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Managing Datacards" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "The Bailo Python client enables intuitive interaction with the Bailo service from within a Python environment. This example notebook runs through the following concepts:\n", "\n", "* Creating and populating a new datacard on Bailo.\n", "* Retrieving datacards from the service.\n", "* Making changes to the datacard.\n", "\n", "Prerequisites:\n", "\n", "* Python 3.9 or higher (including a notebook environment for this demo).\n", "* A local or remote Bailo service (see https://github.com/gchq/Bailo)." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Introduction" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "The Bailo Python client is split into two sub-packages: **core** and **helper**.\n", "\n", "* **Core:** For direct interactions with the service endpoints.\n", "* **Helper:** For more intuitive interactions with the service, using classes (e.g. `Datacard`) to handle operations.\n", "\n", "To create helper classes, you first need to instantiate a `Client()` object from the core. By default, this object does not use any authentication. However, Bailo also supports PKI authentication, which you can use from Python by passing a `PkiAgent()` object into the `Client()` object when you instantiate it." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Necessary import statements\n", "# Install dependencies...\n", "! 
pip install bailo\n", "\n", "from bailo import Datacard, Client\n", "\n", "# Instantiating the PkiAgent(), if using (this also requires `from bailo import PkiAgent`).\n", "# agent = PkiAgent(cert='', key='', auth='')\n", "\n", "# Instantiating the Bailo client (pass `agent=agent` as well if using the PkiAgent above)\n", "client = Client(\"http://127.0.0.1:8080\") # <- INSERT BAILO URL (if not hosting locally)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Creating a new datacard in Bailo\n", "\n", "### Creating and updating the base datacard\n", "\n", "In this section, we will create a new datacard using the `Datacard.create()` classmethod. On the Bailo service, a datacard must consist of at least 3 parameters upon creation. These are **name**, **description** and **visibility**; the example below does not pass **visibility** explicitly. We pass the `Client()` object created earlier when instantiating the new `Datacard()` object.\n", "\n", "NOTE: This creates the datacard on your Bailo service too! The `datacard_id` is assigned by the backend, and we will use this later to retrieve the datacard. *As with models on Bailo, the actual datacard has not been populated at this stage.*" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "datacard = Datacard.create(client=client, name=\"ImageNet\", description=\"ImageNet dataset consisting of images.\")\n", "\n", "datacard_id = datacard.datacard_id" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "You may make changes to these attributes and then call the `update()` method to relay the changes to the service, as below:\n", "\n", "```python\n", "datacard.name = \"New Name\"\n", "datacard.update()\n", "```\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Populating the datacard\n", "\n", "To populate a datacard, we first need to generate an empty card using the `card_from_schema()` method. In this instance, we will use **minimal-data-card-v10**. 
You can manage custom schemas using the `Schema()` helper class, but this is out of scope for this demo." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "datacard.card_from_schema(schema_id='minimal-data-card-v10')\n", "\n", "print(f\"Datacard version is {datacard.data_card_version}.\")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "If successful, the above will have generated an empty card for the datacard, and the `data_card_version` attribute should be set to 1.\n", "\n", "Next, we can populate the card using the `update_data_card()` method. This can be used any time you want to make changes, and the backend will create a new datacard version each time. We will retrieve the datacard from the service later in this notebook.\n", "\n", "NOTE: Your datacard must match the schema, otherwise an error will be thrown." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "new_card = {\n", "    'overview': {\n", "        'storageLocation': 'S3',\n", "    }\n", "}\n", "\n", "datacard.update_data_card(data_card=new_card)\n", "\n", "print(f\"Datacard version is {datacard.data_card_version}.\")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "If successful, the `data_card_version` will now be 2!" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Retrieving an existing datacard\n", "\n", "### Using the .from_id() method\n", "\n", "In this section, we will retrieve our previous datacard using the `Datacard.from_id()` classmethod. This will create your `Datacard()` object as before, but using existing information retrieved from the service." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "datacard = Datacard.from_id(client=client, datacard_id=datacard_id)\n", "\n", "print(f\"Datacard description: {datacard.description}\")" ] } ], "metadata": { "kernelspec": { "display_name": ".venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.12" }, "orig_nbformat": 4 }, "nbformat": 4, "nbformat_minor": 2 }