{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Managing Models & Releases (ResNet-50 Example with PyTorch)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "The Bailo Python client enables intuitive interaction with the Bailo service from within a Python environment. This example notebook will run through the following concepts:\n", "\n", "* Creating a new model on Bailo.\n", "* Creating and populating a model card.\n", "* Retrieving models from the service.\n", "* Making changes to the model and model card.\n", "* Creating and managing specific releases, with files.\n", "\n", "Prerequisites:\n", "\n", "* Python 3.9 or higher (including a notebook environment for this demo).\n", "* A local or remote Bailo service (see https://github.com/gchq/Bailo).\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Introduction" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "The Bailo Python client is split into two sub-packages: **core** and **helper**.\n", "\n", "* **Core:** For direct interactions with the service endpoints.\n", "* **Helper:** For more intuitive interactions with the service, using classes (e.g. Model) to handle operations.\n", "\n", "In order to create helper classes, you will first need to instantiate a `Client()` object from the core. By default, this object will not support any authentication. However, Bailo also supports PKI authentication, which you can use from Python by passing a `PkiAgent()` object into the `Client()` object when you instantiate it.\n", "\n", "**IMPORTANT: Select the relevant pip install command based on your environment.**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# LINUX - CPU (--extra-index-url keeps PyPI available for packages not hosted on the PyTorch index)\n", "! pip install bailo torch torchvision --extra-index-url https://download.pytorch.org/whl/cpu\n", "\n", "# MAC & WINDOWS - CPU\n", "#! 
pip install bailo torch torchvision" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Necessary import statements\n", "\n", "from bailo import Model, Client\n", "import torch\n", "from torchvision.models import resnet50, ResNet50_Weights\n", "\n", "# Instantiating the PkiAgent(), if using (PkiAgent must also be imported).\n", "# agent = PkiAgent(cert='', key='', auth='')\n", "\n", "# Instantiating the Bailo client\n", "\n", "client = Client(\"http://127.0.0.1:8080\") # <- INSERT BAILO URL (if not hosting locally)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Creating a new ResNet-50 model in Bailo\n", "\n", "### Creating and updating the base model\n", "\n", "In this section, we will create a new model representing ResNet-50 using the `Model.create()` classmethod. On the Bailo service, a model is described by at least 3 parameters upon creation: **name**, **description** and **visibility**. The call below passes only **name** and **description**, and does not set **visibility** explicitly. Other attributes like model cards, files, or releases are added later on. Below, we use the `Client()` object created earlier when instantiating the new `Model()` object.\n", "\n", "NOTE: This creates the model on your Bailo service too! The `model_id` is assigned by the backend, and we will use this later to retrieve the model."
 ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model = Model.create(client=client, name=\"ResNet-50\", description=\"ResNet-50 model for image classification.\")\n", "\n", "model_id = model.model_id" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "You may make changes to these attributes and then call the `update()` method to relay the changes to the service, as below:\n", "\n", "```python\n", "model.name = \"New Name\"\n", "model.update()\n", "```\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Creating and populating a model card\n", "\n", "When creating a model card, first we need to generate an empty one using the `card_from_schema()` method. In this instance, we will use the **minimal-general-v10** schema. You can manage custom schemas using the `Schema()` helper class, but this is out of scope for this demo." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model.card_from_schema(schema_id='minimal-general-v10')\n", "\n", "print(f\"Model card version is {model.model_card_version}.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Creating and populating a new model card from a template\n", "\n", "When creating a model card from a template, we need to use a pre-existing model card as our template. First we create a new model; then, to create its model card, we use the `card_from_template()` method and pass our chosen template model's ID. 
\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model2 = Model.create(\n", " client=client, name=\"ResNet-50\", description=\"ResNet-50 model for image classification.\"\n", ")\n", "\n", "model2_id = model2.model_id\n", "\n", "# Use the first model's card as the template for the second model's card.\n", "model2.card_from_template(model.model_id)\n", "\n", "print(f\"Model name: {model2.name}\")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "If successful, the steps above will have created a new model card for each model, and the `model_card_version` attribute of `model` should be set to 1.\n", "\n", "Next, we can populate the model card using the `update_model_card()` method. This can be used any time you want to make changes, and the backend will create a new model card version each time. We will learn how to retrieve model cards later (either the latest, or a specific release).\n", "\n", "NOTE: Your model card must match the schema, otherwise an error will be thrown." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "new_card = {\n", " 'overview': {\n", " 'tags': [],\n", " 'modelSummary': 'ResNet-50 model for image classification.',\n", " }\n", "}\n", "\n", "model.update_model_card(model_card=new_card)\n", "\n", "print(f\"Model card version is {model.model_card_version}.\")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "If successful, the `model_card_version` will now be 2!" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Retrieving an existing model\n", "\n", "### Using the .from_id() method\n", "\n", "In this section, we will retrieve our previous model using the `Model.from_id()` classmethod. This will create your `Model()` object as before, but using existing information retrieved from the service."
 ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model = Model.from_id(client=client, model_id=model_id)\n", "\n", "print(f\"Model description: {model.description}\")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "If successful, the model description we set earlier should be displayed above." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Creating and managing releases for models\n", "\n", "On the Bailo service, different versions of the same model are managed using **releases**. Generally, this is for code changes and minor adjustments that don't drastically change the behaviour of a model. In this section we will create a **release** and upload a file.\n", "\n", "### Creating a release\n", "\n", "`Release()` is a separate helper class in itself, but we can use our `Model()` object to create and retrieve releases. Running the below code will create a new release of the model, and return an instantiated `Release()` object which we will use to upload files." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "release_one = model.create_release(version='1.0.0', notes='Initial model weights.')" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Preparing the model weights using PyTorch\n", "\n", "In order to upload the ResNet-50 model to Bailo, we must first retrieve the weights from PyTorch and save them to a local file, **resnet50_weights.pth**, using the `torch.save()` method. The `Release.upload()` method can then be given the path to this file."
 ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Load ResNet-50 with its default pretrained weights.\n", "torch_model = resnet50(weights=ResNet50_Weights.DEFAULT)\n", "# Save only the state dict (the weights) to a local file.\n", "torch.save(torch_model.state_dict(), 'resnet50_weights.pth')" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Uploading weights to the release\n", "\n", "To upload files for a release, we can use the release `upload()` method, which here takes the path of the file to upload. In this case, we're using the **resnet50_weights.pth** file we prepared in the last step.\n", "\n", "NOTE: The `upload()` method can also take a `BytesIO` type containing the file contents, to allow for other integrations, such as with S3." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "release_one.upload(path=\"resnet50_weights.pth\")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Retrieving a release\n", "\n", "We can retrieve the latest release for our **ResNet-50** model using the model `get_latest_release()` method. Alternatively, we can retrieve a specific release using the model `get_release()` method. Both of these will return an instantiated `Release()` object." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "release_latest = model.get_latest_release()\n", "release_one = model.get_release(version='1.0.0')\n", "\n", "# To demonstrate this is the same release:\n", "if release_latest == release_one:\n", " print(\"Successfully retrieved identical releases!\")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Downloading weights from the release\n", "\n", "You can also download specific files from a release using the `download()` method. For example, the weights could be written to a new file: **bailo_resnet50_weights.pth**. 
**NOTE**: `filename` refers to the filename on Bailo, and `path` is the local destination for your download.\n", "\n", "In addition to this, you can also use the `download_all()` method by providing a local directory path as `path`. By default, this will download all files, but you can provide `include` and `exclude` lists, e.g. `include=[\"*.txt\", \"*.json\"]` to only include TXT or JSON files. The cell below uses `download_all()` to write the release files to a local **downloads** directory." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Alternative: download a single file to a chosen local path.\n", "# release_latest.download(filename=\"resnet50_weights.pth\", path=\"bailo_resnet50_weights.pth\")\n", "\n", "# Download every file in the release into the local downloads directory.\n", "release_latest.download_all(path=\"downloads\")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Loading the model using PyTorch\n", "\n", "Finally, now that we've retrieved the ResNet-50 weights from our Bailo release, we can load them in using the **torch** library." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Load the downloaded state dict and apply it to a freshly constructed ResNet-50.\n", "weights = torch.load(\"downloads/resnet50_weights.pth\")\n", "torch_model = resnet50()\n", "torch_model.load_state_dict(weights)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "If the message \"**All keys matched successfully**\" is displayed, we have successfully initialised our model." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Searching for models" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "In addition to fetching specific models, you can also use the `Model.search()` method to return a list of `Model()` objects that match your parameters. These parameters can be:\n", "\n", "* Task of the model (e.g. image classification).\n", "* Libraries used for the model (e.g. PyTorch).\n", "* Model card search (string to be found in model cards).\n", "\n", "In the below example, we will just search for all models with no filters."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "models = Model.search(client=client)\n", "\n", "print(models)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "We should now have a list of `Model()` objects." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.12" } }, "nbformat": 4, "nbformat_minor": 2 }