Structuring a Python package to run in Google Cloud ML Engine

Before getting started with the specifics of developing custom models for Criterion AI, we recommend reading the following three articles from Google's documentation on Cloud ML Engine:

  1. Developing a TensorFlow Training Application

  2. Packaging a Training Application

  3. Running a Training Job

These three articles give a solid understanding of the foundation that Criterion AI builds on, and the following sections will make more sense once you have read them. Start there, then come back here when you have a basic understanding of how training works in Google Cloud ML Engine.

Setting up your Python project

The first step in a custom model project is to create the appropriate structure for your Python package, following the recommendations of the second article. Create your setup.py file and place it in the root of your package directory, as shown in the figure from Google's article (reproduced below).

Below, you will see an example of what the setup.py file could look like. The example comes from Criterion AI's image classification model.

from setuptools import find_packages
from setuptools import setup

REQUIRED_PACKAGES = ['keras==2.2.4',
                     'h5py==2.8.0',
                     'Pillow==5.2.0',
                     'altair==2.3',
                     'scikit-learn',
                     'google-cloud-storage==1.13.2',
                     'scipy==1.2.0',
                     'imageio==2.5.0',
                     'gcsfs==0.2.1']

setup(
    name='criterion-packages-imageclassification',
    version='0.5',
    author='Michael Sass Hansen',
    author_email='[email protected]',
    packages=find_packages(),
    description='This model trains an image classifier.',
    include_package_data=True,
    install_requires=REQUIRED_PACKAGES
)

You can use any Python package available from the Python Package Index. Many relevant packages are already installed in Cloud ML Engine's runtime environment, so you do not need to list those in your setup.py file. Cloud ML Engine's Runtime Version List in Google's documentation gives an overview of the runtime versions on offer and the packages each one contains. It is a good idea to settle on a specific runtime version before you start developing your custom model.

The three major machine learning frameworks offered in ML Engine are TensorFlow, scikit-learn, and XGBoost. However, you can build your model using any framework of your choosing (e.g., PyTorch or Keras). The only requirement is that you output your model in TensorFlow's SavedModel format once it has completed training. One way to do that is to convert your model from a framework supported by the ONNX initiative to a protobuf file containing the TensorFlow SavedModel. The frameworks that ONNX supports are listed on their Supported Tools page.

Learn more about saving models in Criterion AI in the article named Serializing and storing models.

Training and testing scripts

After having set up an appropriate structure for your model, you can make use of all of the usual tools you know from Python to create your training and testing scripts. When publishing your custom model through Criterion AI, we require you to provide two modules: one for training and one for testing. The first module should make use of the dataset(s) provided by the user to train the actual model while the second module should use the trained model and perform a test on a specific dataset (also chosen by the user).

When the user clicks on the Starting training button from Criterion AI's web interface, the following command will be executed to initiate the training of your model.

gcloud ml-engine jobs submit training $JOB_NAME \
        --scale-tier $SCALE_TIER \
        --package-path $TRAINER_PACKAGE_PATH \
        --module-name $MAIN_TRAINER_MODULE \
        --job-dir $JOB_DIR \
        --region $REGION \
        --python-version $PYTHON_VERSION \
        --runtime-version $RUNTIME_VERSION \
        -- \
        --model_id=$MODEL_ID

See the table below for an explanation of the variables.

Variable | Explanation
$JOB_NAME | The name of the job, generated automatically following the pattern job_[MODEL_ID_WITHOUT_HYPHENS]_[UNIX_TIMESTAMP].
$SCALE_TIER | The scale tier, which you choose when setting up the model through Criterion AI's web interface. The possible options are BASIC, BASIC_GPU, and BASIC_TPU.
$TRAINER_PACKAGE_PATH | The location of your custom model (packaged as a gzipped tarball) in Google Cloud Storage. You provide the path to your package (starting with gs://) when setting up your model in Criterion AI's web interface.
$MAIN_TRAINER_MODULE | The module of your training script. If you have followed the example laid out by Google (as seen in the figure at the beginning of this article), the value of this variable should be trainer.task. You decide this value when creating the model type in Criterion AI's web interface.
$JOB_DIR | The path to the bucket in Google Cloud Storage where you can store your model artifacts and other files. The value of the job directory follows the pattern gs://[MODEL_ID_WITH_HYPHENS].models.criterion.ai.
$REGION | Defaults to europe-west1 unless you have chosen BASIC_TPU as your scale tier, in which case the region is set to us-central1. Learn more about the regions supported by Cloud ML Engine on Google's documentation site.
$PYTHON_VERSION | The version of Python that Cloud ML Engine should use when running your script. You decide this value, which can be either 2.7 or 3.5.
$RUNTIME_VERSION | The version of the Cloud ML Engine runtime environment. You decide this value, and it should be one of the available versions from Cloud ML Engine's Runtime Version List.
$MODEL_ID | The ID of the model, generated automatically by Criterion AI.
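The two naming patterns in the table can be illustrated with a short Python sketch. It only mirrors the patterns as described above and is not Criterion AI's actual implementation; the function names make_job_name and make_job_dir are hypothetical, and the sketch assumes the timestamp is given in whole seconds.

```python
import time


def make_job_name(model_id, timestamp=None):
    """Follow the pattern job_[MODEL_ID_WITHOUT_HYPHENS]_[UNIX_TIMESTAMP]."""
    if timestamp is None:
        timestamp = time.time()
    # Assumption: the Unix timestamp is expressed in whole seconds.
    return "job_{}_{}".format(model_id.replace("-", ""), int(timestamp))


def make_job_dir(model_id):
    """Follow the pattern gs://[MODEL_ID_WITH_HYPHENS].models.criterion.ai."""
    return "gs://{}.models.criterion.ai".format(model_id)


if __name__ == "__main__":
    example_id = "cc2e041d-737e-428e-92e3-a981a291e45b"
    print(make_job_name(example_id, timestamp=1546300800))
    print(make_job_dir(example_id))
```

Knowing these patterns is mostly useful when you look for a job in the Cloud ML Engine job list or for its artifacts in Cloud Storage.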

From your training script, you will be able to retrieve the values of all of these variables. In addition to these, you will also be able to read information from a special file named info.json. This file will be placed in the root of the job directory ($JOB_DIR) and will contain information on which datasets the model should be trained on along with some URLs for posting back data to Criterion AI and settings provided by the user (described in the following section). Below, you can see an example of what an info.json file for a model based on Criterion AI's image classification model type might look like:

{
  "id": "cc2e041d-737e-428e-92e3-a981a291e45b",
  "name": "Sample Model",
  "datasets": [
    {
      "id": "8f1731e4-6606-4382-90a6-8308066ed498",
      "bucket": "gs://8f1731e4-6606-4382-90a6-8308066ed498.datasets.criterion.ai",
      "name": "Sample Dataset"
    }
  ],
  "settings": {
    "imageWidth": 224,
    "imageHeight": 224,
    "type": "rgb",
    "rotationRange": 0,
    "horizontalSymmetry": false,
    "verticalSymmetry": false,
    "classWeights": "",
    "ignoreClasses": "",
    "img_format": "jpeg"
  },
  "api_key": "2O57R13JR6hfV9k17X0zT59E",
  "host_name": "https://app.criterion.ai",
  "charts_url": "/api/models/cc2e041d-737e-428e-92e3-a981a291e45b/charts?key=2O57R13JR6hfV9k17X0zT59E",
  "complete_url": "/api/models/cc2e041d-737e-428e-92e3-a981a291e45b/complete?key=2O57R13JR6hfV9k17X0zT59E&modelPath=",
  "remote_url": "/api/models/cc2e041d-737e-428e-92e3-a981a291e45b/remotemonitor?key=2O57R13JR6hfV9k17X0zT59E"
}

Your training script should be able to run using the values of the variables provided via the gcloud command along with information from the info.json file.
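As a rough illustration, the entry point of a training module might start out like the sketch below. It assumes that Cloud ML Engine forwards --job-dir to the module along with the arguments listed after the bare -- in the command above; the function names are hypothetical. Fetching info.json from Cloud Storage (for example with gcsfs or google-cloud-storage, both listed in the setup.py example) is left out so that the sketch runs without credentials.

```python
import argparse
import json


def parse_training_args(argv=None):
    """Parse the arguments the training module is expected to receive."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--model_id", required=True)
    parser.add_argument("--job-dir", dest="job_dir", required=True)
    # Ignore arguments this sketch does not know about instead of failing.
    args, _unknown = parser.parse_known_args(argv)
    return args


def info_path(job_dir):
    """info.json is placed in the root of the job directory."""
    return job_dir.rstrip("/") + "/info.json"


def parse_info(text):
    """Return the dataset buckets and user settings from info.json content."""
    info = json.loads(text)
    buckets = [dataset["bucket"] for dataset in info.get("datasets", [])]
    settings = info.get("settings", {})
    return buckets, settings


if __name__ == "__main__":
    args = parse_training_args([
        "--model_id=cc2e041d-737e-428e-92e3-a981a291e45b",
        "--job-dir=gs://cc2e041d-737e-428e-92e3-a981a291e45b.models.criterion.ai",
    ])
    print(info_path(args.job_dir))
    sample = '{"datasets": [{"bucket": "gs://example"}], "settings": {"imageWidth": 224}}'
    print(parse_info(sample))
```

Keeping the parsing separate from the file access makes the functions easy to test locally before submitting a job.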

Similarly, when a user initiates the creation of a test report from Criterion AI's web interface, the following command will be executed:

gcloud ml-engine jobs submit training $JOB_NAME \
        --scale-tier $SCALE_TIER \
        --package-path $TRAINER_PACKAGE_PATH \
        --module-name $MAIN_TESTER_MODULE \
        --job-dir $JOB_DIR \
        --region $REGION \
        --python-version $PYTHON_VERSION \
        --runtime-version $RUNTIME_VERSION \
        -- \
        --testreport_id=$TESTREPORT_ID \
        --info_file=$INFO_FILE

Many of the variables have the same values as in the command that starts the training script. The differences are listed in the table below.

Variable | Explanation
$MAIN_TESTER_MODULE | The module of your testing script. If you have followed the example laid out by Google (as seen in the figure at the beginning of this article), the value of this variable should be trainer.test. You decide this value when creating the model type in Criterion AI's web interface.
$TESTREPORT_ID | The ID of the test report, generated automatically by Criterion AI.
$INFO_FILE | The name of the info file required by your testing script, which is placed in the root of the $JOB_DIR. The name of this file follows the pattern testreport_[TEST_REPORT_ID_WITH_HYPHENS].json.

Just like the info file generated for training jobs, the info file for testing jobs includes the information needed by your testing script. Below, you can see an example of what this file might look like.

{
  "id": "ee4447ec-3719-4bb1-96c4-313335527a11",
  "dataset": {
    "id": "f01eda35-d537-4e5e-9298-6366ddb702f9",
    "bucket": "gs://f01eda35-d537-4e5e-9298-6366ddb702f9.datasets.criterion.ai",
    "name": "Test Dataset"
  },
  "api_key": "2O57R13JR6hfV9k17X0zT59E",
  "host_name": "https://app.criterion.ai",
  "charts_url": "/api/models/c0f4dcb9-7bee-4c5e-ae17-3d63aa43ac1b/testreports/ee4447ec-3719-4bb1-96c4-313335527a11/charts?key=2O57R13JR6hfV9k17X0zT59E",
  "complete_url": "/api/models/c0f4dcb9-7bee-4c5e-ae17-3d63aa43ac1b/testreports/ee4447ec-3719-4bb1-96c4-313335527a11/complete?key=2O57R13JR6hfV9k17X0zT59E"
}

Using this information, your testing script should be able to run successfully on the dataset chosen by the user.
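A testing module could begin along the lines of the sketch below. It parses the two arguments passed after the bare -- in the command above, locates the info file in the job directory, and builds absolute URLs from the info file. Two points are assumptions rather than documented behaviour: that --job-dir is forwarded to the module, and that the relative URLs are meant to be appended to host_name. The function names are hypothetical, and no request is sent because the payload format is not covered here.

```python
import argparse
import json


def parse_test_args(argv=None):
    """Parse the arguments the testing module is expected to receive."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--testreport_id", required=True)
    parser.add_argument("--info_file", required=True)
    parser.add_argument("--job-dir", dest="job_dir", required=True)
    args, _unknown = parser.parse_known_args(argv)
    return args


def info_file_path(args):
    """The info file is placed in the root of the job directory."""
    return args.job_dir.rstrip("/") + "/" + args.info_file


def full_url(info, key):
    """Assumption: relative URLs in the info file are relative to host_name."""
    return info["host_name"].rstrip("/") + info[key]


if __name__ == "__main__":
    args = parse_test_args([
        "--testreport_id=ee4447ec-3719-4bb1-96c4-313335527a11",
        "--info_file=testreport_ee4447ec-3719-4bb1-96c4-313335527a11.json",
        "--job-dir=gs://c0f4dcb9-7bee-4c5e-ae17-3d63aa43ac1b.models.criterion.ai",
    ])
    print(info_file_path(args))
    info = json.loads(
        '{"host_name": "https://app.criterion.ai",'
        ' "charts_url": "/api/models/example/charts?key=EXAMPLE"}'
    )
    print(full_url(info, "charts_url"))
```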

Accepting settings from users

The info.json file generated when initiating the training of a model contains a settings section with values provided by the user who created the model. The user enters these values in Criterion AI's web interface and, as a custom model developer, you decide which settings your model needs. Criterion AI's web interface is powered by Vue.js and, to render the settings page, we rely on the vue-form-generator module to lay out the fields that the user should fill out.

On the documentation site of vue-form-generator, you can learn more about the schema that vue-form-generator uses to render the settings page. The schema lets you render UI elements for settings such as strings, numbers, Boolean values, and lists. See the vue-form-generator documentation for the full range of field types it offers.

You should design your schema as a JSON file and, once you create your custom model type in Criterion AI, you should provide the URL to the location of the JSON file. Upon rendering the settings page, Criterion AI will fetch your schema and render the UI according to your design.

Below, you can see an example of what a settings schema file might look like. Concretely, this is the settings schema for Criterion AI's image classification model.

{
  "fields": [{
    "type": "input",
    "inputType": "number",
    "label": "Image width",
    "model": "imageWidth",
    "default": 224,
    "required": true
  },
  {
    "type": "input",
    "inputType": "number",
    "label": "Image height",
    "model": "imageHeight",
    "default": 224,
    "required": true
  },
  {
    "type": "select",
    "label": "Type",
    "model": "type",
    "values": ["rgb", "grayscale"],
    "default": "rgb",
    "required": true
  },
  {
    "type": "input",
    "inputType": "number",
    "label": "By how many degrees can images be rotated?",
    "model": "rotationRange",
    "default": 0,
    "required": true
  },
  {
    "type": "checkbox",
    "label": "Can images be flipped horizontally?",
    "model": "horizontalSymmetry",
    "default": false,
    "required": true
  },
  {
    "type": "checkbox",
    "label": "Can images be flipped vertically?",
    "model": "verticalSymmetry",
    "default": false,
    "required": true
  },
  {
    "type": "select",
    "label": "Image format in production",
    "model": "img_format",
    "values": ["bmp", "jpeg", "png"],
    "default": "bmp",
    "required": true
  }]
}

The settings schema above will generate the UI seen in the screenshot below.

When a user creates a model and enters values for the settings your model requires, the settings will automatically be saved in Criterion AI and be included in the info.json file generated upon initiation of model training. From your training script, you will be able to read the settings and use them accordingly in your model.
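A training script may want to guard against settings that are missing from info.json. The sketch below fills in the defaults declared in a settings schema and checks required fields. It is a defensive pattern only, not a description of what Criterion AI does on its side, and the function names are hypothetical.

```python
def defaults_from_schema(schema):
    """Collect the default value of every schema field that declares one."""
    return {
        field["model"]: field["default"]
        for field in schema.get("fields", [])
        if "default" in field
    }


def resolve_settings(schema, user_settings):
    """Overlay user-provided settings on the schema defaults."""
    resolved = defaults_from_schema(schema)
    resolved.update(user_settings)
    missing = [
        field["model"]
        for field in schema.get("fields", [])
        if field.get("required") and field["model"] not in resolved
    ]
    if missing:
        raise ValueError("Missing required settings: " + ", ".join(missing))
    return resolved


if __name__ == "__main__":
    schema = {
        "fields": [
            {"type": "input", "model": "imageWidth", "default": 224, "required": True},
            {"type": "select", "model": "type", "default": "rgb", "required": True},
        ]
    }
    print(resolve_settings(schema, {"imageWidth": 128}))
```

Settings that are not described in the schema are passed through unchanged, so extra keys in info.json do not cause an error.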

Packaging your Python project

Once you have developed your training and testing scripts and defined your settings schema, you can use Python's built-in tools for packaging projects. From the root of your project folder, call:

  • python setup.py sdist

to generate a gzipped tarball containing the source of your custom model. The resulting .tar.gz file is placed in the dist folder. You will have to upload this file to Google Cloud Storage so that Criterion AI can retrieve it when initiating the training or testing of a model.
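The upload can be done in several ways, for instance from the Cloud Console. If you prefer to script it, the following sketch picks the most recently built tarball in dist and uploads it with the google-cloud-storage client library. The bucket name, the packages prefix, and the function names are placeholders chosen for illustration, and the upload assumes that Google Cloud credentials are already configured on your machine.

```python
import glob
import os


def newest_tarball(dist_dir="dist"):
    """Return the most recently modified .tar.gz file in dist_dir."""
    candidates = glob.glob(os.path.join(dist_dir, "*.tar.gz"))
    if not candidates:
        raise IOError("No .tar.gz file found in " + dist_dir)
    return max(candidates, key=os.path.getmtime)


def upload_package(local_path, bucket_name, prefix="packages"):
    """Upload the tarball and return its gs:// path.

    bucket_name and prefix are hypothetical; use a bucket you control.
    """
    # Imported here so the helper above works without the library installed.
    from google.cloud import storage

    client = storage.Client()
    blob_name = "{}/{}".format(prefix, os.path.basename(local_path))
    blob = client.bucket(bucket_name).blob(blob_name)
    blob.upload_from_filename(local_path)
    return "gs://{}/{}".format(bucket_name, blob_name)


# Example usage (not executed here because it needs credentials):
#   path = upload_package(newest_tarball(), "my-example-bucket")
#   print(path)  # the gs:// path to enter in Criterion AI's web interface
```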

Learn more about building packages for Cloud ML Engine in the section named Building your package manually in Google's documentation.

Setting up your custom model in Criterion AI

Once you have created the first version of your custom model and wrapped it into a Python package along with your settings schema, you can set up a model type referencing your custom model in Criterion AI. To learn how to set up a model type in Criterion AI, please see the next article in this series.
