Introduction to the developer APIs
Criterion AI is an extensible platform that allows you to make use of Criterion AI's infrastructure for your own custom models. In these articles, you will learn how the Criterion AI platform is structured, how you can leverage the platform and how to get started writing your very own model.
Criterion AI is based on Google Cloud Platform and makes use of two key services for storing data, training models and hosting artifacts: Google Cloud Storage and Google Cloud ML Engine. That means that you, as a model developer, will mainly be interacting with these two services when developing custom models for Criterion AI. It also means that most of the documentation available on these two services from Google will be relevant for you. Read more about Google Cloud Storage and Google Cloud ML Engine on Google's documentation site.
Criterion AI itself consists mainly of a web application (hosted in Google App Engine) and a relational database (in Google Cloud SQL), which stores and serves metadata on users' datasets, models and deployments. As a custom model developer, you will be able to access and update the relevant metadata in Criterion AI through a set of RESTful APIs, which you will learn about in this series of articles.
Custom models as Python packages
Generally, the custom models you create will run as Python packages in Google Cloud ML Engine. A custom model will have access to a job bucket in Google Cloud Storage where it can store artifacts, serialized model files, temporary files and more. It can also report information back to Criterion AI (such as statistics on model training progress, Vega-Lite charts, files for generating a Facets Dive view, serialized model files and more). To report this information, your custom model must use the RESTful APIs, along with a secret API key unique to the model.
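As a rough illustration of reporting back over a RESTful API with a secret key, the sketch below builds a JSON POST request using only the Python standard library. The base URL, the route, the header name and the Bearer scheme are all assumptions made for this example; the actual endpoints and authentication details are described in the later API articles.

```python
import json
import urllib.request

# Hypothetical values: the real base URL, routes and authentication scheme
# are defined by the Criterion AI API articles, not by this sketch.
API_BASE_URL = "https://criterion.example.com/api"
API_KEY_HEADER = "Authorization"


def build_report_request(model_id, api_key, payload):
    """Build (but do not send) a POST request carrying a JSON payload."""
    url = f"{API_BASE_URL}/models/{model_id}/progress"  # hypothetical route
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            API_KEY_HEADER: f"Bearer {api_key}",  # assumed scheme
        },
    )


def send_report(request, timeout=10):
    """Send a prepared request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Keeping request construction separate from sending makes it possible to inspect or log what would be reported without touching the network, for example `build_report_request("my-model", key, {"epoch": 1, "loss": 0.5})`.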
In addition to being able to report back information, your model will be fed with settings chosen by the user via a file (called info.json), which is generated by Criterion AI and placed in the model's job bucket in Google Cloud Storage. This file can include settings that your model may require (as defined by you), the name and ID of the model, pointers to the datasets the model should be trained on, etc. Normally, the first step of a custom model would then be to fetch and parse this file to extract the settings and other variables that the model will need to run the training.
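The fetch-and-parse step for info.json could look something like the sketch below. The key names ("id", "name", "settings", "datasets") are hypothetical placeholders, since the actual layout of the file is not specified here. The download helper assumes the google-cloud-storage package is installed and that the caller knows the job bucket name.

```python
import json


def load_info_from_bucket(bucket_name, blob_path="info.json"):
    """Download info.json from the job bucket and return its text.

    Assumes the google-cloud-storage package is available; the bucket name
    and the blob path are supplied by the caller.
    """
    from google.cloud import storage  # imported lazily: optional dependency

    client = storage.Client()
    blob = client.bucket(bucket_name).blob(blob_path)
    return blob.download_as_text()


def parse_info(text):
    """Extract the pieces a training run would typically need."""
    info = json.loads(text)
    return {
        "model_id": info.get("id"),            # hypothetical key
        "model_name": info.get("name"),        # hypothetical key
        "settings": info.get("settings", {}),  # hypothetical key
        "datasets": info.get("datasets", []),  # hypothetical key
    }
```

Parsing is kept separate from downloading so that the parsing logic can be exercised locally with a hand-written JSON string before the package is run in the cloud.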
All of the custom models hosted in Criterion AI are Python packages stored as gzipped tarballs, which is Python's standard format for distributing packages. By default, you will be able to package up your solution by calling python setup.py sdist from the root of your package folder. The resulting .tar.gz file from the package command should include all of the code and assets required for your model to run.
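One way to check that a built .tar.gz really contains everything the model needs is to list its contents with Python's standard tarfile module. This is a minimal sketch, not part of the Criterion AI tooling; the file names used in the usage note are hypothetical.

```python
import tarfile


def list_sdist_files(archive_path):
    """Return the sorted paths of regular files stored in a gzipped tarball."""
    with tarfile.open(archive_path, mode="r:gz") as archive:
        return sorted(m.name for m in archive.getmembers() if m.isfile())


def missing_files(archive_path, required_suffixes):
    """Return the required path suffixes that no file in the archive matches."""
    files = list_sdist_files(archive_path)
    return [
        suffix
        for suffix in required_suffixes
        if not any(name.endswith(suffix) for name in files)
    ]
```

For example, `missing_files("dist/my_model-0.1.tar.gz", ["setup.py", "trainer/task.py"])` would return an empty list if both (hypothetical) files made it into the archive.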
To learn how to get started with developing a custom model, go to the next article in this section: Structuring a Python package to run in Google Cloud ML Engine. After having read that, check out the subsequent articles in the Developers section. Through those articles, you will learn about all of the aspects of custom model development in Criterion AI.