Managing data via FTP
In addition to uploading data to your dataset via a web browser, you can manage your data using your favorite FTP client. Criterion AI exposes all datasets via FTPS (FTP over TLS with explicit encryption) to all editors and owners of datasets as well as global administrators of organizations. Compared to the web interface, the FTP interface makes it much easier to upload large amounts of data, move data around, delete data and integrate datasets with third-party systems. This article explains how to get started managing datasets over FTP.
First, you will need an FTP client that supports explicit FTPS, which secures the connection with TLS. Many such clients can be downloaded for free, including Cyberduck, FileZilla, Free FTP and WinSCP. This article uses Cyberduck.
After you have installed your FTP client, navigate to Criterion AI and locate the dataset you wish to manage. In the example below, we have found the dataset called NeurIPS 2018 Batch HW0001. You must be either an Editor or an Owner of the dataset you wish to manage or a global administrator in order to interact with it via FTP.
If you have one of the required roles, you can see the FTP connection info needed to log into your dataset from your FTP client. Click the Manage data link at the top right of the file and folder preview, then select Show FTP connection info to see your credentials.
Copy each of the three values (the hostname, the username and the password) into your FTP client.
In the example below, we are pasting the credentials from Criterion AI into our FTP client, Cyberduck. In most clients, you can store bookmarks with the credentials so you don't have to copy and paste the information every time you wish to manage your dataset.
Note that the password you have been given is unique to you, so you should not share it with colleagues. If you wish to collaborate with colleagues on a dataset, they should have their own user accounts in Criterion AI and obtain their own unique password for the dataset. Your password is tied to the roles and permissions you have been given, which means that if you lose access to a specific dataset in Criterion AI, you will no longer be able to interact with it via FTP.
Once you have created the bookmark, connect to the FTP server by double-clicking it in the overview, then enter the password you copied from Criterion AI.
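The same three values (hostname, username and password) can also be used from a script, which may be useful when integrating a dataset with another system. The following is a minimal sketch using Python's standard ftplib module, not an official Criterion AI client. It assumes the server listens on the conventional FTP port 21, which is not stated in the connection info described here, and the function names are purely illustrative.

```python
import ssl
from ftplib import FTP_TLS


def make_client(timeout: float = 30.0) -> FTP_TLS:
    """Create an unconnected explicit-FTPS client that verifies the server certificate."""
    context = ssl.create_default_context()  # checks the certificate chain and hostname
    return FTP_TLS(context=context, timeout=timeout)


def connect(host: str, user: str, password: str, port: int = 21) -> FTP_TLS:
    """Connect and log in with the values copied from Criterion AI.

    port=21 is an assumption (the usual port for explicit FTPS).
    """
    ftps = make_client()
    ftps.connect(host, port)
    ftps.login(user, password)  # upgrades the control channel to TLS before sending credentials
    ftps.prot_p()               # encrypt the data channel as well
    return ftps


def upload_file(ftps: FTP_TLS, local_path: str, remote_name: str) -> None:
    """Upload one local file into the current remote folder."""
    with open(local_path, "rb") as fh:
        ftps.storbinary(f"STOR {remote_name}", fh)
```

With this sketch, calling connect(hostname, username, password) and then nlst() on the returned object should list the names in the dataset root. Call quit() on it when you are done.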
After connecting, you will see an overview of all the files in your dataset. From here, you can upload more data, move data around, create new folders, delete folders, etc. The one unsupported operation is moving entire directories: to relocate data, move the files themselves rather than their containing folders.
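Because whole directories cannot be moved, relocating a folder from a script means creating the target folder and then renaming each file into it. The sketch below illustrates one way to do that with Python's ftplib, given an already connected and logged-in ftplib.FTP_TLS object. It assumes the source folder contains only files (no subfolders) and that the server answers a name listing for it. The function names are hypothetical.

```python
import posixpath
from ftplib import error_perm


def plan_file_moves(names, src_dir, dst_dir):
    """Return (old_path, new_path) pairs for each listed file name.

    Servers differ on whether a listing returns bare names or full paths,
    so only the final path component of each entry is used.
    """
    moves = []
    for name in names:
        base = posixpath.basename(name.rstrip("/"))
        if base in ("", ".", ".."):
            continue  # skip directory placeholders
        moves.append((posixpath.join(src_dir, base), posixpath.join(dst_dir, base)))
    return moves


def move_files(ftps, src_dir, dst_dir):
    """Move every file in src_dir into dst_dir, one rename per file."""
    try:
        ftps.mkd(dst_dir)
    except error_perm:
        pass  # assumption: the error means the target folder already exists
    moves = plan_file_moves(ftps.nlst(src_dir), src_dir, dst_dir)
    for old_path, new_path in moves:
        ftps.rename(old_path, new_path)  # file-level move, not a directory move
    return moves
```

The emptied source folder can then be deleted separately if it is no longer needed.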
Your connection to Criterion AI is encrypted with TLS (FTPS). In Cyberduck, you can verify that the connection is indeed encrypted by clicking the padlock in the bottom right-hand corner and inspecting the certificate that our server (ftp.criterion.ai) presents to you.
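A similar check can be done from a script. The sketch below, again using Python's standard library, upgrades the control connection to TLS and reads the certificate the server presents. Port 21 is an assumption, and the helper names are illustrative. The default SSL context rejects the connection if the certificate does not validate for the given hostname.

```python
import ssl
from ftplib import FTP_TLS


def fetch_certificate(host: str, port: int = 21, timeout: float = 30.0) -> dict:
    """Return the validated server certificate as a dict (port 21 is assumed)."""
    ftps = FTP_TLS(context=ssl.create_default_context(), timeout=timeout)
    try:
        ftps.connect(host, port)
        ftps.auth()  # explicit TLS upgrade; raises if the certificate is not trusted
        return ftps.sock.getpeercert()
    finally:
        ftps.close()


def summarize_certificate(cert: dict) -> dict:
    """Pick out the fields most useful for a quick manual check."""
    subject = {key: value for rdn in cert.get("subject", ()) for key, value in rdn}
    dns_names = [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]
    return {
        "common_name": subject.get("commonName"),
        "dns_names": dns_names,
        "expires": cert.get("notAfter"),
    }
```

For instance, summarize_certificate(fetch_certificate("ftp.criterion.ai")) should return the names the certificate was issued for and its expiry date, provided the server is reachable on the assumed port.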
This concludes the article on managing data via FTP.