Main terms and concepts

Common

API token

An authentication token is required to access the REST API. You can obtain it in the interface on the My space → API tokens page.

In an HTTP request, the API token is transmitted in the MLP-API-KEY header, whereas in a gRPC request, it is transmitted in the authToken parameter.
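
As a minimal sketch of the HTTP case, the snippet below builds a request that carries the token in the MLP-API-KEY header. The base URL, the endpoint path, the payload, and the token value are placeholders invented for illustration and are not taken from the Caila documentation. The request is only constructed, not sent.

```python
import json
import urllib.request

# Hypothetical values: replace them with your real API address and token.
BASE_URL = "https://caila.example.com"
API_TOKEN = "your-api-token"


def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Build a POST request that carries the API token in the MLP-API-KEY header."""
    return urllib.request.Request(
        url=BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "MLP-API-KEY": API_TOKEN,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# The path and payload below are assumptions made for this example only.
request = build_request("/hypothetical/predict", {"texts": ["hello"]})
```

Passing the request to urllib.request.urlopen would send it. For gRPC, the same token would go into the authToken parameter instead of a header.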

Fit method

The method is used for service training. It accepts a dataset and optionally a configuration as input, then performs fitting. Input data may include, for instance, examples of intents and their labels.

Fit configuration, training configuration

A list of parameters for fitting services with the fit method.

When starting fitting, you can manually set the fit configuration in JSON format or choose one from the list. The set of possible configurations is defined in the service settings.

Service	Examples of fit configurations
cross-validation-for-intents	hf_labse-linear, roberta-linear, roberta-metric, transformer-classifier-dl4j
transformer-classifier-dl4j	roberta, roberta-big-batch-parallel
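
The choice between a listed configuration and a manually written JSON one could be sketched roughly as follows. The preset names are taken from the table above, but the parameters inside them (epochs, batch_size) and the function name are invented for illustration. The real parameters depend on the service.

```python
import json

# Hypothetical preset list, as it might be defined in the service settings.
# The preset names come from the table above; the parameters are invented.
PRESETS = {
    "roberta": {"epochs": 3, "batch_size": 16},
    "roberta-big-batch-parallel": {"epochs": 3, "batch_size": 128},
}


def resolve_fit_config(value: str) -> dict:
    """Return a preset by name, or parse the value as a JSON object."""
    if value in PRESETS:
        return dict(PRESETS[value])
    config = json.loads(value)
    if not isinstance(config, dict):
        raise ValueError("fit configuration must be a JSON object")
    return config


# Choosing a configuration from the list.
preset_config = resolve_fit_config("roberta")
# Setting a configuration manually in JSON format.
manual_config = resolve_fit_config('{"epochs": 5, "batch_size": 32}')
```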

Predict method

The method used to get a prediction. It takes the body of a predict request and, optionally, a configuration as input, and returns a prediction in the response.

The input and output data format is specific to each particular service and is described in the service documentation.

Predict configuration, prediction configuration

A list of parameters for generating a prediction.

Task types

A type of ML task the service addresses, for example, text classification or typo correction. It determines the basic format of input and output data for the service. Services that perform tasks of the same type are interchangeable.

Service term

ML service, service (ML stands for Machine Learning)

A microservice aimed at solving ML tasks. Caila services are different from other microservices as they implement a strictly defined contract consisting of the fit and predict methods.
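
To make the contract more concrete, here is a minimal sketch of what a fit/predict interface could look like. The class and method signatures are hypothetical and do not reproduce the Caila SDK. The toy classifier simply remembers the most frequent label seen during fitting.

```python
from abc import ABC, abstractmethod
from typing import Any, List, Optional, Tuple


class MLService(ABC):
    """Hypothetical interface mirroring the fit/predict contract."""

    @abstractmethod
    def fit(self, dataset: Any, config: Optional[dict] = None) -> None:
        """Train the service on a dataset, optionally using a fit configuration."""

    @abstractmethod
    def predict(self, request: Any, config: Optional[dict] = None) -> Any:
        """Return a prediction for the request body."""


class MajorityIntentClassifier(MLService):
    """Toy service: always predicts the most frequent label seen during fit."""

    def __init__(self) -> None:
        self.label: Optional[str] = None

    def fit(self, dataset: List[Tuple[str, str]], config: Optional[dict] = None) -> None:
        labels = [label for _text, label in dataset]
        self.label = max(set(labels), key=labels.count)

    def predict(self, request: List[str], config: Optional[dict] = None) -> List[str]:
        if self.label is None:
            raise RuntimeError("fit must be called before predict")
        return [self.label for _ in request]


service = MajorityIntentClassifier()
service.fit([("hi", "greeting"), ("hello", "greeting"), ("bye", "farewell")])
predictions = service.predict(["good morning", "see you"])
```

Because every service exposes the same two methods, a caller can swap one service for another of the same task type without changing the calling code.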

Services on the Caila platform implement various functions, for example, text classification or typo correction. Developers create services and publish them in the catalogue. The services are then launched on the Caila servers, which makes them available for use via the web interface, API, or SDK.

A service is a collection of program code (image), data (data image), configuration and deployment parameters, public visibility settings, and documentation. In other words, a service is an assembled, configured, and running program that is available for use by the developer themselves and other Caila users.

Creating a service

Service types

By readiness

Fittable service, basic service

A service that implements the fit training function. As a result of fitting, a new service is created. Fittable services are used as generators of other, fitted services. Fittable services by themselves cannot be used to perform practical tasks.

Fitted service

A service that implements the predict function. Fitted services are ready to perform a practical task, such as text classification or typo correction.

Ready-made service

A fitted service from the catalogue that also implements the predict function. A ready-made service can be used without any setup to perform a practical task, such as text classification or typo correction.

Derived service

A service that is a copy of some fittable service. After creation, such a service can be fitted independently of the service it was based on.

By the necessity to call other services

Plain service

A service that is used via the predict method. All the data that is required for this service is contained in the image and/or passed in the service’s init and predict configurations.

Composite service

A service that makes calls to other services within its logic. A service developer should configure access to third-party services. For users of a composite service, it is no different from a plain one — it also performs some practical task.
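
The idea of a composite service can be illustrated with a small sketch in which one predict call chains two other services. Everything here is an assumption made for illustration: the two inner services are local stand-in functions with trivial logic, whereas on the platform they would be separate services called over the network.

```python
from typing import Callable, List

# A plain service is modelled here as a function from texts to results.
Predict = Callable[[List[str]], List[str]]


def typo_corrector(texts: List[str]) -> List[str]:
    """Stand-in for a typo correction service (hypothetical logic)."""
    fixes = {"helo": "hello"}
    return [" ".join(fixes.get(word, word) for word in text.split()) for text in texts]


def intent_classifier(texts: List[str]) -> List[str]:
    """Stand-in for a text classification service (hypothetical logic)."""
    return ["greeting" if "hello" in text.split() else "other" for text in texts]


class CompositeService:
    """Calls other services in order and returns the final result."""

    def __init__(self, steps: List[Predict]) -> None:
        self.steps = steps

    def predict(self, texts: List[str]) -> List[str]:
        result = texts
        for step in self.steps:
            result = step(result)
        return result


pipeline = CompositeService([typo_corrector, intent_classifier])
intents = pipeline.predict(["helo there", "goodbye"])
```

The caller sees a single predict method, which is why a composite service looks the same as a plain one from the outside.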

Service configuration

The term has two meanings: the init configuration and the full service configuration.

Init service configuration

A JSON object passed to a service at startup. It is used to pass variables that relate to the logic of a service, for example, the generation mode or the number of iterations.

Full service configuration

Includes:

init service configuration
environment variables (env variables)
service image
data images and mount parameters
resource groups
limits (restrictions on resource use)
other settings
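
The parts listed above can be pictured as one structure. The sketch below uses Python dataclasses purely as an illustration. All field names, image names, and values are hypothetical and do not reflect the actual configuration schema used by Caila.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Limits:
    """Restrictions on resource use for one service instance."""
    cpu: float
    memory_mb: int
    disk_mb: int
    gpu: int = 0


@dataclass
class DataImageMount:
    """A data image together with the path where it is mounted."""
    image: str
    mount_path: str


@dataclass
class FullServiceConfig:
    init_config: dict                  # variables related to service logic
    env: Dict[str, str]                # infrastructure variables
    service_image: str                 # Docker image with the service code
    data_images: List[DataImageMount]  # static resources and mount parameters
    resource_groups: List[str]         # where the service may be deployed
    limits: Limits
    other: dict = field(default_factory=dict)


config = FullServiceConfig(
    init_config={"iterations": 10},
    env={"SERVER_ADDRESS": "10.0.0.1"},
    service_image="registry.example.com/my-service:1.0",
    data_images=[DataImageMount("registry.example.com/weights-en:1.0", "/data")],
    resource_groups=["shared"],
    limits=Limits(cpu=2.0, memory_mb=4096, disk_mb=10240, gpu=1),
)
```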

Environment variables, env variables

Variables set in the environment in which a service starts. They are used to pass infrastructure settings, for example, a server address, an access password, or a GPU (video card) number.
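
The split between the two kinds of settings might look like this in service code: infrastructure values are read from environment variables, while values that affect service logic come from the init configuration JSON. The variable names (SERVER_ADDRESS, GPU_INDEX) and configuration keys are made up for this sketch.

```python
import json
from typing import Mapping


def load_settings(env: Mapping[str, str], init_config_json: str) -> dict:
    """Combine infrastructure settings (env) with logic settings (init config)."""
    init_config = json.loads(init_config_json)
    return {
        # Infrastructure values come from environment variables.
        "server_address": env.get("SERVER_ADDRESS", "localhost"),
        "gpu_index": int(env.get("GPU_INDEX", "0")),
        # Values that affect service logic come from the init configuration.
        "iterations": int(init_config.get("iterations", 1)),
        "mode": init_config.get("mode", "default"),
    }


# In a running container, os.environ would be passed as env; a plain dict
# is used here so that the example is reproducible.
settings = load_settings(
    {"SERVER_ADDRESS": "10.0.0.5", "GPU_INDEX": "1"},
    '{"iterations": 20, "mode": "greedy"}',
)
```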

Service image

A Docker image with the service program code built using the SDK.

Data image

A Docker image with static resources that a service can use. For example, data images can contain neural network weights or other large datasets that can be changed independently of the service image.

A single service image, for example, a transformer, can be combined with different data images: their contents can be trained for different languages or differ in size and resource requirements.

Resource groups

A compute resource pool that can be used to deploy a service. Resource groups can be either shared or dedicated to a specific account.

Limits

Resource limits allocated to a service instance: GPU, CPU, disk space, memory.

Fitting of services

Fitting process

The process of training a service on the input dataset, taking the fit configuration into account. As a result of the fitting process, Caila saves a file or a set of files to S3 storage, for example, a weight dump of the fitted service. You can later start a new service with this data. The fitting process can take a long time.

Currently, Caila offers two fitting modes: singleFit and multiFit. The mode specifies how many containers will be deployed once the training begins.

singleFit, fitting mode

After the fitting starts, one container with a service will be deployed. The fit and predict methods will be executed sequentially in this container.

As soon as the fit method starts executing, a new service is loaded into memory, that is, the service updates its state. Once the fit method has finished executing, the service becomes available for predict requests.

multiFit, fitting mode

After the fitting starts, two or more containers will be deployed. One is used only for fitting, that is, for calling the fit method. The others are used only for calling the predict method. Containers for the predict method become available only after the fitting is completed.

After the fit method has been executed, the fitted service is launched in a separate container. When this service starts, it receives a reference to the files prepared during the execution of the fit method.
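
The multiFit flow can be sketched as follows. This is a simplified simulation, not platform code: a dictionary stands in for S3 storage, the containers are ordinary Python objects, and the storage key and the lookup-table model are invented for the example.

```python
import json
from typing import Dict, List, Tuple

# In-memory stand-in for S3 storage: maps a reference (key) to file contents.
STORAGE: Dict[str, str] = {}


def fit(dataset: List[Tuple[str, str]], artifact_key: str) -> str:
    """Fitting container: train, save the result, return a reference to it."""
    lookup = {text: label for text, label in dataset}
    STORAGE[artifact_key] = json.dumps(lookup)
    return artifact_key


class PredictContainer:
    """Prediction container: started with a reference to the fitted files."""

    def __init__(self, artifact_key: str) -> None:
        self.model = json.loads(STORAGE[artifact_key])

    def predict(self, texts: List[str]) -> List[str]:
        return [self.model.get(text, "unknown") for text in texts]


# Step 1: the fitting container runs the fit method and stores the result.
reference = fit([("hi", "greeting"), ("bye", "farewell")], "fits/demo/model.json")

# Step 2: only after fitting completes are the predict containers started.
containers = [PredictContainer(reference) for _ in range(2)]
answers = [container.predict(["hi", "thanks"]) for container in containers]
```

In singleFit mode the same two steps would run one after the other inside a single container, with no separate predict containers.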

Dataset

A text file that is uploaded to Caila and has a specific format. The data from this file is used for training the service.

Dataset data type

A data format to be specified when uploading a dataset in the Caila interface, for example, json/texts, plain/texts, json/transformer-fit. Caila supports many types of data, as well as automatic conversion between them.
