Main terms and concepts
Common
API token
An authentication token is required to access the REST API. You can obtain it in the interface on the My space → API tokens page.
In an HTTP request, the API token is transmitted in the MLP-API-KEY header, whereas in a gRPC request, it is transmitted in the authToken parameter.
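As an illustration of the HTTP case, the following minimal sketch builds a request that carries the token in the MLP-API-KEY header. Only the header name comes from the text above; the endpoint URL, the payload shape, and the `build_request` helper are hypothetical placeholders, and the request is constructed but never sent.

```python
import json
import urllib.request

API_KEY_HEADER = "MLP-API-KEY"  # header name taken from the text above


def build_request(url: str, api_token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that carries the API token."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={API_KEY_HEADER: api_token, "Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical endpoint and payload, for illustration only.
request = build_request(
    "https://caila.example.invalid/predict",
    api_token="<your-api-token>",
    payload={"texts": ["hello"]},
)
```

Note that `urllib` normalizes the capitalization of header names when storing them; HTTP header names are case-insensitive, so this does not affect how the server reads the token.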
Fit method
The method is used for service training. It accepts a dataset and optionally a configuration as input, then performs fitting. Input data may include, for instance, examples of intents and their labels.
Fit configuration, training configuration
A list of parameters for fitting services with the fit method.
When starting fitting, you can manually set the fit configuration in JSON format or choose one from the list. The set of possible configurations is defined in the service settings.
| Service | Examples of fit configurations |
|---|---|
| cross-validation-for-intents | hf_labse-linear, roberta-linear, roberta-metric, transformer-classifier-dl4j |
| transformer-classifier-dl4j | roberta, roberta-big-batch-parallel |
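The two ways of specifying a fit configuration (picking a named one or supplying raw JSON) can be sketched as follows. The configuration names are taken from the table above, but the parameter names and values inside them (`epochs`, `batch_size`) and the `resolve_fit_config` helper are invented for illustration and do not describe real Caila settings.

```python
import json

# Names come from the table above; the parameter values are invented placeholders.
AVAILABLE_FIT_CONFIGS = {
    "roberta": {"epochs": 3},
    "roberta-big-batch-parallel": {"epochs": 3, "batch_size": 256},
}


def resolve_fit_config(choice: str) -> dict:
    """Return a fit configuration from either a preset name or a raw JSON object."""
    if choice in AVAILABLE_FIT_CONFIGS:
        # Copy so that callers cannot mutate the shared preset.
        return dict(AVAILABLE_FIT_CONFIGS[choice])
    try:
        config = json.loads(choice)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Unknown preset and invalid JSON: {choice!r}") from exc
    if not isinstance(config, dict):
        raise ValueError("A fit configuration must be a JSON object")
    return config


preset = resolve_fit_config("roberta")
manual = resolve_fit_config('{"epochs": 5}')
```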
Predict method
The method used for getting a prediction. It takes as input the body of a predict request and optionally a configuration. You get a prediction as a response.
The input and output data format is specific to each particular service and is described in the service documentation.
Predict configuration, prediction configuration
A list of parameters for generating a prediction.
Task types
The type of ML task that a service addresses, for example, text classification or typo correction. It determines the basic format of input and output data for the service. Services that perform tasks of the same type are interchangeable.
Service term
ML service, service (ML stands for Machine Learning)
A microservice aimed at solving ML tasks. Caila services differ from other microservices in that they implement a strictly defined contract consisting of the fit and predict methods.
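The fit/predict contract can be pictured with a small sketch. The class names, method signatures, and the toy classifier below are hypothetical and are not the Caila SDK; for brevity, one object plays both the fittable and the fitted role, whereas on the platform fitting produces a new service.

```python
from abc import ABC, abstractmethod
from collections import Counter
from typing import Any, Optional


class MLService(ABC):
    """Hypothetical base class mirroring the fit/predict contract."""

    @abstractmethod
    def fit(self, dataset: Any, config: Optional[dict] = None) -> None:
        """Train the service on a dataset, optionally using a fit configuration."""

    @abstractmethod
    def predict(self, request: Any, config: Optional[dict] = None) -> Any:
        """Return a prediction for the request body."""


class MajorityLabelService(MLService):
    """Toy classifier that always predicts the most frequent training label."""

    def __init__(self) -> None:
        self.label: Optional[str] = None

    def fit(self, dataset, config=None):
        # dataset is a list of (text, label) pairs
        labels = [label for _, label in dataset]
        self.label = Counter(labels).most_common(1)[0][0]

    def predict(self, request, config=None):
        if self.label is None:
            raise RuntimeError("Service has not been fitted yet")
        # request is a list of texts; one label is returned per text
        return [self.label for _ in request]


service = MajorityLabelService()
service.fit([("hi", "greeting"), ("hello", "greeting"), ("bye", "farewell")])
result = service.predict(["hey", "see you"])
```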
Services on the Caila platform implement various functions, for example, text classification or typo correction. Developers create services and publish them in the catalogue. The services are then launched on the Caila servers, which makes them available for use via a web interface, API, or SDK.
A service is a collection of program code (image), data (data image), configuration and deployment parameters, public visibility settings, and documentation. In other words, a service is an assembled, configured, and running program that is available for use by the developer themselves and other Caila users.
Service types
By readiness
Fittable service, basic service
A service that implements the fit training function. As a result of fitting, a new service is created. Fittable services are used as generators of other, fitted services. Fittable services by themselves cannot be used to perform practical tasks.
Fitted service
A service that implements the predict function. Fitted services are ready to perform a practical task, such as text classification or typo correction.
Ready-made service
A fitted service from the catalogue that also implements the predict function. A ready-made service can be used without any setup to perform a practical task, such as text classification or typo correction.
Derived service
A service that is a copy of some fittable service. After creation, such a service can be fitted independently of the service it was based on.
By the necessity to call other services
Plain service
A service that is used via the predict method. All the data that is required for this service is contained in the image and/or passed in the service’s init and predict configurations.
Composite service
A service that makes calls to other services within its logic. A service developer should configure access to third-party services. For users of a composite service, it is no different from a plain one — it also performs some practical task.
Service configuration
The term can be used in two senses: the init configuration and the full service configuration.
Init service configuration
A JSON object passed to a service at startup. It is used to pass variables that relate to the logic of a service, for example, the generation mode or the number of iterations.
Full service configuration
Includes:
- init service configuration
- environment variables (env variables)
- service image
- data images and mount parameters
- resource groups
- limits (restrictions on resource use)
- other settings
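To make the listed parts easier to picture, here is a minimal sketch that groups them into one structure. Every field name, default, image name, and value below is a hypothetical placeholder chosen for illustration; the actual Caila configuration schema may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Limits:
    """Hypothetical resource limits for one service instance."""
    cpu: float = 1.0
    memory_mb: int = 1024
    disk_mb: int = 1024
    gpu: int = 0


@dataclass
class FullServiceConfig:
    """Hypothetical container for the parts of a full service configuration."""
    service_image: str
    init_config: dict = field(default_factory=dict)   # logic-related variables
    env: dict = field(default_factory=dict)           # infrastructure variables
    data_images: list = field(default_factory=list)   # (image, mount path) pairs
    resource_group: str = "shared"
    limits: Limits = field(default_factory=Limits)


config = FullServiceConfig(
    service_image="registry.example.invalid/my-classifier:1.0",
    init_config={"iterations": 10},
    env={"SERVER_ADDRESS": "10.0.0.1"},
    data_images=[("registry.example.invalid/weights-en:1.0", "/data")],
    limits=Limits(memory_mb=4096, gpu=1),
)
```

Keeping the init configuration and the environment variables in separate fields mirrors the distinction drawn in this glossary between service logic and infrastructure settings.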
Environment variables, env variables
Environment variables that are set when a service starts. They are used to pass infrastructure settings, for example, a server address, an access password, or a video card number.
Service image
A Docker image with the service program code built using the SDK.
Data image
A Docker image with static resources that a service can use. For example, data images can contain neural network weights or other large datasets that can be changed independently of the service image.
There can be a single service image, for example, a transformer, but the data images can be different: they can be trained for different languages or have different sizes and resource requirements.
Resource groups
A compute resource pool that can be used to deploy a service. Resource groups can be either shared or dedicated to a specific account.
Limits
Resource limits allocated to a service instance: GPU, CPU, disk space, memory.
Fitting of services
Fitting process
Training a service on the input dataset, taking the fit configuration into account. As a result of the fitting process, Caila saves a file or set of files to S3 storage, for example, a weight dump of a fitted service. Later, you can start a new service with this saved data. The fitting process can take a long time.
Currently, Caila offers two fitting modes: singleFit and multiFit. The mode specifies how many containers will be deployed once the training begins.
singleFit, fitting mode
After the fitting starts, one container with a service will be deployed. The fit and predict methods will be executed sequentially in this container.
As soon as the fit method starts executing, a new service is loaded into memory, that is, the service updates its state. Once the fit method has finished, the service becomes available for predict requests.
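The sequential behaviour of singleFit can be modelled with a toy class. This is only a conceptual sketch: the class, its state names, and the vocabulary-based "model" are invented for illustration and do not reflect how Caila containers are implemented.

```python
class SingleFitContainer:
    """Toy model of one container that runs fit and then serves predict."""

    def __init__(self):
        self.state = "created"
        self.vocabulary = set()

    def fit(self, dataset):
        self.state = "fitting"
        # "Training": remember every word seen in the dataset.
        self.vocabulary = {word for text in dataset for word in text.lower().split()}
        self.state = "ready"

    def predict(self, text):
        if self.state != "ready":
            raise RuntimeError("predict is unavailable until fit has finished")
        # For each word, report whether it was seen during fitting.
        return [word in self.vocabulary for word in text.lower().split()]


container = SingleFitContainer()
container.fit(["hello world"])
known = container.predict("hello there")
```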
multiFit, fitting mode
After fitting starts, two or more containers will be deployed. One is used only for fitting, that is, for calling the fit method; the others are used only for calling the predict method. Containers for the predict method will only become available after the fitting is completed.
After executing the fit method, the fitted service is launched in a separate container. Upon starting this service, it is provided with a reference to files prepared during the execution of the fit method.
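The hand-off between the fitting container and the predict containers can be sketched as follows. An in-memory dictionary stands in for S3 storage, and the function, class, and artifact key names are hypothetical; the point is only that the predict side starts from a reference to the files produced by fit.

```python
import json

storage = {}  # in-memory stand-in for S3 storage


def run_fit_container(dataset, artifact_key):
    """Fit-only container: train, save the artifact, return a reference to it."""
    model = {text: label for text, label in dataset}  # toy lookup "model"
    storage[artifact_key] = json.dumps(model)
    return artifact_key


class PredictContainer:
    """Predict-only container started with a reference to the fit artifact."""

    def __init__(self, artifact_ref):
        self.model = json.loads(storage[artifact_ref])

    def predict(self, text):
        return self.model.get(text, "unknown")


artifact_ref = run_fit_container(
    [("hello", "greeting"), ("bye", "farewell")], artifact_key="fit-001"
)
# Predict containers are created only after fitting has completed.
replicas = [PredictContainer(artifact_ref) for _ in range(2)]
answers = [replica.predict("hello") for replica in replicas]
```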
Dataset
A text file that is uploaded to Caila and has a specific format. The data from this file is used for training the service.
Dataset data type
A data format to be specified when uploading a dataset in the Caila interface, for example, json/texts, plain/texts, json/transformer-fit. Caila supports many types of data, as well as automatic conversion between them.