Service settings
Parameter | Description |
---|---|
Launch configuration | A JSON object passed to a service at startup. It carries variables that relate to the logic of the service, for example, the generation mode or the number of iterations. |
Environment variables | Environment variables set when a service starts. They carry infrastructure settings, such as a server address, an access password, or a GPU number. |
Description | Description that will be displayed on the service card in the catalogue. |
Supported languages | Languages supported by the service. If there are multiple languages, please specify each one separately. |
Fittable | Indicates whether the service is fittable, that is, whether it can be trained. If you enable this option, select the training type. The type determines how many containers are deployed once training starts: • singleFit: one container is deployed with your service, and both the fit and predict methods are executed in it. • multiFit: two or more containers are deployed. One is used only for calling the fit method, and the others are used only for calling the predict method. The predict containers become available only after fitting is completed. |
Composite | Indicates whether the service is composite or simple: • A simple service is accessed via the predict method. All the data required for the operation of such a service is contained within the image or passed through configurations. • A composite service makes calls to other services within its logic. |
Task type | Type of the task to be solved. Select Misc or Other if other options do not apply. |
Timeouts | • Pod start timeout is the maximum time allowed for a service instance to start. • Predict timeout is the maximum time allowed for the service to execute a predict request. |
Data images | Docker images with static resources that a service can use, for example, neural network weights or other large datasets that can be changed independently of the service image. A data image must be added to Caila before you can select it from the list. In the Where field, specify the directory in the service container where the files will be mounted. |
Resource group | A set of servers allocated for running ML services for specific accounts. Resource groups can be either shared or dedicated to a specific account. Select one of the available resource groups or leave the default value. |
Resource limits | Resource limits allocated to a service instance: GPU, CPU, disk space, memory. |
Retries configuration | Parameters for resending requests to the service instance. A request may be resent if: 1) an error occurred while sending the request to the service instance, or 2) the service instance did not respond within the specified time. Specify the number of retries and the response timeouts for the service instance as a JSON object. |
Batches configuration | Parameters for batching requests: the maximum number of requests sent to the service in one batch and the time during which requests are accumulated. |
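
The retry and batching behaviour described in the last two rows can be sketched in Python as follows. This is a minimal illustration, not Caila's implementation: the function names and the parameters `max_retries`, `timeout_s`, `max_batch_size`, and `max_wait_s` are hypothetical and do not reflect the actual keys of the JSON configuration.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def call_with_retries(send: Callable[[float], R], max_retries: int, timeout_s: float) -> R:
    """Call send(timeout_s) once, then resend up to max_retries times.

    A resend happens when sending fails (ConnectionError) or when the
    instance does not answer in time (TimeoutError). Hypothetical helper.
    """
    if max_retries < 0:
        raise ValueError("max_retries must be non-negative")
    last_error: Exception = RuntimeError("request was never sent")
    for _attempt in range(1 + max_retries):
        try:
            return send(timeout_s)
        except (ConnectionError, TimeoutError) as err:
            last_error = err  # remember the failure and try again
    raise last_error


def group_into_batches(
    arrivals: Sequence[Tuple[float, T]],
    max_batch_size: int,
    max_wait_s: float,
) -> List[List[T]]:
    """Group (arrival_time, request) pairs, sorted by time, into batches.

    A batch is closed when it already holds max_batch_size requests or
    when the next request arrives more than max_wait_s seconds after
    the first request of the batch. Hypothetical helper.
    """
    batches: List[List[T]] = []
    current: List[T] = []
    window_start = 0.0
    for arrived_at, request in arrivals:
        batch_full = len(current) >= max_batch_size
        window_over = arrived_at - window_start > max_wait_s
        if current and (batch_full or window_over):
            batches.append(current)
            current = []
        if not current:
            window_start = arrived_at  # first request opens a new window
        current.append(request)
    if current:
        batches.append(current)
    return batches
```

For instance, with `max_batch_size=2` and `max_wait_s=1.0`, requests arriving at 0.0, 0.1, 0.2, and 1.5 seconds would be grouped into three batches: the first two together, then the third alone (its window expires before the fourth arrives), then the fourth.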