OpenAI adapter

Caila implements a dedicated endpoint for applications that are already integrated with the OpenAI API.

API

The API is available at: https://caila.io/api/adapters/openai

Supported methods:

  • chat/completions
  • completions
  • embeddings
  • models
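
Each method is exposed under the base URL above. For example, the chat/completions method corresponds to POST https://caila.io/api/adapters/openai/chat/completions, as the request examples below show.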

To access Caila services through the OpenAI adapter:

  • In the Authorization header, specify an API key created in Caila.
  • In the model field of the request, specify the model ID in the format <author>/<service>[/model] (a worked example follows this list), where:
    • author is the first part of the model identifier: the account name of the service owner (the account that published the service, not your own). For example, just-ai.
    • service is the name of the service in Caila. For example, openai-proxy.
    • model is an optional part that sets the model field sent in the request to the service. For example, gpt-4o.
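
For example, with model set to just-ai/openai-proxy/gpt-4o, the request is routed to the openai-proxy service published by the just-ai account, and gpt-4o is passed as the model value in the request to that service.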

Request examples

Qwen

curl https://caila.io/api/adapters/openai/chat/completions \
-H 'Authorization: <key from Caila>' \
-H 'Content-Type: application/json' \
-d '{"model":"just-ai/vllm-qwen1.5-32b-chat","messages":[{"role":"user","content":"Write a text of 100 words"}],"stream":true}'

ChatGPT

curl https://caila.io/api/adapters/openai/chat/completions \
-H 'Authorization: <key from Caila>' \
-H 'Content-Type: application/json' \
-d '{"model":"just-ai/openai-proxy/gpt-3.5-turbo","messages":[{"role":"user","content":"Write a text of 100 words"}],"stream":true}'

Technical details

The OpenAI adapter operates through the same interfaces as the Caila LLM API.

Upon receiving a request, the adapter parses the request body and splits it into a ChatCompletionRequest and a ChatCompletionConfig. The model, stream, and messages fields are placed in the predict-request, and all other fields in the predict-config; both are then sent to the service in this format. The response data from the service is passed back as is in the gpt-adapter response.
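
For illustration, a sketch of the split (temperature and max_tokens stand in for any extra OpenAI fields; the exact internal wire format is Caila's and is not shown here). Given this incoming body:

{"model":"just-ai/openai-proxy/gpt-3.5-turbo","messages":[{"role":"user","content":"Hi"}],"stream":true,"temperature":0.7,"max_tokens":100}

the fields would be distributed as follows:

predict-request: {"model":"just-ai/openai-proxy/gpt-3.5-turbo","messages":[{"role":"user","content":"Hi"}],"stream":true}
predict-config: {"temperature":0.7,"max_tokens":100}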