OpenAI adapter
Caila implements a dedicated endpoint for applications that are already integrated with the OpenAI API.
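Because the adapter mirrors the OpenAI API, most existing OpenAI clients can be pointed at Caila without code changes. As a minimal sketch, assuming your client library reads the standard OPENAI_API_KEY and OPENAI_BASE_URL environment variables (the official OpenAI SDKs do), switching over could look like this:
export OPENAI_API_KEY='<key from Caila>'
export OPENAI_BASE_URL='https://caila.io/api/adapters/openai'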
API
The API is available at: https://caila.io/api/adapters/openai
Supported methods:
- chat/completions
- completions
- embeddings
- models
To access Caila services through the OpenAI adapter:
- In the Authorization header, specify the API key created in Caila.
- In the model field of the request, specify the model ID in the format:
<author>/<service>[/model]
- author is the first part of the model identifier: the name of the account that published the service (not your own account, but the service owner's). For example, just-ai.
- service is the name of the service in Caila. For example, openai-proxy.
- model is an optional part that defines the value of the model field sent in the request to the service. For example, gpt-4o.
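For example, the ID just-ai/openai-proxy/gpt-4o routes the request to the openai-proxy service published by the just-ai account and sends gpt-4o as the model value. To check that your key works, you can call the models method; this sketch assumes it accepts the same Authorization header as the other methods:
curl https://caila.io/api/adapters/openai/models \
-H 'Authorization: <key from Caila>'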
Request examples
Qwen
curl https://caila.io/api/adapters/openai/chat/completions \
-H 'Authorization: <key from Caila>' \
-H 'Content-Type: application/json' \
-d '{"model":"just-ai/vllm-qwen1.5-32b-chat","messages":[{"role":"user","content":"Write a text of 100 words"}],"stream":true}'
ChatGPT
curl https://caila.io/api/adapters/openai/chat/completions \
-H 'Authorization: <key from Caila>' \
-H 'Content-Type: application/json' \
-d '{"model":"just-ai/openai-proxy/gpt-3.5-turbo","messages":[{"role":"user","content":"Write a text of 100 words"}],"stream":true}'
Technical details
The OpenAI adapter operates through the same interfaces as the Caila LLM API.
Upon receiving a request, the adapter parses the request body and splits it into ChatCompletionRequest and ChatCompletionConfig. The model, stream, and messages fields are placed in predict-request, and all other fields in predict-config; both are then sent to the service in this format.
The response data from the service is passed back in the gpt-adapter response as is.
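For illustration, here is a sketch of the split for a chat/completions request. The field placement follows the description above, but the exact wire format of predict-request and predict-config is not documented here, so treat the JSON shapes as assumptions:
Incoming request body:
{"model":"just-ai/openai-proxy/gpt-4o","messages":[{"role":"user","content":"Hi"}],"stream":true,"temperature":0.7,"max_tokens":256}
predict-request (model, stream, and messages; model carries the optional third part of the ID):
{"model":"gpt-4o","messages":[{"role":"user","content":"Hi"}],"stream":true}
predict-config (all other fields):
{"temperature":0.7,"max_tokens":256}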