This service supports two API-key modes. The mode is stored on the API key itself and controls how requests are handled.
Open Register, create a username and password, then sign in.
In the dashboard, open Create API Key and choose one of the modes below.
| Mode | Use it for | Main endpoint |
|---|---|---|
| emulator | Codex and GPT-5.4 emulation | /v1/responses |
| bypass | Raw Ollama-compatible chat requests | /v1/chat/completions |
If you want the service to use your own Ollama Cloud account, add one or more Ollama site API keys under Ollama Cloud Keys, then choose a preferred key and model. If no Ollama key or model is selected, the service falls back to the shared upstream configured on the server.
For every Ollama key the dashboard shows:
- Spent in 7 days
- Spent in 5 hours
- Last checked
- Status

These values are cached and are not refreshed more often than once per 10 minutes per key.
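The 10-minute refresh limit can be pictured as a simple per-key TTL cache. This is an illustrative sketch, not the service's actual implementation; the `fetch` callback and the stats shape are assumptions.

```python
import time

TTL_SECONDS = 600  # refresh at most once per 10 minutes per key

_cache = {}  # key_id -> (fetched_at, stats)

def get_stats(key_id, fetch, now=time.monotonic):
    """Return usage stats for key_id, refreshing no more than every TTL_SECONDS.

    fetch(key_id) is a hypothetical callback that queries the upstream account.
    """
    entry = _cache.get(key_id)
    t = now()
    if entry is not None and t - entry[0] < TTL_SECONDS:
        return entry[1]            # still fresh: serve the cached value
    stats = fetch(key_id)          # missing or stale: refresh once, then cache
    _cache[key_id] = (t, stats)
    return stats
```

Within the TTL window, repeated dashboard views serve the cached values rather than re-querying the upstream account.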
If the preferred Ollama key is exhausted, the service automatically tries the next non-exhausted key belonging to the same user.
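The fallback rule above can be sketched as follows. This is a hypothetical illustration of the documented behavior, not the service's code; the key-record fields (`id`, `exhausted`) are assumed names.

```python
def pick_key(keys, preferred_id):
    """Pick a usable Ollama key: the preferred one first, then the
    remaining non-exhausted keys belonging to the same user, in order.

    keys: list of dicts like {"id": str, "exhausted": bool}.
    Returns None when every key is exhausted (the caller would then
    fall back to the shared upstream).
    """
    # Stable sort: the preferred key moves to the front, others keep order.
    ordered = sorted(keys, key=lambda k: k["id"] != preferred_id)
    for key in ordered:
        if not key["exhausted"]:
            return key
    return None
```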
Use an emulator key when your client expects GPT-5.4-style Responses API behavior.
```shell
curl -sS https://llm.chat-artin.ru/v1/responses \
  -H 'Authorization: Bearer <your-emulator-key>' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "gpt-5.4",
    "input": [
      {
        "type": "message",
        "role": "user",
        "content": [{"type": "input_text", "text": "Reply with hello"}]
      }
    ],
    "stream": true
  }'
```
Use a bypass key when you want raw Ollama-compatible chat/completions responses without GPT-5.4 emulation.
```shell
curl -sS https://llm.chat-artin.ru/v1/chat/completions \
  -H 'Authorization: Bearer <your-bypass-key>' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "qwen3-coder-next:cloud",
    "messages": [
      {"role": "user", "content": "Reply with bypass-ok"}
    ]
  }'
```
- GET /healthz - service health check
- GET /v1/models - models list for the current key mode
- POST /v1/responses - GPT-5.4 emulator path
- POST /v1/chat/completions - raw bypass path