Service Guide

This service supports two API key modes. The mode is stored in the API key itself and determines how requests are handled.

1. Create an account

Open the registration page, choose a username and password, then log in.

2. Create an API key

In the dashboard, open the API key creation area and select one of the modes below.

Mode	Purpose	Primary endpoint
`emulator`	Codex and GPT-5.4 emulation	`/v1/responses`
`bypass`	Raw Ollama-compatible chat requests	`/v1/chat/completions`

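3. Optional: add your own Ollama Cloud key

If you want the service to use your own Ollama Cloud account, add one or more Ollama site API keys under Ollama Cloud Keys.

Then choose:

Which Ollama key is currently preferred
Which Ollama cloud model that key should be paired with

If no Ollama key or model is selected, the service falls back to the shared upstream configured on the server.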
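
The fallback rule above can be pictured with a small sketch. This is only an illustration, not the service's actual code: the `Upstream` type, the `resolve_upstream` function, and the placeholder values in `SHARED_UPSTREAM` are hypothetical, and the sketch assumes that a missing key or a missing model is each enough to trigger the fallback.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Upstream:
    api_key: str
    model: str
    source: str  # "user" or "shared"


# Hypothetical placeholder for the shared upstream configured on the server.
SHARED_UPSTREAM = Upstream(
    api_key="<server-shared-key>",
    model="<server-default-model>",
    source="shared",
)


def resolve_upstream(user_key: Optional[str], user_model: Optional[str]) -> Upstream:
    """Use the user's own key and model only when both are selected.

    Assumption: if either one is missing, fall back to the shared upstream.
    """
    if user_key and user_model:
        return Upstream(api_key=user_key, model=user_model, source="user")
    return SHARED_UPSTREAM
```

Calling `resolve_upstream(None, None)` returns the shared upstream, while passing both a key and a model returns an `Upstream` whose `source` is `"user"`.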

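4. Usage display and automatic switching

For each Ollama key, the dashboard shows:

Usage in the last 7 days
Usage in the last 5 hours
Last checked
Status

These values are cached and refreshed at most once every 10 minutes per key.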
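
A per-key cache with a 10-minute refresh limit might look roughly like the sketch below. The `UsageCache` class, its `fetch` callback, and the injectable `clock` are hypothetical names chosen for illustration; how the service actually stores or fetches usage data is not described here.

```python
import time
from typing import Callable, Dict, Tuple

REFRESH_INTERVAL_SECONDS = 10 * 60  # at most one refresh per key per 10 minutes


class UsageCache:
    """Caches usage data per key and limits how often it is refetched."""

    def __init__(
        self,
        fetch: Callable[[str], dict],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch  # hypothetical callback that loads usage for a key
        self._clock = clock
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def get(self, key_id: str) -> dict:
        now = self._clock()
        entry = self._entries.get(key_id)
        # Serve the cached value while it is younger than the refresh interval.
        if entry is not None and now - entry[0] < REFRESH_INTERVAL_SECONDS:
            return entry[1]
        usage = self._fetch(key_id)
        self._entries[key_id] = (now, usage)
        return usage
```

Each key has its own timestamp, so refreshing one key does not reset the timer of another.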

If the preferred Ollama key is exhausted, the service automatically tries the user's next non-exhausted key.
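
The switching rule can be sketched as a simple scan over the user's keys. The `OllamaKey` type and `pick_key` function are hypothetical, and two details are assumptions rather than documented behavior: "next" is taken to mean list order starting at the preferred key and wrapping around once, and `None` is returned when every key is exhausted.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OllamaKey:
    name: str
    exhausted: bool = False


def pick_key(keys: List[OllamaKey], preferred: str) -> Optional[OllamaKey]:
    """Return the preferred key if usable, otherwise the next non-exhausted key."""
    names = [k.name for k in keys]
    # Assumption: an unknown preferred name means "start from the first key".
    start = names.index(preferred) if preferred in names else 0
    # Walk the list starting at the preferred key, wrapping around once.
    for offset in range(len(keys)):
        candidate = keys[(start + offset) % len(keys)]
        if not candidate.exhausted:
            return candidate
    return None  # every key is exhausted (or the list is empty)
```

With keys `a`, `b` (exhausted), and `c`, a preference for `b` yields `c`.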

Using emulator mode

Use an emulator key when your client expects GPT-5.4-style Responses API behavior.

curl -sS https://llm.chat-artin.ru/v1/responses \
  -H 'Authorization: Bearer <your-emulator-key>' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "gpt-5.4",
    "input": [
      {
        "type": "message",
        "role": "user",
        "content": [{"type": "input_text", "text": "Reply with hello"}]
      }
    ],
    "stream": true
  }'
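
For clients written in Python, the same request can be built with the standard library. This is a minimal sketch that mirrors the curl call above; the helper names `build_responses_request` and `stream_lines` are hypothetical, and the exact format of the streamed lines is not specified in this guide, so the sketch simply yields each non-empty line as text.

```python
import json
import urllib.request

BASE_URL = "https://llm.chat-artin.ru"


def build_responses_request(api_key: str, prompt: str, stream: bool = True) -> urllib.request.Request:
    """Build (but do not send) a POST to /v1/responses with the curl payload."""
    payload = {
        "model": "gpt-5.4",
        "input": [
            {
                "type": "message",
                "role": "user",
                "content": [{"type": "input_text", "text": prompt}],
            }
        ],
        "stream": stream,
    }
    return urllib.request.Request(
        BASE_URL + "/v1/responses",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def stream_lines(request: urllib.request.Request):
    """Send the request and yield non-empty response lines as text."""
    with urllib.request.urlopen(request) as response:
        for raw in response:
            line = raw.decode("utf-8").strip()
            if line:
                yield line
```

A caller would then loop over `stream_lines(build_responses_request("<your-emulator-key>", "Reply with hello"))` and handle each line as it arrives.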

Using bypass mode

Use a bypass key if you want raw Ollama-compatible chat/completions responses directly, without GPT-5.4 emulation.

curl -sS https://llm.chat-artin.ru/v1/chat/completions \
  -H 'Authorization: Bearer <your-bypass-key>' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "qwen3-coder-next:cloud",
    "messages": [
      {"role": "user", "content": "Reply with bypass-ok"}
    ]
  }'
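
A Python version of the bypass call might look like the sketch below. The helper names are hypothetical, and `first_reply_text` assumes the response body follows the common OpenAI-style chat completions shape (`choices[0].message.content`); this guide does not spell out the response format, so check an actual response before relying on it.

```python
import json
import urllib.request

BASE_URL = "https://llm.chat-artin.ru"


def build_chat_request(
    api_key: str, content: str, model: str = "qwen3-coder-next:cloud"
) -> urllib.request.Request:
    """Build (but do not send) a POST to /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": content}],
    }
    return urllib.request.Request(
        BASE_URL + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def first_reply_text(body: dict) -> str:
    """Extract the assistant text, assuming an OpenAI-style response shape."""
    return body["choices"][0]["message"]["content"]


def send_chat(api_key: str, content: str) -> str:
    """Send the request and return the first reply (requires network access)."""
    with urllib.request.urlopen(build_chat_request(api_key, content)) as response:
        return first_reply_text(json.load(response))
```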

Mode rules

An emulator key is for /v1/responses
A bypass key is for /v1/chat/completions
If the endpoint does not match the key's mode, the service returns an explicit error message
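
A client can catch a mismatch before sending anything by checking the rules above locally. The mapping and the `endpoint_matches_mode` function below are a hypothetical client-side sketch; the service performs its own check and returns its own error message regardless.

```python
# POST endpoint expected for each key mode, taken from the rules above.
MODE_ENDPOINTS = {
    "emulator": "/v1/responses",
    "bypass": "/v1/chat/completions",
}


def endpoint_matches_mode(mode: str, path: str) -> bool:
    """Return True when the POST path is the one intended for this key mode."""
    return MODE_ENDPOINTS.get(mode) == path
```

For example, `endpoint_matches_mode("bypass", "/v1/responses")` is `False`, so the client can warn the user instead of waiting for the server's error.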

Available endpoints

GET /healthz - service health check
GET /v1/models - model list for the current key's mode
POST /v1/responses - GPT-5.4 emulation path
POST /v1/chat/completions - raw bypass path