API Compatibility
Gonka Broker API keys provide OpenAI-compatible access through the proxy. This page documents what is currently supported.
Base URL
https://proxy.gonkabroker.com/v1
Supported endpoints
| Endpoint | Status |
|---|---|
| POST /chat/completions | Supported |
| POST /completions | Supported (legacy) |
| GET /models | Supported |
| GET /test-auth | Supported (returns key status and current rate limit) |
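As a rough sketch of how the endpoints above combine with the base URL, the following builds (but does not send) requests with Python's standard library. The `Authorization: Bearer` header follows the usual OpenAI convention and is an assumption here, as are the helper name `build_request` and the placeholder key.

```python
import json
import urllib.request

BASE_URL = "https://proxy.gonkabroker.com/v1"


def build_request(method, path, api_key, body=None):
    """Build a urllib Request for an endpoint under BASE_URL (hypothetical helper)."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {
        # Assumption: Bearer-token auth, as in the OpenAI API.
        "Authorization": f"Bearer {api_key}",
    }
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


# Placeholder key; nothing is sent over the network here.
models_req = build_request("GET", "/models", "YOUR_API_KEY")
auth_req = build_request("GET", "/test-auth", "YOUR_API_KEY")
```

Passing either object to `urllib.request.urlopen` would perform the call; the exact response body of `/test-auth` is not specified on this page, so it is left unparsed here.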
Chat Completions parameters
The following parameters are supported in /v1/chat/completions requests:
| Parameter | Supported |
|---|---|
| model | Yes |
| messages | Yes |
| temperature | Yes |
| top_p | Yes |
| max_tokens | Yes |
| stream | Yes |
| stop | Yes |
| presence_penalty | Yes |
| frequency_penalty | Yes |
| tools | Yes |
| tool_choice | Yes |
| thinking | Yes (model-dependent) |
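One way to use the table above is as a client-side allowlist when assembling a request body. The sketch below is only illustrative: `build_chat_payload` is a hypothetical helper, and rejecting unlisted parameters is a choice made for this example, not documented proxy behavior.

```python
# Parameter names taken from the table above.
SUPPORTED_PARAMS = {
    "model", "messages", "temperature", "top_p", "max_tokens", "stream",
    "stop", "presence_penalty", "frequency_penalty", "tools", "tool_choice",
    "thinking",
}


def build_chat_payload(model, messages, **params):
    """Return a chat completions body, refusing parameters not in the table."""
    unknown = set(params) - SUPPORTED_PARAMS
    if unknown:
        raise ValueError(f"Parameters not listed as supported: {sorted(unknown)}")
    return {"model": model, "messages": messages, **params}


payload = build_chat_payload(
    "Qwen/Qwen3-235B-A22B-Instruct-2507-FP8",
    [{"role": "user", "content": "Hello"}],
    temperature=0.2,
    max_tokens=256,
    stop=["\n\n"],
)
```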
Thinking (extended reasoning)
Models that support extended thinking (such as Qwen3) accept the thinking parameter:
    {
      "model": "Qwen/Qwen3-235B-A22B-Instruct-2507-FP8",
      "messages": [{"role": "user", "content": "Solve this step by step: 23 * 47"}],
      "thinking": {"type": "enabled", "budget_tokens": 4096}
    }
To disable thinking on models that enable it by default:
    {
      "thinking": {"type": "disabled"}
    }
Whether thinking is supported depends on the specific model. The parameter is passed through to the network as-is.
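A small helper can make the two forms of the thinking parameter harder to mistype. This is a minimal sketch; `with_thinking` is a hypothetical name, and the only field shapes used are the ones shown in the JSON examples on this page.

```python
import copy


def with_thinking(payload, enabled, budget_tokens=None):
    """Return a copy of payload with the thinking parameter set."""
    result = copy.deepcopy(payload)  # leave the caller's dict untouched
    if enabled:
        thinking = {"type": "enabled"}
        if budget_tokens is not None:
            thinking["budget_tokens"] = budget_tokens
        result["thinking"] = thinking
    else:
        result["thinking"] = {"type": "disabled"}
    return result


base = {
    "model": "Qwen/Qwen3-235B-A22B-Instruct-2507-FP8",
    "messages": [{"role": "user", "content": "Solve this step by step: 23 * 47"}],
}
thinking_on = with_thinking(base, True, budget_tokens=4096)
thinking_off = with_thinking(base, False)
```

Because the parameter is passed through as-is, a model that does not support thinking may ignore or reject it; this helper does not check for that.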
Message content format
The content field in messages supports two formats:
- String: plain text value ("content": "Hello")
- Array: structured content parts ("content": [{"type": "text", "text": "Hello"}])
Both formats are fully supported. However, only text content parts are available — image and other multimodal content types are not supported.
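To illustrate the two content formats, the sketch below reduces either one to a plain string and raises an error for any non-text part. The function name `content_to_text` is hypothetical, and the `image_url` part type mentioned in the note after the code is borrowed from the OpenAI format as an example of unsupported input.

```python
def content_to_text(content):
    """Flatten a message's content (string or list of parts) to plain text."""
    if isinstance(content, str):
        return content
    texts = []
    for part in content:
        if part.get("type") != "text":
            # Only text parts are supported per this page.
            raise ValueError(f"Unsupported content part type: {part.get('type')!r}")
        texts.append(part["text"])
    return "".join(texts)


as_string = content_to_text("Hello")
as_parts = content_to_text([{"type": "text", "text": "Hello"}])
```

Both calls produce the same string, so a check like this can run before sending to catch `image_url` parts early.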
Response format
Responses follow the OpenAI Chat Completions response format:
- id: unique response identifier
- object: "chat.completion"
- choices: array of completion choices
- usage: token usage statistics (prompt_tokens, completion_tokens, total_tokens)
Streaming responses use Server-Sent Events (SSE), matching the OpenAI streaming format. The final chunk of every stream includes a usage object with token counts.
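A streamed response can be consumed line by line. The sketch below assumes the usual OpenAI conventions (each event is a `data: <json>` line, text arrives in `choices[].delta.content`, and the stream ends with `data: [DONE]`); the sample lines are made up for illustration and are not captured proxy output.

```python
import json


def read_stream(lines):
    """Collect generated text and the usage object from OpenAI-style SSE lines."""
    text_parts = []
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            if delta.get("content"):
                text_parts.append(delta["content"])
        if chunk.get("usage"):
            usage = chunk["usage"]  # expected on the final chunk
    return "".join(text_parts), usage


# Illustrative input only.
sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "",
    'data: {"choices": [], "usage": {"prompt_tokens": 5, "completion_tokens": 2, "total_tokens": 7}}',
    "",
    "data: [DONE]",
]
text, usage = read_stream(sample)
```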
Request processing
The proxy applies the following processing to your requests:
- Standard OpenAI defaults are applied for omitted parameters (e.g., temperature: 0.7)
- max_tokens is clamped to the model’s maximum output length
- Multimodal content (image inputs) is not supported; only text content parts are accepted
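The first two processing steps can be mirrored client-side to predict what the proxy will run. This is a sketch under stated assumptions, not the proxy's implementation: `preprocess` is a hypothetical function, the only default included is the `temperature: 0.7` example given above, and the 8192-token limit is an invented value for illustration.

```python
def preprocess(payload, model_max_output, defaults=None):
    """Apply defaults for omitted parameters, then clamp max_tokens."""
    if defaults is None:
        defaults = {"temperature": 0.7}  # the one default named on this page
    result = {**defaults, **payload}  # explicit values win over defaults
    if "max_tokens" in result:
        result["max_tokens"] = min(result["max_tokens"], model_max_output)
    return result


# 8192 is an illustrative limit, not a documented model maximum.
processed = preprocess(
    {"model": "m", "messages": [], "max_tokens": 100000},
    model_max_output=8192,
)
```

Here the omitted temperature is filled in and the oversized `max_tokens` is reduced to the assumed limit; a request that sets its own temperature keeps that value.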
Not yet supported
The following OpenAI features are not currently available:
- Responses API (/v1/responses)
- Embeddings API
- Images API
- Audio API (TTS, STT)
- Assistants API
- Fine-tuning API
- Vision (image inputs)
- JSON mode / structured outputs
These may be added in future releases. Check back for updates.