Chat Completions
OpenAI Chat Completions API compatibility endpoint for Cursor, Opencode, and other clients
Overview
OpenAI Chat Completions API compatibility endpoint. Use Concentrate as a drop-in replacement for OpenAI in any tool or SDK that supports configurable base URLs.Body
1- Option 1
- Option 2
- Option 3
- Option 4
Model identifier. Use /v1/models to list all available models. Supports canonical names (e.g. gpt-5.2, claude-opus-4-6), aliases, and provider-prefixed formats (e.g. openai/gpt-5.2). Use "auto" for automatic model selection.
-2 <= x <= 2none, auto x > 0x > 0Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard. Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
1 - 2 elementstext, audio 1 <= x <= 128-2 <= x <= 2The retention policy for the prompt cache. Set to 24h to enable extended prompt caching, which keeps cached prefixes active for longer, up to a maximum of 24 hours. Has no effect on explicit caching, which must be set through cache_control.
in-memory, in_memory, 24h Constrains effort on reasoning for reasoning models. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response. Not all models support all reasoning levels. If your requested reasoning level isn't supported by the model, Concentrate bumps it up, then down, to the closest reasoning level.
none, minimal, low, medium, high, xhigh Specifies the processing type used for serving the request. Determines the pricing and performance tier used to process the request. When not set, the default behavior is auto. Currently unsupported, but included for compatibility.
auto, default, flex, scale, priority 0 <= x <= 2Controls which (if any) tool is called by the model. none means the model will not call any tool. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
none, auto, required - Option 1
- Option 2
0 <= x <= 200 <= x <= 1Constrains the verbosity of the model's response. Lower values will result in more concise responses, while higher values will result in more verbose responses. Currently supported values are low, medium, and high.
low, medium, high Response
Default Response
x >= 0Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard. Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
Specifies the processing type used for serving the request. Determines the pricing and performance tier used to process the request. When not set, the default behavior is auto. Currently unsupported, but included for compatibility.
auto, default, flex, scale, priority