POST /v1/chat/completions/
cURL
curl --request POST \
  --url https://api.concentrate.ai/v1/chat/completions/ \
  --header 'Content-Type: application/json' \
  --data '
{
  "messages": [
    {
      "content": "<string>",
      "name": "<string>"
    }
  ],
  "model": "<string>",
  "frequency_penalty": 0,
  "functions": [
    {
      "name": "<string>",
      "parameters": {},
      "description": "<string>"
    }
  ],
  "logit_bias": {},
  "logprobs": true,
  "max_completion_tokens": 1,
  "max_tokens": 1,
  "metadata": {},
  "modalities": [],
  "n": 64,
  "parallel_tool_calls": true,
  "prediction": {
    "type": "content",
    "content": "<string>"
  },
  "presence_penalty": 0,
  "prompt_cache_key": "<string>",
  "response_format": "<unknown>",
  "safety_identifier": "<string>",
  "seed": "<string>",
  "stop": "<string>",
  "store": true,
  "stream": true,
  "stream_options": {
    "include_obfuscation": true,
    "include_usage": true
  },
  "temperature": 1,
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "<string>",
        "description": "<string>",
        "parameters": {},
        "strict": false
      }
    }
  ],
  "top_logprobs": 10,
  "top_p": 0.5,
  "user": "<string>",
  "web_search_options": {
    "user_location": {
      "type": "approximate",
      "approximate": {
        "timezone": "<string>",
        "country": "<string>",
        "city": "<string>",
        "region": "<string>"
      }
    }
  }
}
'
{
  "id": "<string>",
  "model": "<string>",
  "created": 1,
  "object": "<string>",
  "choices": [
    {
      "index": 1,
      "message": {
        "role": "assistant",
        "content": "<string>",
        "reasoning_content": "<string>",
        "reasoning": "<string>",
        "tool_calls": [
          {
            "type": "function",
            "id": "<string>",
            "function": {
              "name": "<string>",
              "arguments": "<string>"
            }
          }
        ]
      },
      "logprobs": {
        "content": [],
        "refusal": []
      }
    }
  ],
  "usage": {
    "completion_tokens": 1,
    "prompt_tokens": 1,
    "total_tokens": 1,
    "prompt_tokens_details": {
      "cached_tokens": 1,
      "audio_tokens": 1
    },
    "completion_tokens_details": {
      "reasoning_tokens": 1,
      "accepted_prediction_tokens": 1,
      "audio_tokens": 1,
      "rejected_prediction_tokens": 1
    }
  },
  "cost": {
    "total": 123
  },
  "metadata": {}
}
Beta Feature

The Chat Completions API is currently in beta. It provides OpenAI Chat Completions API compatibility for clients like Cursor, Opencode, and other tools that use the OpenAI format. For production use, we recommend using the Responses API instead.

Overview

This endpoint is compatible with the OpenAI Chat Completions API. Use Concentrate as a drop-in replacement for OpenAI in any tool or SDK that supports configurable base URLs.
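
As a rough illustration of the drop-in idea, the sketch below builds (but does not send) a request to the URL from the cURL example, using only Python's standard library. The `Authorization: Bearer` header, the `CONCENTRATE_API_KEY` variable name, and the `role` key on the message are assumptions borrowed from OpenAI-style clients; this page does not document them.

```python
import json
import os
import urllib.request

# Base URL taken from the cURL example on this page.
BASE_URL = "https://api.concentrate.ai/v1"


def build_chat_request(messages, model="auto", api_key=None):
    """Build (without sending) a POST request for /chat/completions/."""
    payload = {"model": model, "messages": messages}
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Assumption: bearer-token auth, as in OpenAI-style APIs.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions/",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


request = build_chat_request(
    # Assumption: OpenAI-style "role" key; the request schema here lists only content and name.
    [{"role": "user", "content": "Hello"}],
    api_key=os.environ.get("CONCENTRATE_API_KEY"),  # hypothetical variable name
)
# To actually send it: urllib.request.urlopen(request)
```

An SDK with a configurable base URL would point at the same `BASE_URL` value instead of building the request by hand.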

Body

application/json
messages
object[]
required
Minimum array length: 1
model
string
required

Model identifier. Use /v1/models to list all available models. Supports canonical names (e.g. gpt-5.2, claude-opus-4-6), aliases, and provider-prefixed formats (e.g. openai/gpt-5.2). Use "auto" for automatic model selection.
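
To make the identifier formats concrete, here is a small client-side sketch that separates a provider prefix from a model name. The helper `split_model_id` is hypothetical, and it assumes the prefix is everything before the first `/`, which this page implies but does not spell out.

```python
def split_model_id(model: str):
    """Split a model identifier into (provider, name).

    Returns (None, model) when there is no usable provider prefix.
    Assumption: the provider prefix is the text before the first "/".
    """
    provider, sep, name = model.partition("/")
    if sep and provider and name:
        return provider, name
    return None, model


prefixed = split_model_id("openai/gpt-5.2")
canonical = split_model_id("claude-opus-4-6")
automatic = split_model_id("auto")
```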

frequency_penalty
number | null
Required range: -2 <= x <= 2
function_call
Available options:
none,
auto
functions
object[] | null
logit_bias
object
logprobs
boolean | null
max_completion_tokens
number | null
Required range: x > 0
max_tokens
number | null
Required range: x > 0
metadata
object

A set of up to 16 key-value pairs that can be attached to an object. This is useful for storing additional information about the object in a structured format and for querying objects via the API or the dashboard. Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
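
The limits above can be checked before a request is sent. This is a minimal sketch, assuming 16 is an upper bound on the number of pairs; `validate_metadata` is a hypothetical helper, not part of any Concentrate SDK.

```python
MAX_PAIRS = 16        # assumption: treated as an upper bound
MAX_KEY_LEN = 64      # from the field description
MAX_VALUE_LEN = 512   # from the field description


def validate_metadata(metadata: dict) -> list:
    """Return a list of human-readable problems; an empty list means the metadata looks valid."""
    problems = []
    if len(metadata) > MAX_PAIRS:
        problems.append(f"too many pairs: {len(metadata)} > {MAX_PAIRS}")
    for key, value in metadata.items():
        if not isinstance(key, str) or len(key) > MAX_KEY_LEN:
            problems.append(f"bad key: {key!r}")
        if not isinstance(value, str) or len(value) > MAX_VALUE_LEN:
            problems.append(f"bad value for key: {key!r}")
    return problems


ok = validate_metadata({"ticket": "ABC-123", "team": "search"})
too_long = validate_metadata({"note": "x" * 513})
```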

modalities
enum<string>[] | null
Required array length: 1 - 2 elements
Available options:
text,
audio
n
integer | null
Required range: 1 <= x <= 128
parallel_tool_calls
boolean | null
prediction
object
presence_penalty
number | null
Required range: -2 <= x <= 2
prompt_cache_key
string | null
prompt_cache_retention
enum<string> | null

The retention policy for the prompt cache. Set to 24h to enable extended prompt caching, which keeps cached prefixes active for longer, up to a maximum of 24 hours. Has no effect on explicit caching, which must be set through cache_control.

Available options:
in-memory,
in_memory,
24h
reasoning_effort
enum<string> | null

Constrains effort on reasoning for reasoning models. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response. Not all models support all reasoning levels. If the model doesn't support the requested level, Concentrate adjusts it to the closest supported level, trying higher levels first, then lower ones.
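
The fallback described above could work roughly as follows. This is only one reading of the rule (nearest level wins, and a higher level is preferred over a lower one at the same distance); the actual server-side logic is not documented here, and `closest_supported` is a hypothetical name.

```python
# Ordered from least to most effort, as listed in the available options.
LEVELS = ["none", "minimal", "low", "medium", "high", "xhigh"]


def closest_supported(requested: str, supported: set) -> str:
    """Pick the nearest supported level, checking upward before downward at each distance."""
    if requested in supported:
        return requested
    start = LEVELS.index(requested)
    for distance in range(1, len(LEVELS)):
        up, down = start + distance, start - distance
        if up < len(LEVELS) and LEVELS[up] in supported:
            return LEVELS[up]
        if down >= 0 and LEVELS[down] in supported:
            return LEVELS[down]
    raise ValueError("model supports none of the known reasoning levels")


# Hypothetical model that supports only three levels.
model_levels = {"low", "medium", "high"}
bumped_up = closest_supported("minimal", model_levels)
bumped_down = closest_supported("xhigh", model_levels)
```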

Available options:
none,
minimal,
low,
medium,
high,
xhigh
response_format
safety_identifier
string | null
seed
string | null
service_tier
enum<string> | null

Specifies the processing type used for serving the request. Determines the pricing and performance tier used to process the request. When not set, the default behavior is auto. Currently unsupported, but included for compatibility.

Available options:
auto,
default,
flex,
scale,
priority
stop
store
boolean | null
stream
boolean | null
stream_options
object
temperature
number | null
Required range: 0 <= x <= 2
tool_choice

Controls which (if any) tool is called by the model. none means the model will not call any tool. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

Available options:
none,
auto,
required
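
For illustration, a request body that forces a tool call might be assembled like this. The tool shape follows the `tools` entry in the cURL example; the tool name `get_weather`, the JSON Schema used for `parameters`, and the `role` key are assumptions based on the OpenAI format.

```python
import json


def function_tool(name, description, parameters, strict=False):
    """Build one entry for the "tools" array, mirroring the shape in the cURL example."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": parameters,
            "strict": strict,
        },
    }


# Hypothetical tool; "parameters" is assumed to be a JSON Schema object.
weather_tool = function_tool(
    "get_weather",
    "Look up the current weather for a city.",
    {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
)

body = {
    "model": "auto",
    "messages": [{"role": "user", "content": "Weather in Paris?"}],
    "tools": [weather_tool],
    "tool_choice": "required",  # the model must call one or more tools
}
encoded = json.dumps(body)
```

Switching `tool_choice` to `auto` lets the model decide between answering directly and calling the tool, and `none` disables tool calls.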
tools
object[] | null
top_logprobs
integer | null
Required range: 0 <= x <= 20
top_p
number | null
Required range: 0 <= x <= 1
user
string | null
verbosity
enum<string> | null

Constrains the verbosity of the model's response. Lower values will result in more concise responses, while higher values will result in more verbose responses. Currently supported values are low, medium, and high.

Available options:
low,
medium,
high
web_search_options
object
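
The numeric ranges listed for the body fields can be checked client-side before sending a request. The sketch below encodes only the ranges stated on this page; `check_ranges` is a hypothetical helper, and the server remains the authority on what it accepts.

```python
# Inclusive (low, high) bounds copied from the field list above.
RANGES = {
    "frequency_penalty": (-2, 2),
    "presence_penalty": (-2, 2),
    "temperature": (0, 2),
    "top_p": (0, 1),
    "top_logprobs": (0, 20),
    "n": (1, 128),
}
# Fields documented as "x > 0".
POSITIVE = ("max_completion_tokens", "max_tokens")


def check_ranges(body: dict) -> list:
    """Return a list of range violations; null or missing optional fields are skipped."""
    problems = []
    if not body.get("messages"):
        problems.append("messages: at least one message is required")
    for field, (low, high) in RANGES.items():
        value = body.get(field)
        if value is not None and not low <= value <= high:
            problems.append(f"{field}: {value} is outside [{low}, {high}]")
    for field in POSITIVE:
        value = body.get(field)
        if value is not None and not value > 0:
            problems.append(f"{field}: {value} must be greater than 0")
    return problems


problems = check_ranges(
    {"messages": [{"content": "hi"}], "temperature": 2.5, "n": 64, "max_tokens": 0}
)
```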

Response

Default Response

id
string
required
model
string
required
created
number
required
Required range: x >= 0
object
string
required
choices
object[]
required
usage
object
required
cost
object
required
metadata
object

A set of up to 16 key-value pairs that can be attached to an object. This is useful for storing additional information about the object in a structured format and for querying objects via the API or the dashboard. Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.

service_tier
enum<string> | null

Specifies the processing type used for serving the request. Determines the pricing and performance tier used to process the request. When not set, the default behavior is auto. Currently unsupported, but included for compatibility.

Available options:
auto,
default,
flex,
scale,
priority
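
As a sketch of how a client might read the response fields above, the helper below pulls out the assistant text, any tool calls, the token total, and the cost. The sample values are invented for illustration, and decoding `arguments` as JSON is an assumption carried over from the OpenAI format; `summarize_response` is a hypothetical name.

```python
import json


def summarize_response(response: dict) -> dict:
    """Extract the commonly used parts of a chat completion response."""
    message = response["choices"][0]["message"]
    calls = []
    for call in message.get("tool_calls") or []:
        function = call["function"]
        try:
            # Assumption: arguments is a JSON-encoded string, as in the OpenAI format.
            arguments = json.loads(function["arguments"])
        except ValueError:
            arguments = function["arguments"]  # keep the raw string if it is not JSON
        calls.append((function["name"], arguments))
    return {
        "text": message.get("content"),
        "tool_calls": calls,
        "total_tokens": response["usage"]["total_tokens"],
        "cost_total": response["cost"]["total"],
    }


# Invented sample that follows the response shape shown on this page.
sample = {
    "id": "example-id",
    "model": "auto",
    "created": 1,
    "object": "chat.completion",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": None,
                "tool_calls": [
                    {
                        "type": "function",
                        "id": "call-1",
                        "function": {"name": "get_weather", "arguments": "{\"city\": \"Paris\"}"},
                    }
                ],
            },
        }
    ],
    "usage": {"completion_tokens": 5, "prompt_tokens": 7, "total_tokens": 12},
    "cost": {"total": 123},
}
summary = summarize_response(sample)
```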