Welcome to Concentrate AI

The Concentrate AI Responses API provides a unified interface for interacting with multiple AI model providers. Access GPT-5.4, Claude Opus 4.6, Gemini 3.1, and 120+ other models through a single, normalized API with automatic routing and credit tracking.

Quickstart

Get started with your first API request in minutes

API Reference

View detailed endpoint documentation

Claude Code Setup

Use Claude Code with any model in our Model Fortress

Cursor Setup

Use models on Concentrate in Cursor

Key Features

  • One API format works across all providers. No need to learn different request/response formats for OpenAI, Anthropic, Google, or other providers.
  • Use model: "auto" to automatically select the best model based on cost, performance, or latency. The API routes your requests based on real-time metrics.
  • Access models from OpenAI, Anthropic, Google Vertex, AWS Bedrock, Azure, xAI, Cohere, Mistral, Cloudflare, and Hugging Face through a single endpoint.
  • Enable real-time streaming via Server-Sent Events (SSE) for a responsive user experience. Streaming works consistently across all providers.
  • Built-in usage tracking and billing integration. Monitor token usage and costs, and set spending limits.
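
For the streaming feature, a client has to split the SSE byte stream into events. The following is a minimal sketch of a generic SSE line parser, following the standard `event:` / `data:` field rules. The event name `example.delta` and the payloads are hypothetical. This page does not document the actual event names or payload shapes, so check the Streaming page before relying on them.

```python
from typing import Iterable, Iterator, List, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_name, data) pairs from an iterable of SSE text lines.

    Generic SSE framing only: a blank line ends an event, lines starting
    with ':' are comments, and 'message' is the default event name.
    """
    event: str = "message"
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # Blank line: dispatch the buffered event, if it carried data.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment / keep-alive line
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)


# Hypothetical stream content, for illustration only.
SAMPLE_STREAM = [
    "event: example.delta\n",
    "data: Hel\n",
    "\n",
    ": keep-alive\n",
    "data: lo\n",
    "\n",
]
```

Parsing `SAMPLE_STREAM` yields two events: one named `example.delta` and one with the default name `message`.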

Authentication

All API requests require authentication using an API key. Get your API key from the Concentrate.ai dashboard. Include your API key in the Authorization header:
Authorization: Bearer YOUR_API_KEY
Keep your API key secure. Never share it publicly or commit it to version control.
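
One way to keep the key out of source control is to read it from the environment at runtime. This is a minimal sketch. The variable name `CONCENTRATE_API_KEY` and the helper `auth_headers` are assumptions made for illustration, not names defined by the API.

```python
import os


def auth_headers(env_var: str = "CONCENTRATE_API_KEY") -> dict[str, str]:
    """Build request headers from an API key stored in an environment variable.

    The default variable name is an assumption for this sketch.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set {env_var} before calling the API")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```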

Base URL

https://api.concentrate.ai/v1

Supported Models

The API supports 70+ models across multiple providers. OpenAI models include:
  • GPT-5.4, GPT-5.3 Codex, GPT-5.2, GPT-5.1
  • GPT-5.1 Codex Max, GPT-5.1 Codex Mini
  • GPT-5 Mini, GPT-5 Nano
  • GPT-4.1, GPT-4o, GPT-4o Mini
  • o1 (reasoning model)
Check the Model Fortress page in the app for complete listings and current rates.

Model Selection

You can specify models in three ways:
  1. Model name only: "gpt-5.4" - Automatic provider routing
  2. Provider prefix: "openai/gpt-5.4" - Specific provider
  3. Auto routing: "auto" - Let the API choose based on your criteria
{
  "model": "gpt-5.4",
  "input": "Hello, world!"
}
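
The three forms above differ only in whether a provider prefix is present. A small client-side helper can tell them apart before a request is sent. This is a sketch: `ModelRef` and `parse_model` are hypothetical names, and the server's own parsing rules may differ.

```python
from typing import NamedTuple, Optional


class ModelRef(NamedTuple):
    provider: Optional[str]  # None means the API chooses the provider
    name: str                # model name, or "auto" for auto routing


def parse_model(model: str) -> ModelRef:
    """Split a model string into an optional provider prefix and a name."""
    model = model.strip()
    if not model:
        raise ValueError("model must be a non-empty string")
    if "/" in model:
        provider, name = model.split("/", 1)
        if not provider or not name:
            raise ValueError(f"malformed model string: {model!r}")
        return ModelRef(provider, name)
    return ModelRef(None, model)
```

For example, `parse_model("openai/gpt-5.4")` separates the provider `openai` from the name `gpt-5.4`, while `parse_model("auto")` has no provider.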

Quick Example

curl https://api.concentrate.ai/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gpt-5.4",
    "input": "What is the capital of France?"
  }'
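
The same request can be sketched in Python with only the standard library. The endpoint, headers, and body mirror the curl command above. The function names `build_request` and `create_response` and the timeout value are assumptions made for this example.

```python
import json
import urllib.request

BASE_URL = "https://api.concentrate.ai/v1"


def build_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to the /responses endpoint."""
    body = json.dumps({"model": model, "input": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/responses",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def create_response(api_key: str, model: str, prompt: str, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_request(api_key, model, prompt)
    # urlopen raises urllib.error.HTTPError for non-2xx status codes.
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

Calling `create_response("YOUR_API_KEY", "gpt-5.4", "What is the capital of France?")` performs the network call. `build_request` on its own is useful for inspecting what would be sent.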

Response Format

All responses follow a normalized format regardless of provider:
{
  "id": "resp_abc123",
  "created_at": 1702934400,
  "status": "completed",
  "model": "openai/gpt-5.4",
  "output": [
    {
      "type": "message",
      "id": "msg_xyz789",
      "status": "completed",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "The capital of France is Paris."
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 12,
    "input_tokens_details": {
      "cached_tokens": 0
    },
    "output_tokens": 8,
    "output_tokens_details": {
      "reasoning_tokens": 0
    },
    "total_tokens": 20
  }
}
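
Because the text is nested under `output[].content[]`, a short helper is convenient for pulling it out. The sketch below assumes only the fields shown in the example response above. `output_text` is a hypothetical helper name, and other item or content types are skipped rather than interpreted.

```python
def output_text(response: dict) -> str:
    """Concatenate every output_text block from the message items."""
    parts = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # ignore item types this sketch does not know about
        for block in item.get("content", []):
            if block.get("type") == "output_text":
                parts.append(block.get("text", ""))
    return "".join(parts)


# Trimmed copy of the example response shown above.
SAMPLE = {
    "id": "resp_abc123",
    "status": "completed",
    "model": "openai/gpt-5.4",
    "output": [
        {
            "type": "message",
            "role": "assistant",
            "content": [
                {"type": "output_text", "text": "The capital of France is Paris."}
            ],
        }
    ],
    "usage": {"input_tokens": 12, "output_tokens": 8, "total_tokens": 20},
}

print(output_text(SAMPLE))  # prints: The capital of France is Paris.
```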

Error Handling

The API uses standard HTTP status codes:
| Status Code | Description |
| --- | --- |
| 200 | Successful request |
| 400 | Bad request - Invalid parameters |
| 401 | Unauthorized - Invalid API key |
| 402 | Payment required - Insufficient credits |
| 424 | Failed dependency - Provider error |
| 500 | Internal server error |
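
A client can map these codes to a short description and a retry decision. In the sketch below, the descriptions come from the table above. Treating 424 and 500 as retryable is an assumption of this example, not documented behavior, and `classify_status` is a hypothetical helper.

```python
STATUS_MEANINGS = {
    200: "Successful request",
    400: "Bad request - Invalid parameters",
    401: "Unauthorized - Invalid API key",
    402: "Payment required - Insufficient credits",
    424: "Failed dependency - Provider error",
    500: "Internal server error",
}

# Assumption: provider and server failures may be transient, while 4xx
# errors caused by the request itself will not succeed on retry.
RETRYABLE = frozenset({424, 500})


def classify_status(status: int) -> tuple[str, bool]:
    """Return (description, should_retry) for an HTTP status code."""
    meaning = STATUS_MEANINGS.get(status, "Unexpected status code")
    return meaning, status in RETRYABLE
```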

View Error Examples

See detailed error response formats and troubleshooting

Rate Limits

Rate limits are applied per API key and are based on your subscription tier. Limits are enforced using a token bucket algorithm with per-minute windows.
Contact support to increase your rate limits or discuss enterprise pricing.
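
To stay under a limit, a client can throttle itself with the same kind of algorithm. The following is a minimal client-side token bucket sketch. The capacity and refill rate are placeholders, since the actual per-tier limits are not listed here, and the server's implementation may differ in detail.

```python
import time
from typing import Callable


class TokenBucket:
    """Client-side token bucket that refills continuously at a per-minute rate."""

    def __init__(
        self,
        capacity: int,
        refill_per_minute: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.capacity = float(capacity)
        self.rate = refill_per_minute / 60.0  # tokens added per second
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.updated = clock()

    def try_acquire(self, cost: float = 1.0) -> bool:
        """Take `cost` tokens if available; return False without blocking otherwise."""
        now = self.clock()
        elapsed = now - self.updated
        # Refill for the time that has passed, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A caller would check `bucket.try_acquire()` before each request and wait briefly when it returns `False`. The injectable `clock` makes the class easy to test without sleeping.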

Next Steps

Quickstart Guide

Make your first API call

Create Response

Full endpoint documentation

Streaming

Learn about streaming responses

Auto Routing

Automatic model selection