Question 1

What is an LLM gateway?

Accepted Answer

It is the endpoint your app calls instead of calling each model provider directly. Requests still reach OpenAI, Anthropic, Google, Azure, and other providers, but Concentrate manages the provider work around them for you: spend commitments, a normalized API surface, keys, routes, fallbacks, and logs.

Your product still owns prompts, inputs, response parsing, and user-facing behavior. Concentrate handles the provider work around each request so your team is not maintaining separate provider integrations in every app.

Example

A support bot keeps calling one endpoint while you move its summary task from GPT to Claude or Gemini without changing the bot code.

Question 2

What changes in our code?

Accepted Answer

For most features, three things: API key, base URL, and model name. Keep the same prompt construction and response handling. Send the request to https://api.concentrate.ai/v1 and use CNAI_API_KEY instead of provider keys.

Do not move business logic into the gateway. Keep product decisions in your app. Move provider setup, failover decisions, spend limits, request logs, and redaction checks out of each feature.

Example

A TypeScript route can keep using an OpenAI-compatible client while swapping apiKey, baseURL, and model. The rest of the handler can stay boring.
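As a rough sketch of that swap, the snippet below builds the request by hand so the three changed values are visible. The base URL and the CNAI_API_KEY name come from the answer above. The /chat/completions path, the Bearer header, and the model name are assumptions based on the usual OpenAI-compatible convention, so check them against the API docs before relying on them.

```typescript
// Sketch only. Assumed, not confirmed: the /chat/completions path,
// Bearer authentication, and the "example-model" name.
interface GatewayConfig {
  apiKey: string;  // value of CNAI_API_KEY, replaces per-provider keys
  baseURL: string; // gateway base URL from the answer above
  model: string;   // hypothetical model name; use one your account exposes
}

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds the HTTP request. Prompt construction stays with the caller.
function buildChatRequest(cfg: GatewayConfig, messages: ChatMessage[]): HttpRequest {
  return {
    // Strip trailing slashes so the joined URL has exactly one separator.
    url: `${cfg.baseURL.replace(/\/+$/, "")}/chat/completions`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${cfg.apiKey}`,
    },
    body: JSON.stringify({ model: cfg.model, messages }),
  };
}

const request = buildChatRequest(
  { apiKey: "example-key", baseURL: "https://api.concentrate.ai/v1", model: "example-model" },
  [{ role: "user", content: "Summarize this ticket: printer is offline." }],
);
console.log(request.url);
```

In a real handler you would read the key from the CNAI_API_KEY environment variable and pass the result to fetch(request.url, { method, headers, body }). Everything after that call, including response parsing, stays as it was.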

Question 3

When should we add a gateway instead of calling providers directly?

Accepted Answer

Direct provider calls are fine for a prototype or one internal script. A gateway starts paying for itself when the same provider work shows up in multiple places: keys in several services, fallback branches in app code, separate provider logs, or finance asking which team spent the tokens.

Add one when spend starts to ramp up, when you need reliability across providers, or before a security review. Security teams usually ask where prompts go, which keys can call which models, how PII is handled, and whether request logs exist.

Example

If support, sales, and internal tools all use AI, they should not each own separate provider keys and logging code.

Question 4

Can we keep using the SDK we already have?

Accepted Answer

Usually yes. The point is to keep the app-side request familiar. Many teams keep their OpenAI-compatible client, change the base URL, and send the same kind of request through Concentrate. See our API docs for request shape and examples.

If your feature depends on provider-specific behavior, test that route before moving all traffic. Things like streaming shape, tool-call payloads, JSON mode, file inputs, and image inputs should be checked per workload instead of assumed.

Example

Move a low-risk summary endpoint first. Compare response shape, latency, token counts, and error handling before moving chat or agent traffic.
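One way to make the response-shape comparison concrete is a small check you run against responses from both paths. The sketch below assumes an OpenAI-style chat completion body (choices[0].message.content plus a usage object). That shape is an assumption for illustration, so replace the checks with the fields your app actually parses.

```typescript
// Sketch only. The field names follow the OpenAI-style chat completion
// shape, which is an assumption here, not a documented Concentrate contract.
function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Returns a list of problems; an empty list means the shape looks as expected.
function checkChatShape(resp: unknown): string[] {
  if (!isRecord(resp)) return ["response is not an object"];
  const problems: string[] = [];

  const choices = resp["choices"];
  if (!Array.isArray(choices) || choices.length === 0) {
    problems.push("choices is missing or empty");
  } else {
    const first: unknown = choices[0];
    const message = isRecord(first) ? first["message"] : undefined;
    if (!isRecord(message) || typeof message["content"] !== "string") {
      problems.push("choices[0].message.content is not a string");
    }
  }

  const usage = resp["usage"];
  if (!isRecord(usage)) {
    problems.push("usage is missing");
  } else {
    for (const key of ["prompt_tokens", "completion_tokens"]) {
      if (typeof usage[key] !== "number") problems.push(`usage.${key} is not a number`);
    }
  }
  return problems;
}

const sample = {
  choices: [{ message: { role: "assistant", content: "Printer offline, needs restart." } }],
  usage: { prompt_tokens: 12, completion_tokens: 6 },
};
console.log(checkChatShape(sample));
```

This check fits a plain text summary endpoint. Tool-call or streaming workloads return different payloads and need their own checks.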

Question 5

How do fallbacks work without adding branches in our app?

Accepted Answer

Your app sends one request. The fallback policy lives outside the feature code. If the primary provider route fails, times out, or hits a rule you define, Concentrate can send the request to another configured route.

This keeps application code from filling up with provider-specific try/catch branches. Engineers can change the backup model or provider route without deploying every service that calls AI.

Example

Keep the support-summary endpoint pointed at one model name while changing the backup route from Anthropic direct to Claude on Azure.
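To picture what moves out of your app, here is a conceptual sketch of ordered fallback: try each configured route in turn and return the first success. It is not Concentrate's implementation or configuration format. The Route type, the route names, and sendWithFallback are all hypothetical and exist only to illustrate the idea.

```typescript
// Conceptual sketch only. This is the kind of logic a gateway runs so that
// feature code does not have to. All names here are hypothetical.
interface Route {
  name: string;
}

interface Attempt {
  route: string;
  ok: boolean;
  error?: string;
}

// Tries routes in order and returns the first successful result,
// along with a record of every attempt for logging.
async function sendWithFallback<T>(
  routes: Route[],
  send: (route: Route) => Promise<T>,
): Promise<{ result: T; attempts: Attempt[] }> {
  const attempts: Attempt[] = [];
  for (const route of routes) {
    try {
      const result = await send(route);
      attempts.push({ route: route.name, ok: true });
      return { result, attempts };
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err);
      attempts.push({ route: route.name, ok: false, error });
    }
  }
  throw new Error(`all routes failed: ${attempts.map((a) => a.route).join(", ")}`);
}

// Simulated sender: the primary route fails, the backup succeeds.
const routes: Route[] = [{ name: "anthropic-direct" }, { name: "claude-on-azure" }];
const fakeSend = async (route: Route): Promise<string> => {
  if (route.name === "anthropic-direct") throw new Error("simulated timeout");
  return `summary from ${route.name}`;
};

sendWithFallback(routes, fakeSend).then(({ result, attempts }) => {
  console.log(result, attempts);
});
```

With a gateway in place, none of this lives in the feature. The app sends one request, and changing the backup route is a policy change rather than a deploy.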

Question 6

How does logging work?

Accepted Answer

Concentrate can log each request with model, provider route, key, team, status, latency, token counts, estimated spend, fallback use, and redaction results. Logging is opt-in per your policy, and we handle storage, so you do not run a separate logging pipeline for every provider.

Logs are normalized across providers in one place, so engineers can debug failures, compare routes, and review spend without stitching together OpenAI, Anthropic, and Google dashboards.

Example

When latency jumps, filter one log view by model and provider to see whether the request used the primary route, hit a fallback, or waited on a timeout.
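If you export log records and want to do the same triage in code, it might look like the sketch below. The RequestLog fields mirror the list in the answer above, but the exact field names and the export format are assumptions made up for this example.

```typescript
// Sketch only. Field names are hypothetical; they mirror the logged
// attributes listed above, not a documented export schema.
interface RequestLog {
  model: string;
  providerRoute: string;
  status: number;
  latencyMs: number;
  usedFallback: boolean;
  timedOut: boolean;
}

interface RouteSummary {
  count: number;
  fallbacks: number;
  timeouts: number;
  avgLatencyMs: number;
}

// Groups one model's requests by provider route so a latency jump can be
// traced to the primary route, a fallback, or timeouts.
function summarizeByRoute(logs: RequestLog[], model: string): Record<string, RouteSummary> {
  const summary: Record<string, RouteSummary> = {};
  for (const log of logs) {
    if (log.model !== model) continue;
    const s = summary[log.providerRoute] ?? { count: 0, fallbacks: 0, timeouts: 0, avgLatencyMs: 0 };
    // Running average, so no separate latency total is needed.
    s.avgLatencyMs = (s.avgLatencyMs * s.count + log.latencyMs) / (s.count + 1);
    s.count += 1;
    if (log.usedFallback) s.fallbacks += 1;
    if (log.timedOut) s.timeouts += 1;
    summary[log.providerRoute] = s;
  }
  return summary;
}

const logs: RequestLog[] = [
  { model: "summary-model", providerRoute: "primary", status: 200, latencyMs: 400, usedFallback: false, timedOut: false },
  { model: "summary-model", providerRoute: "backup", status: 200, latencyMs: 2600, usedFallback: true, timedOut: false },
  { model: "summary-model", providerRoute: "primary", status: 504, latencyMs: 10000, usedFallback: false, timedOut: true },
  { model: "chat-model", providerRoute: "primary", status: 200, latencyMs: 300, usedFallback: false, timedOut: false },
];
const routeSummary = summarizeByRoute(logs, "summary-model");
console.log(routeSummary);
```

In this made-up data, the high average on the primary route comes from a single timeout, which is the kind of answer the filtered log view is meant to give you quickly.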

Question 7

How should we test the migration?

Accepted Answer

Start with one feature and replay known prompts against the old provider path and the Concentrate path. Compare status codes, response fields, latency, token counts, and the output your users actually see.

Then test failure cases. Force a bad key, a provider timeout, a blocked input, and a fallback. The migration is only done when the boring path and the failure path both behave the way your app expects.

Example

Use a saved set of support tickets, run them through both paths, and review output diffs before changing traffic for the live support bot.
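A replay comparison can be as small as the sketch below, which takes results already captured from both paths and lists the differences for review. The RunResult fields, the 500 ms latency tolerance, and the sample tickets are assumptions chosen for illustration.

```typescript
// Sketch only. Types, the tolerance value, and the sample data are
// hypothetical; capture real results from your own replay runs.
interface RunResult {
  status: number;
  output: string;
  latencyMs: number;
  totalTokens: number;
}

interface ReplayCase {
  id: string;
  direct: RunResult;  // old provider path
  gateway: RunResult; // Concentrate path
}

interface Diff {
  id: string;
  field: string;
  direct: string;
  gateway: string;
}

// Lists per-field differences. Latency is only flagged beyond a tolerance,
// because small timing differences are expected between runs.
function diffRuns(cases: ReplayCase[], latencyToleranceMs = 500): Diff[] {
  const diffs: Diff[] = [];
  for (const c of cases) {
    const add = (field: string, direct: unknown, gateway: unknown): void => {
      diffs.push({ id: c.id, field, direct: String(direct), gateway: String(gateway) });
    };
    if (c.direct.status !== c.gateway.status) add("status", c.direct.status, c.gateway.status);
    if (c.direct.output !== c.gateway.output) add("output", c.direct.output, c.gateway.output);
    if (c.direct.totalTokens !== c.gateway.totalTokens) {
      add("totalTokens", c.direct.totalTokens, c.gateway.totalTokens);
    }
    if (Math.abs(c.direct.latencyMs - c.gateway.latencyMs) > latencyToleranceMs) {
      add("latencyMs", c.direct.latencyMs, c.gateway.latencyMs);
    }
  }
  return diffs;
}

const replayCases: ReplayCase[] = [
  {
    id: "ticket-1",
    direct: { status: 200, output: "Printer offline.", latencyMs: 800, totalTokens: 90 },
    gateway: { status: 200, output: "Printer offline.", latencyMs: 950, totalTokens: 90 },
  },
  {
    id: "ticket-2",
    direct: { status: 200, output: "Refund requested.", latencyMs: 700, totalTokens: 120 },
    gateway: { status: 200, output: "Customer requests a refund.", latencyMs: 1500, totalTokens: 124 },
  },
];
const replayDiffs = diffRuns(replayCases);
console.log(replayDiffs);
```

Model output is often not deterministic, so an output diff is a prompt for human review rather than an automatic failure. The same comparison works for the forced failure cases: record what each path returns for a bad key or a timeout and compare status codes.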

Instant access to every model

Implement in three easy steps

Change the base URL

Move provider logic out

Send your first request

What Concentrate manages for you

Provider keys

Provider-specific setup

Fallback logic

Logs and cost tracking

Frequently asked questions

Start with one app

