Feature

Routing

Send each workload to the model and provider path that fits its cost, latency, and quality needs, and change that path from Concentrate when prices or performance shift.

Routing table

Workloads mapped to model slugs, provider paths, and backup routes.

Route: Workload
Signal: Cost + latency
Change: No app deploy

Summary: Low-cost route. Routine work moves to cheaper models.

Agent: Stronger model. Harder work keeps a higher-quality route.

Fallback: Backup path. An approved route is ready if the primary fails.

New capabilities

What your team gains with Concentrate

Right model for each workload

Send cheap summaries to a small model and hard reasoning to a strong one, so you're not paying frontier prices for routine work or shipping weak output on the work that matters.

Change routes without a redeploy

Swap the provider or model behind a workload from config, so route logic lives in the gateway instead of being hard-coded into every app and CI pipeline.
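As a rough illustration of routes living in config rather than in app code, here is a minimal sketch. The config format, workload names, model slugs, and provider names are all hypothetical placeholders, not Concentrate's actual schema. The point is only that the application asks for a workload and the lookup table decides what sits behind it.

```python
import json

# Hypothetical route config. The workload names, model slugs, and provider
# names are placeholders for illustration, not real Concentrate identifiers.
ROUTES_JSON = """
{
  "summary": {"model": "small-model",  "provider": "provider-a"},
  "agent":   {"model": "strong-model", "provider": "provider-b"}
}
"""


def load_routes(text: str) -> dict:
    """Parse the workload-to-route mapping from config text."""
    return json.loads(text)


def resolve(routes: dict, workload: str) -> dict:
    """Return the route for a workload, failing loudly if none is configured."""
    try:
        return routes[workload]
    except KeyError:
        raise ValueError(f"no route configured for workload {workload!r}") from None


routes = load_routes(ROUTES_JSON)
print(resolve(routes, "summary"))
```

Moving the summary workload to a different model in this sketch means editing the config text; the code that calls resolve stays the same.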

Decide with your own numbers

Compare cost, latency, error rate, and output behavior per route using your real traffic, then move a workload when the data says so.
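To make the per-route comparison concrete, the sketch below aggregates request logs into average cost, average latency, and error rate for each route. The RequestLog fields and the sample values are assumptions made up for illustration; they do not reflect Concentrate's log schema or any real measurements.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class RequestLog:
    # Hypothetical log record; field names are assumptions for this sketch.
    route: str
    cost_usd: float
    latency_ms: float
    ok: bool


def summarize(logs: Iterable[RequestLog]) -> Dict[str, dict]:
    """Group logs by route and compute simple per-route averages."""
    grouped = defaultdict(list)
    for log in logs:
        grouped[log.route].append(log)

    summary = {}
    for route, rows in grouped.items():
        n = len(rows)
        summary[route] = {
            "requests": n,
            "avg_cost_usd": sum(r.cost_usd for r in rows) / n,
            "avg_latency_ms": sum(r.latency_ms for r in rows) / n,
            "error_rate": sum(1 for r in rows if not r.ok) / n,
        }
    return summary


# Made-up sample traffic, only to exercise the function.
sample_logs = [
    RequestLog("summary", 0.002, 100.0, True),
    RequestLog("summary", 0.004, 300.0, False),
    RequestLog("agent", 0.050, 2000.0, True),
]
report = summarize(sample_logs)
```

Averages are the simplest choice here; a real comparison would likely also look at latency percentiles and at output behavior, which this sketch does not attempt.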

Who Concentrate is designed for

How workload-based LLM routing controls cost and quality

LLM routing is the practice of sending each request to the model and provider path chosen for that specific workload, instead of pointing every call at one default provider. A summary, a coding agent, a chat reply, and a data-extraction job have different cost, latency, and quality needs — routing lets each one use the route that fits, and lets you change that route from config when prices or models shift.

Cost follows the work

Routine, high-volume calls go to cheaper models while harder work keeps a stronger route, so spend tracks the value of each task instead of the price of a single default model.

Routes live in the gateway

Apps send one request shape to Concentrate. The model and provider behind it change through config, not through edits to every service.

Backed by fallbacks

Pair routing with fallbacks so a workload has a backup path ready when its primary provider degrades or fails.
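One way to picture a backup path is an ordered list of routes tried in turn. The sketch below is a generic fallback loop, not Concentrate's implementation; ProviderError, call_with_fallback, and the two stub providers are hypothetical names introduced for the example.

```python
from typing import Callable, List, Sequence, Tuple


class ProviderError(Exception):
    """Raised by a provider call that fails (hypothetical error type)."""


def call_with_fallback(
    routes: Sequence[Tuple[str, Callable[[str], str]]],
    prompt: str,
) -> Tuple[str, str]:
    """Try each (name, call) route in order; return the first success."""
    errors: List[str] = []
    for name, call in routes:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            # Record the failure and move on to the next approved route.
            errors.append(f"{name}: {exc}")
    raise ProviderError("all routes failed: " + "; ".join(errors))


# Stub providers standing in for real model calls.
def failing_primary(prompt: str) -> str:
    raise ProviderError("primary unavailable")


def working_backup(prompt: str) -> str:
    return f"backup handled: {prompt}"


used, output = call_with_fallback(
    [("primary", failing_primary), ("backup", working_backup)],
    "summarize this ticket",
)
```

Only ProviderError triggers the fallback in this sketch, so programming errors in the caller still surface instead of being silently retried on another route.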

Measured, not guessed

Usage analytics and request logs show cost, latency, and errors per route, so route changes are based on your traffic rather than a vendor benchmark.

Feature basics

Frequently asked questions

What is LLM routing?

LLM routing sends each request to the model and provider path chosen for that workload, instead of sending all work to one default provider. It lets you match cost, latency, and quality to the job — and change the route later without touching app code.

How should teams choose routes?

Choose routes by workload using output quality, latency, token cost, provider availability, and failure behavior. Start from your own usage analytics and request logs so the decision reflects real traffic rather than a published benchmark.
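A simple way to apply those criteria is to filter candidate routes by hard limits and then take the cheapest one left. The sketch below assumes you already have per-route numbers from your own traffic and evaluations; the Candidate fields, thresholds, and sample values are hypothetical and chosen only to show the selection logic.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    # Hypothetical per-route measurements taken from your own traffic.
    model: str
    cost_per_1k_tokens: float
    p95_latency_ms: float
    error_rate: float
    quality: float  # 0..1 score from your own evaluation, not a benchmark


def pick_route(
    candidates: Iterable[Candidate],
    *,
    min_quality: float,
    max_latency_ms: float,
    max_error_rate: float,
) -> Optional[Candidate]:
    """Return the cheapest candidate that meets every limit, or None."""
    eligible = [
        c
        for c in candidates
        if c.quality >= min_quality
        and c.p95_latency_ms <= max_latency_ms
        and c.error_rate <= max_error_rate
    ]
    return min(eligible, key=lambda c: c.cost_per_1k_tokens, default=None)


# Made-up candidates, only to exercise the function.
candidates = [
    Candidate("small-model", 0.2, 400.0, 0.01, 0.70),
    Candidate("mid-model", 0.8, 900.0, 0.01, 0.85),
    Candidate("strong-model", 3.0, 2500.0, 0.02, 0.95),
]
summary_route = pick_route(
    candidates, min_quality=0.65, max_latency_ms=1000.0, max_error_rate=0.05
)
agent_route = pick_route(
    candidates, min_quality=0.90, max_latency_ms=5000.0, max_error_rate=0.05
)
```

With these sample numbers the looser summary limits select the small model and the stricter agent limits select the strong one, which mirrors the workload split described above.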

Does routing require changing application code?

No. Apps send one request shape to the gateway, and routing rules decide the model and provider behind it. Swapping a workload to a cheaper model or a different provider is a config change, not a redeploy of every service.
