

Overview

The Concentrate AI API supports multi-modal inputs, allowing you to send images alongside text to vision-capable models. Images can be provided as base64 data URIs or public URLs, and the API normalizes the format across all providers automatically.

Supported Models

The following models support image inputs:
Provider        Models
OpenAI          GPT-5.2, GPT-5.1, GPT-5, GPT-5 Mini, GPT-5 Nano, GPT-4.1, GPT-4o
Anthropic       Claude Opus 4.6, Claude Opus 4.5, Claude Sonnet 4.5, Claude Sonnet 4, Claude Sonnet 3.7
Google Vertex   Gemini 3 Pro, Gemini 3 Flash, Gemini 2.5 Pro, Gemini 2.5 Flash
Mistral         Pixtral Large, Mistral Medium, Mistral Small, Magistral Medium
Cohere          Command A Vision
AWS Bedrock     Claude models (via Bedrock), OpenAI models (via Bedrock)
Azure           GPT-5, GPT-4o, Claude models (via Azure)
Z.AI            GLM-4.6V, GLM-4.5V
Use the Get Model endpoint to check if a specific model supports image inputs by looking for the image_processing field in the provider configuration.
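As a sketch of that check, the helper below inspects a parsed Get Model response for the `image_processing` field. The response shape (a `providers` list on the model object) is an assumption for illustration; only the field name comes from the documentation above.

```python
# Sketch: detect vision support from a parsed Get Model response.
# ASSUMPTION: the response carries a "providers" list of configuration
# dicts; only the "image_processing" field name is documented.
def supports_images(model_info: dict) -> bool:
    """Return True if any provider config advertises image processing."""
    providers = model_info.get("providers", [])
    return any(p.get("image_processing") for p in providers)

example = {
    "id": "gpt-5.2",
    "providers": [{"name": "openai", "image_processing": True}],
}
print(supports_images(example))  # True
```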

Sending Images

Images are sent as content blocks within the input array. Use the input_image content type alongside input_text blocks.

Image Input Format

{
  "type": "input_image",
  "image_url": "data:image/png;base64,iVBOR...",
  "detail": "auto"
}
Properties:
  • type (required): "input_image"
  • image_url (required): Base64 data URI or public HTTPS URL
  • detail (optional): "low", "high", or "auto" (default: "auto")
    • low: Faster processing, lower token cost, suitable for simple images
    • high: Full resolution analysis, higher token cost, better for detailed images
    • auto: Let the model decide based on the image
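A small builder can enforce these properties client-side before a request is sent. This is an illustrative helper, not part of any SDK; it simply mirrors the block shape and the three allowed `detail` values documented above.

```python
def image_block(image_url: str, detail: str = "auto") -> dict:
    """Build an input_image content block with a validated detail value."""
    if detail not in ("low", "high", "auto"):
        raise ValueError("detail must be 'low', 'high', or 'auto'")
    return {"type": "input_image", "image_url": image_url, "detail": detail}

block = image_block("https://example.com/photo.jpg", detail="high")
```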

detail Parameter Support

The detail parameter controls image resolution for token estimation, but provider support varies depending on the underlying API format:
Provider        Accepts detail   Notes
OpenAI          Yes              Passed through via the Responses API format
xAI             Yes              Passed through via the Responses API format
Cohere          Yes              Explicitly mapped to the Cohere API
Azure           Depends          Forwarded for OpenAI-format models; not applicable for Anthropic-format models
Anthropic       N/A              Anthropic's API does not have a detail parameter
Google Vertex   N/A              Gemini's API does not have a detail parameter
AWS Bedrock     N/A              Bedrock's native image format does not have a detail parameter
Mistral         N/A              Mistral's Conversations API does not have a detail parameter
Z.AI            N/A              Z.AI does not have a detail parameter
For providers marked N/A, their native APIs have no equivalent concept — the provider determines image processing resolution automatically.
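The gateway performs this normalization for you, but the mapping can be sketched as a filter that drops `detail` for providers with no equivalent concept. The provider-name strings below are illustrative assumptions, not documented identifiers.

```python
# Providers whose APIs accept a detail hint, per the table above.
# ASSUMPTION: lowercase provider names are used purely for illustration.
DETAIL_AWARE = {"openai", "xai", "cohere", "azure"}

def normalize_block(block: dict, provider: str) -> dict:
    """Drop `detail` when the target provider has no equivalent concept."""
    if provider.lower() not in DETAIL_AWARE:
        return {k: v for k, v in block.items() if k != "detail"}
    return block
```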

Examples

Base64 Image

curl https://api.concentrate.ai/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gpt-5.2",
    "input": [
      {
        "role": "user",
        "content": [
          {
            "type": "input_text",
            "text": "What do you see in this image?"
          },
          {
            "type": "input_image",
            "image_url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUg..."
          }
        ]
      }
    ]
  }'
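To produce the `image_url` value for the request above, encode the raw image bytes as a base64 data URI. A minimal sketch using only the standard library:

```python
import base64

def to_data_uri(image_bytes: bytes, mime: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URI for input_image."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{b64}"

# The 8-byte PNG signature stands in for real image data here.
uri = to_data_uri(b"\x89PNG\r\n\x1a\n")
print(uri[:22])  # data:image/png;base64,
```

In practice you would read the bytes from a file (`open(path, "rb").read()`) and pass the resulting URI as the `image_url` field.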

URL Image

You can also pass a publicly accessible image URL:
{
  "model": "claude-sonnet-4-5",
  "input": [
    {
      "role": "user",
      "content": [
        {
          "type": "input_text",
          "text": "Describe this image in detail."
        },
        {
          "type": "input_image",
          "image_url": "https://example.com/photo.jpg"
        }
      ]
    }
  ]
}
Image URLs must be publicly accessible. Private URLs and URLs requiring authentication may be blocked.
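A lightweight client-side check can reject obviously unusable references before the request is sent. Note that no local check can verify public accessibility; this sketch only validates the scheme, per the accepted forms above.

```python
def check_image_url(url: str) -> None:
    """Reject image references that are neither HTTP(S) URLs nor data URIs."""
    if url.startswith("data:image/"):
        return
    if not url.startswith(("http://", "https://")):
        raise ValueError("image_url must be an HTTP(S) URL or a data URI")

check_image_url("https://example.com/photo.jpg")  # passes silently
```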

Multiple Images

Send multiple images in a single request:
{
  "model": "gemini-2.5-pro",
  "input": [
    {
      "role": "user",
      "content": [
        {
          "type": "input_text",
          "text": "Compare these two images and describe the differences."
        },
        {
          "type": "input_image",
          "image_url": "https://example.com/image1.jpg"
        },
        {
          "type": "input_image",
          "image_url": "https://example.com/image2.jpg"
        }
      ]
    }
  ]
}
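Requests like the one above can be assembled programmatically: one `input_text` block followed by one `input_image` block per image. This illustrative builder mirrors the documented JSON shape.

```python
def compare_request(model: str, prompt: str, urls: list[str]) -> dict:
    """Build a single user turn with one text block and N image blocks."""
    content = [{"type": "input_text", "text": prompt}]
    content += [{"type": "input_image", "image_url": u} for u in urls]
    return {"model": model, "input": [{"role": "user", "content": content}]}

req = compare_request(
    "gemini-2.5-pro",
    "Compare these two images and describe the differences.",
    ["https://example.com/image1.jpg", "https://example.com/image2.jpg"],
)
```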

With Streaming

Multi-modal requests work with streaming:
{
  "model": "gpt-5.2",
  "stream": true,
  "input": [
    {
      "role": "user",
      "content": [
        {
          "type": "input_text",
          "text": "Describe this image."
        },
        {
          "type": "input_image",
          "image_url": "data:image/jpeg;base64,/9j/4AAQ...",
          "detail": "high"
        }
      ]
    }
  ]
}

Supported Formats

Format   MIME type     Data URI prefix
PNG      image/png     data:image/png;base64,
JPEG     image/jpeg    data:image/jpeg;base64,
GIF      image/gif     data:image/gif;base64,
WebP     image/webp    data:image/webp;base64,
Some providers support only a subset of these formats. Check the model info for the specific image types each model accepts.
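When the image source is untrusted, the MIME type can be sniffed from the file's magic bytes rather than taken from its extension. This sketch covers the four documented formats; it is a client-side convenience, not something the API requires.

```python
# Magic-byte prefixes for the four documented formats.
MAGIC = {
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"\xff\xd8\xff": "image/jpeg",
    b"GIF87a": "image/gif",
    b"GIF89a": "image/gif",
}

def sniff_mime(data: bytes) -> str:
    """Map raw bytes to a supported MIME type, or raise ValueError."""
    # WebP is a RIFF container: "RIFF" + 4-byte size + "WEBP".
    if data.startswith(b"RIFF") and data[8:12] == b"WEBP":
        return "image/webp"
    for magic, mime in MAGIC.items():
        if data.startswith(magic):
            return mime
    raise ValueError("unsupported image format (expected PNG, JPEG, GIF, or WebP)")
```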

Limits

Image limits vary by provider and model. Exceeding these limits will return a 400 error.
Provider                      Max images per request   Max total size
OpenAI (GPT-5, GPT-4o)        500                      50 MB
Anthropic (Claude)            100                      32 MB
Google Vertex (Gemini 3)      900                      7 MB
Google Vertex (Gemini 2.5)    3,000                    7 MB
Cohere (Command A Vision)     20                       20 MB
Mistral (Pixtral Large)       8                        10 MB
Image tokens are calculated using per-provider algorithms. Higher-resolution images consume more tokens. For providers that support it, consider setting detail to “low” to reduce costs.
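A pre-flight check against these limits avoids a round trip that would end in a 400. The sketch below hard-codes two rows from the table above for illustration; a real client would cover every provider it targets.

```python
# Illustrative (count, total bytes) limits from the table above.
# ASSUMPTION: lowercase provider keys are for illustration only.
LIMITS = {
    "anthropic": (100, 32 * 1024**2),
    "mistral": (8, 10 * 1024**2),
}

def preflight(provider: str, images: list[bytes]) -> None:
    """Raise before the API would return a 400 for count or total size."""
    max_count, max_bytes = LIMITS[provider]
    if len(images) > max_count:
        raise ValueError(f"too many images: {len(images)} > {max_count}")
    if sum(len(i) for i in images) > max_bytes:
        raise ValueError("total image size exceeds limit")
```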

Error Handling

Common errors when using image inputs:
Error                                 Cause
Model does not support image inputs   The selected model does not have vision capabilities
Too many images                       The request exceeds the model's max_images_per_request limit
Image size exceeds limit              Total image data exceeds the model's max_total_size limit
Invalid image format                  The image is not PNG, JPEG, GIF, or WebP
Invalid image URL                     The URL is not a valid HTTP/HTTPS URL or data URI

Related Pages

  • Request Parameters: complete parameter reference
  • Streaming: use multi-modal with streaming
  • Create Response: main endpoint documentation
  • List Models: check model capabilities