Overview

Tool calling (also known as function calling) enables AI models to request the execution of external functions and tools. Instead of trying to generate structured data directly, the model indicates that it wants to call a function with specific parameters; your code executes the function and sends the result back to continue the conversation. Use tool calling when you need to:
  • Access real-time data (weather, stock prices, databases)
  • Perform calculations or data processing
  • Interact with external APIs and services
  • Execute actions on behalf of users (send emails, create calendar events)
  • Retrieve information the model doesn’t have in its training data

Prerequisites

Before using tool calling, ensure you have:
  • A Concentrate AI API key (get one here)
  • A model that supports tool calling (refer to the models endpoint to see models that support tool calling)
  • An understanding of JSON Schema for defining function parameters

Quick Start

Here’s a basic example of tool calling with a weather function:
curl https://api.concentrate.ai/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gpt-5.2",
    "input": "What is the weather in San Francisco?",
    "tools": [
      {
        "type": "function",
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "City and state, e.g. San Francisco, CA"
            },
            "unit": {
              "type": "string",
              "enum": ["celsius", "fahrenheit"],
              "description": "Temperature unit"
            }
          },
          "required": ["location"],
          "additionalProperties": false
        }
      }
    ]
  }'

Defining Tools

Each tool is defined using the FunctionTool schema. Here are the properties:
| Property | Type | Required | Description |
| --- | --- | --- | --- |
| type | string | Yes | Must be "function" |
| name | string | Yes | Function name (alphanumeric, underscores, dots, hyphens only) |
| description | string | No | Clear description of what the function does |
| parameters | object | Yes | JSON Schema defining the function's parameters |
| strict | boolean | No | Enable strict schema validation (default: true) |
| cache_control | object | No | Cache this tool definition for better performance |

Tool Name Pattern

Tool names must match the pattern: ^[a-zA-Z0-9_.-]+$
Valid names:
  • get_weather
  • send_email
  • user.get_profile
  • calculate-sum
Invalid names:
  • get weather (contains space)
  • send@email (contains @)
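If you build tool definitions dynamically, a quick client-side check against this pattern catches invalid names before the API rejects the request. A minimal sketch in Python (the helper name is illustrative):
import re

TOOL_NAME_PATTERN = re.compile(r"^[a-zA-Z0-9_.-]+$")

def is_valid_tool_name(name: str) -> bool:
    """Return True if the name matches the allowed tool-name pattern."""
    return bool(TOOL_NAME_PATTERN.match(name))

assert is_valid_tool_name("get_weather")
assert is_valid_tool_name("user.get_profile")
assert not is_valid_tool_name("get weather")  # contains a space
assert not is_valid_tool_name("send@email")   # contains @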

Parameters as JSON Schema

The parameters field uses JSON Schema to define the function’s inputs. This helps the model understand what data to provide.
{
  "parameters": {
    "type": "object",
    "properties": {
      "location": {
        "type": "string",
        "description": "City and state, e.g. San Francisco, CA"
      },
      "unit": {
        "type": "string",
        "enum": ["celsius", "fahrenheit"],
        "description": "Temperature unit"
      },
      "include_forecast": {
        "type": "boolean",
        "description": "Include 5-day forecast"
      }
    },
    "required": ["location"]
  }
}
Best practices for parameters:
  • Always include description fields to help the model understand the purpose
  • Use enum for fields with fixed options
  • Mark essential parameters in the required array
  • Keep schemas simple and focused on the task

Strict Mode

strict defaults to true, which ensures the model's output follows your schema exactly.
When strict is true, your parameters schema must include "additionalProperties": false. Omitting it is a common mistake and results in a 400 Bad Request error.
{
  "type": "function",
  "name": "create_user",
  "parameters": {
    "type": "object",
    "properties": {
      "name": { "type": "string" },
      "email": { "type": "string" }
    },
    "required": ["name", "email"],
    "additionalProperties": false
  },
  "strict": true
}
With strict: true, the model will only include name and email in its output—no additional properties. Set strict: false if you don’t want to enforce additionalProperties.

Cache Control

Cache tool definitions to improve performance and reduce costs:
{
  "type": "function",
  "name": "get_weather",
  "description": "Get current weather for a location",
  "parameters": { ... },
  "cache_control": {
    "type": "ephemeral",
    "ttl": "5m"
  }
}
See Prompt Caching for more details.

Tool Choice Modes

Control which tools the model can use with the tool_choice parameter:

None - Disable Tool Calling

{
  "tool_choice": "none"
}
The model will respond normally without calling any tools, even if they’re available.

Auto - Let Model Decide (Default)

{
  "tool_choice": "auto"
}
The model decides whether to call a tool based on the user’s input. This is the default behavior.

Required - Force Tool Use

{
  "tool_choice": "required"
}
The model must call at least one tool before responding. Useful when you always need a function call.

Specific Tool - Force Specific Function

{
  "tool_choice": {
    "type": "function",
    "name": "get_weather"
  }
}
The model must call the specified tool. Use this when you know exactly which function should be called.

Allowed Tools - Limit to Subset

{
  "tool_choice": {
    "type": "allowed_tools",
    "mode": "auto",
    "tools": [
      { "type": "function", "name": "get_weather" },
      { "type": "function", "name": "get_forecast" }
    ]
  }
}
Restrict the model to only use specific tools from your full tool list. The mode can be:
  • "auto" - Model can use these tools or skip them
  • "required" - Model must use at least one of these tools

Parallel Tool Calls

Enable the model to call multiple tools simultaneously:
{
  "parallel_tool_calls": true
}
When to enable:
  • Multiple independent operations (e.g., get weather for multiple cities)
  • No dependencies between tool calls
  • Want faster results through parallelization
When to disable:
  • Sequential operations where order matters
  • Tool calls depend on each other’s results
  • Want more predictable, step-by-step execution
Example of parallel tool calls:
# User asks: "What's the weather in SF, NYC, and London?"
# Model can call get_weather three times in parallel:
{
  "output": [
    {"type": "function_call", "call_id": "call_1", "name": "get_weather", "arguments": '{"location": "San Francisco, CA"}'},
    {"type": "function_call", "call_id": "call_2", "name": "get_weather", "arguments": '{"location": "New York, NY"}'},
    {"type": "function_call", "call_id": "call_3", "name": "get_weather", "arguments": '{"location": "London, UK"}'}
  ]
}
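Your application is then responsible for executing each call and returning one function_call_output per call_id. A minimal sketch in Python, assuming response_data is a parsed response like the one above and get_weather is your local implementation (the helper name is illustrative):
import json

def handle_parallel_calls(response_data, get_weather):
    """Execute every function_call in the output and collect one result per call."""
    outputs = []
    for item in response_data["output"]:
        if item["type"] != "function_call":
            continue
        args = json.loads(item["arguments"])   # each call carries its own arguments
        result = get_weather(**args)           # calls are independent, so order doesn't matter
        outputs.append({
            "type": "function_call_output",
            "call_id": item["call_id"],        # pair each result with its originating call
            "output": json.dumps(result)
        })
    return outputs

The collected outputs are sent back alongside the original function_call items, following the same pattern as the multi-turn example below.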

Multi-Turn Workflow

Tool calling typically follows this pattern:
  1. User sends a message with tools available
  2. Model responds with function_call indicating it wants to use a tool
  3. You execute the function in your application
  4. You send the result back with function_call_output
  5. Model provides final response using the tool result
Here’s a complete example:
import requests
import json

API_URL = "https://api.concentrate.ai/v1/responses"
HEADERS = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

# Define your tools
tools = [{
    "type": "function",
    "name": "get_weather",
    "description": "Get current weather for a location",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
        },
        "required": ["location"]
    }
}]

# Step 1: Initial request
response1 = requests.post(API_URL, headers=HEADERS, json={
    "model": "gpt-5.2",
    "input": "What's the weather like in San Francisco?",
    "tools": tools
})

data1 = response1.json()
print("Step 1 - Model response:", data1)

# Step 2: Check if model wants to call a tool
function_call = None
for item in data1["output"]:
    if item["type"] == "function_call":
        function_call = item
        break

if function_call:
    # Step 3: Execute the function
    args = json.loads(function_call["arguments"])

    # Your actual function implementation
    def get_weather(location, unit="fahrenheit"):
        # In reality, you'd call a weather API
        return {
            "location": location,
            "temperature": 72,
            "unit": unit,
            "conditions": "Sunny"
        }

    result = get_weather(**args)
    print("Step 2 - Function result:", result)

    # Step 4: Send result back to model
    response2 = requests.post(API_URL, headers=HEADERS, json={
        "model": "gpt-5.2",
        "input": [
            {"role": "user", "content": "What's the weather like in San Francisco?"},
            function_call,  # Include the function call
            {
                "type": "function_call_output",
                "call_id": function_call["call_id"],
                "output": json.dumps(result)
            }
        ],
        "tools": tools
    })

    data2 = response2.json()
    print("Step 3 - Final response:", data2)

    # Extract the assistant's final message
    for item in data2["output"]:
        if item["type"] == "message" and item["role"] == "assistant":
            print("\nAssistant:", item["content"][0]["text"])

Streaming with Tools

When streaming is enabled, tool calls are delivered incrementally:

Event Types

  • response.function_call_arguments.delta - Incremental tool arguments as they’re generated
  • response.function_call_arguments.done - Complete tool call with final arguments

Example

TypeScript - Streaming Tool Calls
const response = await fetch("https://api.concentrate.ai/v1/responses", {
  method: "POST",
  headers: {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    model: "gpt-5.2",
    input: "What's the weather in San Francisco?",
    tools: [{
      type: "function",
      name: "get_weather",
      description: "Get current weather",
      parameters: {
        type: "object",
        properties: {
          location: { type: "string" }
        },
        required: ["location"]
      }
    }],
    stream: true
  })
});

const reader = response.body!.getReader();
const decoder = new TextDecoder();

let functionCallBuffer = "";

while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  const chunk = decoder.decode(value);
  const lines = chunk.split("\n").filter(line => line.trim());

  for (const line of lines) {
    if (line.startsWith("data: ")) {
      const data = JSON.parse(line.slice(6));

      if (data.event === "response.function_call_arguments.delta") {
        functionCallBuffer += data.data.arguments;
        console.log("Building arguments:", functionCallBuffer);
      }

      if (data.event === "response.function_call_arguments.done") {
        console.log("Complete function call:", data.data);
        // Execute your function here
      }
    }
  }
}
See Streaming for more details on streaming events.

Advanced Patterns

Error Handling

When a tool execution fails, use the is_error flag:
{
  "type": "function_call_output",
  "call_id": "call_abc123",
  "output": json.dumps({"error": "Weather API unavailable"}),
  "is_error": True
}
The model will receive the error and can respond appropriately (e.g., apologizing or suggesting alternatives).
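One way to produce this shape is to wrap tool execution in a try/except and set is_error only on failure. A minimal sketch continuing the Python workflow above (the run_tool helper is illustrative):
import json

def run_tool(function_call, get_weather):
    """Execute a function_call and return a function_call_output, flagging failures."""
    call_id = function_call["call_id"]
    try:
        args = json.loads(function_call["arguments"])
        result = get_weather(**args)
        return {
            "type": "function_call_output",
            "call_id": call_id,
            "output": json.dumps(result)
        }
    except Exception as exc:
        # Report the failure instead of raising, so the model can respond to it
        return {
            "type": "function_call_output",
            "call_id": call_id,
            "output": json.dumps({"error": str(exc)}),
            "is_error": True
        }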

Caching Tool Definitions

For frequently used tools, cache their definitions:
{
  "tools": [
    {
      "type": "function",
      "name": "get_weather",
      "description": "Get current weather for a location",
      "parameters": { ... },
      "cache_control": {
        "type": "ephemeral",
        "ttl": "1h"
      }
    }
  ]
}
This reduces costs and improves latency for repeated requests with the same tools.

Provider Support

Tool calling is supported across all major providers:
| Provider | Supported | Notes |
| --- | --- | --- |
| OpenAI | ✅ Yes | Full support including parallel calls |
| Anthropic | ✅ Yes | Full support with Claude models |
| Azure OpenAI | ✅ Yes | Same as OpenAI |
| AWS Bedrock | ✅ Yes | Supported on compatible models |
| Google Vertex AI | ✅ Yes | Supported on Gemini models |
| Cohere | ✅ Yes | Supported on Command models |
| Mistral | ✅ Yes | Supported on recent models |
Check specific model capabilities using the List Models endpoint.

Best Practices

Schema Design

  1. Keep it simple - Only include necessary parameters
  2. Clear descriptions - Help the model understand what each field does
  3. Use enums - For fields with fixed options
  4. Validate inputs - Always validate tool arguments before execution (see the sketch after this list)
  5. Handle errors gracefully - Return meaningful error messages
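For point 4, you can reuse the same JSON Schema you already send in parameters to check the model's arguments before executing anything. A minimal sketch using the third-party jsonschema package (an assumption; any schema validator works):
import json
from jsonschema import ValidationError, validate

def validate_arguments(function_call, parameters_schema):
    """Validate model-generated arguments against the tool's parameters schema."""
    args = json.loads(function_call["arguments"])
    try:
        validate(instance=args, schema=parameters_schema)
    except ValidationError as exc:
        # Reject the call rather than executing with malformed arguments
        raise ValueError(f"Invalid arguments for {function_call['name']}: {exc.message}")
    return args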

Security Considerations

  1. Validate all inputs - Never trust model-generated arguments blindly
  2. Limit permissions - Tools should have minimal necessary permissions
  3. Rate limiting - Implement rate limits on expensive operations
  4. Audit logs - Log all tool executions for monitoring
  5. Sanitize outputs - Clean tool results before sending back to model

Performance Optimization

  1. Cache definitions - Use cache_control for frequently used tools
  2. Parallel calls - Enable when tools are independent
  3. Lazy loading - Only load tools when needed
  4. Timeout handling - Set reasonable timeouts for tool execution
  5. Batch operations - Combine multiple related calls when possible

Troubleshooting

Model doesn’t call the tool

Problem: Model responds with text instead of calling the tool.
Solutions:
  • Make the tool description clearer and more specific
  • Use tool_choice: "required" to force tool usage
  • Ensure the user’s input clearly requires the tool’s functionality
  • Check if the model supports tool calling

Invalid arguments

Problem: Model provides incorrect or incomplete arguments.
Solutions:
  • Add detailed descriptions to parameter fields
  • Use strict: true to enforce schema validation
  • Include examples in parameter descriptions
  • Mark essential fields as required

Tool not found error

Problem: "Tool 'xyz' not found" error. Solutions:
  • Verify tool name matches exactly (case-sensitive)
  • Ensure tool name follows the pattern: ^[a-zA-Z0-9_.-]+$
  • Check that tools array is included in the request

Streaming issues

Problem: Tool calls not appearing in streaming mode.
Solutions:
  • Listen for response.function_call_arguments.delta and .done events
  • Buffer the arguments until the .done event
  • Check that stream: true is set in the request

Related Pages

  • Create Response - Main API endpoint for generating responses
  • Request Parameters - Complete parameter reference
  • Streaming - Real-time response streaming
  • Prompt Caching - Cache tool definitions for better performance