
Model Fortress

Compare pricing, context windows, and capabilities across every major LLM provider

Claude Opus 4.7
claude-opus-4-7
Anthropic's most capable generally available model. Step-change improvement in agentic coding over Opus 4.6, with a new tokenizer and 1M context window.
Anthropic | 1.0M ctx | 128K max out | $5.00/M in | $25.00/M out
Claude Opus 4.6
claude-opus-4-6
Anthropic's previous Opus model. A leader on agentic coding evaluations, Terminal Bench 2.0, and Humanity's Last Exam.
Anthropic | 200K ctx | 128K max out | $5.00/M in | $25.00/M out
Claude Opus 4.5
claude-opus-4-5
Anthropic's former flagship model combining maximum intelligence with practical performance.
Anthropic | 200K ctx | 64K max out | $5.00/M in | $25.00/M out
Claude Sonnet 4.6
claude-sonnet-4-6
Anthropic's latest Sonnet model with hybrid reasoning, matching near-flagship performance at a fraction of the cost.
Anthropic | 200K ctx | 64K max out | $3.00/M in | $15.00/M out
Claude Sonnet 4.5
claude-sonnet-4-5
Anthropic's previous Sonnet model, featuring exceptional performance on coding, analysis, and instruction following.
Anthropic | 200K ctx | 64K max out | $3.00/M in | $15.00/M out
Claude Haiku 4.5
claude-haiku-4-5
Anthropic's fastest model with near-frontier intelligence, ideal for high-throughput applications.
Anthropic | 200K ctx | 64K max out | $1.00/M in | $5.00/M out
Claude Opus 4.1
claude-opus-4-1
Anthropic's previous flagship model with maximum intelligence. Legacy model - consider using Opus 4.5 for new projects.
Anthropic | 200K ctx | 32K max out | $15.00/M in | $75.00/M out
Claude Opus 4
claude-opus-4
Anthropic's original Opus 4 model. Legacy model - consider using Opus 4.5 for new projects.
Anthropic | 200K ctx | 32K max out | $15.00/M in | $75.00/M out
Claude Sonnet 4
claude-sonnet-4
Anthropic's balanced model offering excellent performance across coding, analysis, and complex tasks. A great balance of speed, intelligence, and cost.
Anthropic | 200K ctx | 64K max out | $3.00/M in | $15.00/M out
GPT 5.5
gpt-5.5
OpenAI's GPT-5.5 is a frontier model with a 1M+ context window, reasoning, and broad tool support including web search, file search, image generation, code interpreter, hosted shell, computer use, and MCP. Knowledge cutoff December 1, 2025.
OpenAI | 1.1M ctx | 128K max out | $5.00/M in | $30.00/M out
GPT 5.4
gpt-5.4
OpenAI's frontier model with a 1M+ context window, improved reasoning with xhigh effort support, and enhanced capabilities for coding, agentic tasks, and computer use.
OpenAI | 1.1M ctx | 128K max out | $2.50/M in | $15.00/M out
GPT 5.4 Mini
gpt-5.4-mini
OpenAI's efficient, cost-optimized variant of GPT-5.4 with strong reasoning and multimodal capabilities at a fraction of the cost.
OpenAI | 128K ctx | 128K max out | $0.75/M in | $4.50/M out
GPT 5.4 Nano
gpt-5.4-nano
OpenAI's most affordable and fastest variant of GPT-5.4, optimized for high-volume, latency-sensitive applications with minimal cost.
OpenAI | 128K ctx | 128K max out | $0.20/M in | $1.25/M out
GPT 5.4 Pro
gpt-5.4-pro
OpenAI's most powerful GPT-5.4 variant with a 1M+ context window, designed for the most demanding reasoning, coding, and agentic tasks with maximum capability.
OpenAI | 1.1M ctx | 128K max out | $30.00/M in | $180.00/M out
GPT 5.2
gpt-5.2
OpenAI's flagship model for coding and agentic tasks across industries.
OpenAI | 400K ctx | 128K max out | $1.75/M in | $14.00/M out
GPT 5.1
gpt-5.1
OpenAI's previous flagship model for coding and agentic tasks with configurable reasoning and non-reasoning effort.
OpenAI | 400K ctx | 128K max out | $1.25/M in | $10.00/M out
GPT 5
gpt-5
OpenAI's former advanced reasoning model with enhanced problem-solving capabilities, deeper understanding, and improved accuracy across complex tasks.
OpenAI | 400K ctx | 128K max out | $1.25/M in | $10.00/M out
GPT 5 Mini
gpt-5-mini
OpenAI's smaller and faster variant of GPT-5, a more cost-efficient alternative. Great for well-defined tasks and precise prompts.
OpenAI | 400K ctx | 128K max out | $0.25/M in | $2.00/M out
GPT 5 Nano
gpt-5-nano
OpenAI's most cost-effective and efficient model in the GPT-5 series, optimized for speed and affordability. Ideal for straightforward tasks and high-volume applications where latency and cost are critical.
OpenAI | 400K ctx | 128K max out | $0.05/M in | $0.40/M out
OpenAI o1
o1
OpenAI's reasoning model designed to think before responding. Uses chain-of-thought reasoning for complex tasks in science, coding, and math.
OpenAI | 200K ctx | 100K max out | $15.00/M in | $60.00/M out
GPT 4.1
gpt-4.1
OpenAI's improved version of GPT-4, featuring enhanced reasoning capabilities, better contextual understanding, and increased accuracy across a wide range of tasks.
OpenAI | 1.0M ctx | 33K max out | $2.00/M in | $8.00/M out
GPT 4.1 Mini
gpt-4.1-mini
OpenAI's balanced small model with a massive 1M token context window. Offers great performance at low cost, beating GPT-4o in many benchmarks.
OpenAI | 1.0M ctx | 33K max out | $0.40/M in | $1.60/M out
GPT 4o
gpt-4o
OpenAI's multimodal model optimized for speed and efficiency. Capable of processing text, images, and audio with high accuracy at reduced latency and cost compared to GPT-4 Turbo.
OpenAI | 128K ctx | 16K max out | $2.50/M in | $10.00/M out
GPT 4o Mini
gpt-4o-mini
OpenAI's cost-efficient small model. Great for lightweight tasks with fast responses and low cost while maintaining strong capabilities.
OpenAI | 128K ctx | 16K max out | $0.15/M in | $0.60/M out
OpenAI gpt-oss 120B
gpt-oss-120b
OpenAI's GPT-OSS open-weight model designed for powerful reasoning, agentic tasks, and versatile developer use cases.
OpenAI | 131K ctx | 128K max out | $0.00/M in | $0.00/M out
OpenAI gpt-oss 20B
gpt-oss-20b
OpenAI's smaller open-weight model optimized for lower latency and specialized use-cases. Great for edge deployment and faster responses.
OpenAI | 131K ctx | 128K max out | $0.00/M in | $0.00/M out
GPT 5.3 Codex
gpt-5.3-codex
OpenAI's GPT-5.3-Codex is the most capable agentic coding model, combining frontier coding performance with reasoning capabilities. Features mid-task steering and 25% faster inference than GPT-5.2-Codex.
OpenAI | 400K ctx | 128K max out | $1.75/M in | $14.00/M out
GPT 5.2 Codex
gpt-5.2-codex
OpenAI's GPT-5.2-Codex is an upgraded version of GPT-5.2 optimized for agentic coding tasks in Codex or similar environments.
OpenAI | 400K ctx | 128K max out | $1.75/M in | $14.00/M out
GPT 5.1 Codex Max
gpt-5.1-codex-max
A version of GPT-5.1-Codex optimized for long-running tasks.
OpenAI | 400K ctx | 128K max out | $1.25/M in | $10.00/M out
GPT 5.1 Codex Mini
gpt-5.1-codex-mini
Smaller, more cost-effective, less-capable version of GPT-5.1-Codex.
OpenAI | 400K ctx | 128K max out | $0.25/M in | $2.00/M out
OpenAI gpt-oss Safeguard 120B
gpt-oss-safeguard-120b
OpenAI's safety-focused open-weight model with 120B parameters. Designed for content moderation and safe AI applications.
OpenAI | 128K ctx | 128K max out | $0.15/M in | $0.60/M out
OpenAI gpt-oss Safeguard 20B
gpt-oss-safeguard-20b
OpenAI's efficient safety-focused open-weight model with 20B parameters. Lightweight content moderation and safety.
OpenAI | 128K ctx | 128K max out | $0.07/M in | $0.20/M out
Gemini 3.1 Flash Lite Preview
gemini-3.1-flash-lite-preview
Google's lightweight and efficient model in the 3.1 family, optimized for low latency and cost. Available in preview.
Google | 1.0M ctx | 66K max out | $0.25/M in | $1.50/M out
Gemini 3.1 Pro Preview
gemini-3.1-pro-preview
Google's latest Pro model. Available in preview.
Google | 1.0M ctx | 66K max out | $2.00/M in | $12.00/M out
Gemini 3 Flash Preview
gemini-3-flash-preview
Google's latest Flash model, continuing the line's focus on latency, efficiency, and cost. Available in preview.
Google | 1.0M ctx | 66K max out | $0.50/M in | $3.00/M out
Gemini 2.5 Pro
gemini-2.5-pro
Google's most capable model for complex reasoning tasks. Features a 1M token context window and strong performance across benchmarks.
Google | 1.0M ctx | 66K max out | $1.25/M in | $10.00/M out
Gemini 2.5 Flash
gemini-2.5-flash
Google's fast and cost-effective model with a 1M token context window. Best for high-volume, low-latency tasks and agentic use cases.
Google | 1.0M ctx | 66K max out | $0.30/M in | $2.50/M out
Gemini 2.0 Flash
gemini-2.0-flash
Google's fast and efficient model with a 1M token context window. Optimized for speed and cost-effectiveness with multimodal support.
Google | 1.0M ctx | 8K max out | $0.15/M in | $0.60/M out
Gemma 3 12B
gemma-3-12b
Google's lightweight open model well-suited for text generation and image understanding. Supports LoRA fine-tuning with 128K context.
Google | 128K ctx | 8K max out | $0.09/M in | $0.29/M out
Gemma 3 4B IT
gemma-3-4b
Google's lightweight 4B parameter Gemma 3 model. Efficient for simple tasks with vision capabilities.
Google | 128K ctx | 8K max out | $0.04/M in | $0.08/M out
Gemma 3 27B
gemma-3-27b
Google's largest Gemma 3 model with 27B parameters. Strong performance across reasoning, coding, and multilingual tasks with vision support.
Google | 128K ctx | 8K max out | $0.23/M in | $0.38/M out
Gemma 4 26B A4B
gemma-4-26b-a4b
Efficient MoE variant of Gemma 4 from Google DeepMind. Multimodal (text + image input), text output.
Google | 262K ctx | 16K max out | $0.07/M in | $0.34/M out
Grok Code Fast
grok-code-fast-1
xAI's specialized coding model optimized for fast code generation, completion, and explanation. 256K context window.
xAI | 256K ctx | 131K max out | $0.20/M in | $1.50/M out
Grok 4.20 Multi-Agent
grok-4.20-multi-agent-0309
xAI's multi-agent model with a 2M token context window, optimized for multi-agent workflows with reasoning capabilities. Client-side tools (function calling) and custom tools are not currently supported by this variant.
xAI | 2.0M ctx | 131K max out | $2.00/M in | $6.00/M out
Grok 4.20 Reasoning
grok-4.20-0309-reasoning
xAI's Grok 4.20 reasoning model with 2M token context window. Supports reasoning, function calling, structured outputs, and vision.
xAI | 2.0M ctx | 131K max out | $2.00/M in | $6.00/M out
Grok 4.20 Non Reasoning
grok-4.20-0309-non-reasoning
xAI's Grok 4.20 non-reasoning model with 2M token context window. Optimized for speed without reasoning overhead, supports function calling, structured outputs, and vision.
xAI | 2.0M ctx | 131K max out | $2.00/M in | $6.00/M out
Grok 4.1 Fast Reasoning
grok-4-1-fast-reasoning
xAI's latest fast reasoning model with massive 2M token context window. Optimized for speed with reasoning capabilities enabled.
xAI | 2.0M ctx | 131K max out | $0.20/M in | $0.50/M out
Grok 4.1 Fast
grok-4-1-fast-non-reasoning
xAI's latest fast model with massive 2M token context window. Optimized for speed without reasoning overhead for straightforward tasks.
xAI | 2.0M ctx | 131K max out | $0.20/M in | $0.50/M out
Grok 4 (0709)
grok-4-0709
xAI's Grok 4 model snapshot from July 2025. Premium pricing for maximum capability with 256K context.
xAI | 256K ctx | 131K max out | $3.00/M in | $15.00/M out
Grok 4
grok-4
xAI's flagship model, offering strong performance across natural language, math, and reasoning.
xAI | 256K ctx | 256K max out | $3.00/M in | $15.00/M out
Grok 4 Fast Reasoning
grok-4-fast-reasoning
xAI's fast Grok 4 model with reasoning capabilities and massive 2M token context window. Great balance of speed and intelligence.
xAI | 2.0M ctx | 131K max out | $0.20/M in | $0.50/M out
Grok 4 Fast
grok-4-fast-non-reasoning
xAI's fast Grok 4 model without reasoning overhead. Massive 2M context window for processing large documents quickly.
xAI | 2.0M ctx | 131K max out | $0.20/M in | $0.50/M out
Grok 3
grok-3
xAI's former flagship reasoning model with strong performance in natural language, math, and coding. Features a 131K context window.
xAI | 131K ctx | 131K max out | $3.00/M in | $15.00/M out
Grok 3 Mini
grok-3-mini
xAI's cost-efficient model with reasoning capabilities. Surprisingly outperforms Grok 3 in many benchmarks while costing 90% less.
xAI | 131K ctx | 131K max out | $0.30/M in | $0.50/M out
DeepSeek R1
deepseek-r1
DeepSeek's reasoning model trained with reinforcement learning. Excels at math, code, and complex reasoning tasks.
DeepSeek | 128K ctx | 33K max out | $1.35/M in | $5.40/M out
DeepSeek R1 Distill 32B
deepseek-r1-distill-32b
DeepSeek's reasoning model distilled from R1 based on Qwen2.5. Outperforms OpenAI o1-mini across various benchmarks with state-of-the-art dense model results.
DeepSeek | 80K ctx | 16K max out | $0.50/M in | $4.88/M out
DeepSeek V3.2
deepseek-v3-2
DeepSeek's latest general-purpose model with improved reasoning, coding, and instruction-following capabilities.
DeepSeek | 164K ctx | 66K max out | $0.26/M in | $0.38/M out
DeepSeek V4 Flash
deepseek-v4-flash
DeepSeek's fast, cost-efficient general-purpose model with 1M context window. Supports tool calling and thinking mode.
DeepSeek | 1.0M ctx | 384K max out | $0.14/M in | $0.28/M out
DeepSeek V4 Pro
deepseek-v4-pro
DeepSeek's flagship reasoning model with 1M context window. Supports tool calling, extended thinking, and high-quality generation.
DeepSeek | 1.0M ctx | 384K max out | $1.74/M in | $3.48/M out
Qwen3 30B
qwen3-30b
Alibaba's Qwen3 model with groundbreaking advancements in reasoning and instruction-following. Excellent for complex tasks and coding.
Alibaba Cloud | 33K ctx | 8K max out | $0.05/M in | $0.34/M out
QwQ 32B
qwq-32b
Alibaba's QwQ reasoning model optimized for analytical tasks. Strong performance on math, coding, and logical reasoning benchmarks.
Alibaba Cloud | 24K ctx | 16K max out | $0.66/M in | $1.00/M out
Qwen3 32B
qwen3-32b
Alibaba's Qwen3 dense 32B model. Strong performance across reasoning, math, and coding tasks.
Alibaba Cloud | 32K ctx | 8K max out | $0.15/M in | $0.60/M out
Qwen3 Coder 30B A3B
qwen3-coder-30b-a3b
Alibaba's Qwen3 coding-focused model with 30B total / 3B active parameters using mixture-of-experts architecture.
Alibaba Cloud | 256K ctx | 66K max out | $0.15/M in | $0.60/M out
Qwen3 Coder Next
qwen3-coder-next
Alibaba's latest Qwen3 coding model. Cutting-edge code generation and understanding capabilities.
Alibaba Cloud | 256K ctx | 66K max out | $0.50/M in | $1.20/M out
Qwen3 Next 80B A3B
qwen3-next-80b-a3b
Alibaba's latest Qwen3 model with 80B total / 3B active parameters using mixture-of-experts. Strong general-purpose performance.
Alibaba Cloud | 128K ctx | 8K max out | $0.15/M in | $1.20/M out
Qwen3 VL 235B A22B
qwen3-vl-235b-a22b
Alibaba's Qwen3 vision-language model with 235B total / 22B active parameters. Supports text and image inputs.
Alibaba Cloud | 128K ctx | 8K max out | $0.53/M in | $2.66/M out
Kimi K2 Thinking
kimi-k2-thinking
Moonshot AI's trillion-parameter MoE reasoning model (32B activated). Excels at multi-step reasoning with 200+ sequential tool calls. Supports function calling and extended thinking.
Moonshot AI | 256K ctx | 66K max out | $0.60/M in | $2.50/M out
Kimi K2.5
kimi-k2-5
Moonshot AI's native multimodal agentic model built on K2. Excels at visual coding, reasoning, and self-directed agent swarms of up to 100 sub-agents.
Moonshot AI | 256K ctx | 33K max out | $0.60/M in | $3.00/M out
Kimi K2.6
kimi-k2-6
Moonshot AI's next-gen agentic model built on K2. Long-horizon coding, proactive autonomous execution, and swarm-based task orchestration.
Moonshot AI | 262K ctx | 16K max out | $0.75/M in | $3.50/M out
MiniMax M2
minimax-m2
MiniMax's flagship agentic language model with 200K context window. Supports function calling and reasoning.
MiniMax | 205K ctx | 8K max out | $0.30/M in | $1.20/M out
MiniMax M2.1
minimax-m2-1
MiniMax's lightweight MoE model optimized for coding, agentic workflows, and modern application development. 10B activated parameters with strong multilingual code generation.
MiniMax | 205K ctx | 8K max out | $0.30/M in | $1.20/M out
MiniMax M2.1 Highspeed
minimax-m2-1-highspeed
MiniMax M2.1 optimized for speed at ~100 tokens/second with 200K context window.
MiniMax | 205K ctx | 8K max out | $0.30/M in | $1.20/M out
MiniMax M2.5
minimax-m2-5
MiniMax's flagship model with 200K context window, ~60 tokens/second. Supports function calling and thinking.
MiniMax | 205K ctx | 8K max out | $0.30/M in | $1.20/M out
MiniMax M2.5 Highspeed
minimax-m2-5-highspeed
MiniMax M2.5 optimized for speed at ~100 tokens/second with 200K context window.
MiniMax | 205K ctx | 8K max out | $0.60/M in | $2.40/M out
MiniMax M2.7
minimax-m2-7
MiniMax's latest flagship model with 200K context window and 131K max output. Advanced agentic capabilities with multi-agent collaboration, strong coding, and tool calling.
MiniMax | 205K ctx | 131K max out | $0.30/M in | $1.20/M out
MiniMax M2.7 Highspeed
minimax-m2-7-highspeed
MiniMax M2.7 optimized for speed with 200K context window and 131K max output.
MiniMax | 205K ctx | 131K max out | $0.60/M in | $2.40/M out
GLM-5
glm-5
ZhipuAI's flagship reasoning model with 200K context window. Supports tool calling, web search, and thinking.
z.ai | 200K ctx | 128K max out | $1.00/M in | $3.20/M out
GLM-5.1
glm-5.1
ZhipuAI's next-generation flagship model for agentic engineering. Stronger coding capabilities and state-of-the-art performance on SWE-Bench Pro.
z.ai | 203K ctx | 16K max out | $1.05/M in | $3.50/M out
GLM-4.7
glm-4.7
ZhipuAI's advanced reasoning model with 200K context window. Supports tool calling, web search, and thinking.
z.ai | 200K ctx | 128K max out | $0.60/M in | $2.20/M out
GLM-4.6
glm-4.6
ZhipuAI's reasoning model with 200K context window. Supports tool calling, web search, and thinking.
z.ai | 200K ctx | 128K max out | $0.60/M in | $2.20/M out
GLM-4.5
glm-4.5
ZhipuAI's efficient reasoning model with 128K context window. Supports tool calling, web search, and thinking.
z.ai | 128K ctx | 96K max out | $0.60/M in | $2.20/M out
GLM-4.6v
glm-4.6v
ZhipuAI's vision-language model with 128K context window. Supports image inputs and thinking.
z.ai | 131K ctx | 33K max out | $0.30/M in | $0.90/M out
GLM-4.5v
glm-4.5v
ZhipuAI's efficient vision-language model with 128K context window. Supports image inputs and thinking.
z.ai | 131K ctx | 16K max out | $0.60/M in | $1.80/M out
GLM-4.7 Flash
glm-4.7-flash
ZhipuAI's fast and efficient model variant. Optimized for speed while maintaining strong language capabilities.
z.ai | 200K ctx | 128K max out | $0.07/M in | $0.40/M out
Llama 4 Scout
llama-4-scout
Meta's latest Llama 4 model with 17B active parameters using mixture-of-experts architecture.
Meta | 131K ctx | 8K max out | $0.17/M in | $0.66/M out
Llama 3.3 70B Instruct
llama-3.3-70b-instruct
Meta's powerful 70B parameter model quantized to FP8 for fast inference. Excellent for complex reasoning and multilingual tasks.
Meta | 128K ctx | 8K max out | $0.29/M in | $0.72/M out
Llama 3.2 3B Instruct
llama-3.2-3b-instruct
Meta's compact 3B parameter model optimized for edge deployment and multilingual dialogue. Great balance of speed and capability.
Meta | 128K ctx | 8K max out | $0.05/M in | $0.15/M out
Llama 3.2 1B Instruct
llama-3.2-1b-instruct
Meta's compact and efficient 1 billion parameter model designed for on-device and edge deployment. Optimized for instruction following with strong performance despite its small size, ideal for resource-constrained environments.
Meta | 128K ctx | 8K max out | $0.03/M in | $0.10/M out
Llama 3.1 8B Instruct
llama-3.1-8b-instruct
Meta's efficient 8B parameter model optimized for multilingual dialogue. Fast inference with great performance for everyday tasks.
Meta | 128K ctx | 8K max out | $0.04/M in | $0.22/M out
Llama 4 Maverick
llama-4-maverick
Meta's Llama 4 Maverick with 17B active parameters using mixture-of-experts. Excels at coding, reasoning, and multilingual tasks.
Meta | 1.0M ctx | 8K max out | $0.24/M in | $0.97/M out
Llama 3.2 11B Instruct
llama-3.2-11b-instruct
Meta's 11B multimodal model supporting text and image inputs. Optimized for visual reasoning and image understanding tasks.
Meta | 128K ctx | 8K max out | $0.16/M in | $0.16/M out
Llama 3.2 90B Instruct
llama-3.2-90b-instruct
Meta's largest multimodal model with 90B parameters. Supports text and image inputs with strong reasoning capabilities.
Meta | 128K ctx | 8K max out | $0.72/M in | $0.72/M out
Llama 3.1 70B Instruct
llama-3.1-70b-instruct
Meta's 70B parameter model with 128K context. Strong performance on reasoning, coding, and multilingual tasks.
Meta | 128K ctx | 8K max out | $0.72/M in | $0.72/M out
Llama 3 70B Instruct
llama-3-70b-instruct
Meta's original Llama 3 70B model optimized for dialogue. Strong general-purpose performance across a wide range of tasks.
Meta | 8K ctx | 2K max out | $2.65/M in | $3.50/M out
Llama 3 8B Instruct
llama-3-8b-instruct
Meta's efficient Llama 3 8B model optimized for dialogue. Fast inference suitable for lightweight tasks.
Meta | 8K ctx | 2K max out | $0.30/M in | $0.60/M out
Codestral
codestral
Mistral's specialized model for code generation. Optimized for coding tasks including code completion, generation, and explanation.
Mistral | 128K ctx | 32K max out | $0.30/M in | $0.90/M out
Devstral 2
devstral-2
Mistral's frontier code-agent model for solving software engineering tasks. Excels at using tools to explore codebases, edit multiple files, and power software engineering agents. 256K context.
Mistral | 256K ctx | 32K max out | $0.40/M in | $2.00/M out
Magistral Medium 1.2
magistral-medium-1.2
Mistral's frontier-class multimodal reasoning model.
Mistral | 128K ctx | 128K max out | $2.00/M in | $5.00/M out
Magistral Small 1.2
magistral-small-1.2
Mistral's reasoning-focused small model with vision capabilities. Optimized for step-by-step reasoning tasks.
Mistral | 128K ctx | 128K max out | $0.50/M in | $1.50/M out
Mistral Large 3
mistral-large-3
Mistral's flagship 675B parameter model with state-of-the-art reasoning, coding, and multilingual capabilities with vision support.
Mistral | 256K ctx | 32K max out | $0.50/M in | $1.50/M out
Mistral Medium 3
mistral-medium-3
Mistral's frontier-class multimodal model released May 2025.
Mistral | 128K ctx | 32K max out | $0.40/M in | $2.00/M out
Mistral Medium 3.1
mistral-medium-3.1
Multimodal model from Mistral, released August 2025. Improved tone and performance.
Mistral | 128K ctx | 32K max out | $0.40/M in | $2.00/M out
Mistral Nemo
mistral-nemo
Mistral's best multilingual open source model released July 2024.
Mistral | 128K ctx | 32K max out | $0.15/M in | $0.15/M out
Mistral Small 3.1
mistral-small-3.1
Mistral's efficient 24B model optimized for simple tasks with low latency and 128K context. Great for classification, customer support, text generation, and multimodal tasks.
Mistral | 128K ctx | 8K max out | $0.35/M in | $0.56/M out
Mistral Small
mistral-small-3.2
Mistral's efficient 24B model optimized for simple tasks with low latency and 128K context. Great for classification, customer support, text generation, and multimodal tasks.
Mistral | 128K ctx | 32K max out | $0.10/M in | $0.30/M out
Ministral 3 3B
ministral-3-3b
Mistral's ultra-efficient 3B parameter model with vision support. Designed for edge and low-latency applications.
Mistral | 256K ctx | 32K max out | $0.10/M in | $0.10/M out
Ministral 3 8B
ministral-3-8b
Mistral's efficient 8B parameter model with vision support. Good balance of capability and speed for moderate tasks.
Mistral | 256K ctx | 32K max out | $0.15/M in | $0.15/M out
Ministral 3 14B
ministral-3-14b
Mistral's mid-range 14B parameter model with vision support. Enhanced reasoning over smaller Ministrals.
Mistral | 256K ctx | 32K max out | $0.20/M in | $0.20/M out
Palmyra X4
palmyra-x4
Writer's enterprise-grade model optimized for business content generation, analysis, and transformation.
Writer | 128K ctx | 8K max out | $2.50/M in | $10.00/M out
Palmyra X5
palmyra-x5
Writer's most capable enterprise model with enhanced reasoning, analysis, and content generation capabilities.
Writer | 128K ctx | 8K max out | $0.60/M in | $6.00/M out
Jamba 1.5 Large
jamba-1-5-large
AI21's Jamba 1.5 Large model with hybrid SSM-Transformer architecture. Excels at long-context understanding and generation.
AI21 Labs | 256K ctx | 4K max out | $2.00/M in | $8.00/M out
Jamba 1.5 Mini
jamba-1-5-mini
AI21's efficient Jamba 1.5 Mini model with hybrid SSM-Transformer architecture. Fast and cost-effective for everyday tasks.
AI21 Labs | 256K ctx | 4K max out | $0.20/M in | $0.40/M out
Command A
command-a
Command A is Cohere's most performant model to date, excelling at tool use, agents, retrieval augmented generation (RAG), and multilingual use cases.
Cohere | 256K ctx | 8K max out | $2.50/M in | $10.00/M out
Command A Vision
command-a-vision
Cohere's first multimodal model capable of understanding and interpreting visual data alongside text.
Cohere | 128K ctx | 8K max out | $2.50/M in | $10.00/M out
IBM Granite Micro
ibm-granite-micro
IBM's ultra-efficient micro model. Small but mighty - perfect for simple tasks requiring minimal latency and cost.
IBM | 131K ctx | 4K max out | $0.02/M in | $0.11/M out
Amazon Nova Lite
nova-lite
Amazon's multimodal model for processing images, video, and text. Can analyze multiple images with 300K context.
Amazon | 300K ctx | 5K max out | $0.06/M in | $0.24/M out
Amazon Nova Micro
nova-micro
Amazon's fastest text-only model optimized for speed and cost. Ideal for text-based tasks requiring low latency with 128K context window.
Amazon | 128K ctx | 5K max out | $0.04/M in | $0.14/M out
Amazon Nova Pro
nova-pro
Amazon's highly capable multimodal model balancing accuracy, speed, and cost. Processes text and images with 300K context.
Amazon | 300K ctx | 5K max out | $0.80/M in | $3.20/M out
Amazon Nova Premier
nova-premier
Amazon's most capable multimodal model for complex reasoning tasks. Processes text and images with up to 1M context window.
Amazon | 1.0M ctx | 20K max out | $2.50/M in | $12.50/M out
Amazon Nova 2 Lite
nova-2-lite
Amazon's second-generation lightweight multimodal model. Processes text and images with improved performance over Nova Lite.
Amazon | 256K ctx | 5K max out | $0.30/M in | $2.50/M out

Access all models through one API

Stop juggling multiple provider SDKs. Concentrate gives you a single endpoint for every model listed above, with built-in guardrails, analytics, and spend management.
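In practice, the single-endpoint idea looks like the sketch below. This is a hypothetical example: the base URL, header names, and payload shape are assumptions modeled on common OpenAI-compatible gateways, not Concentrate's documented API.

```python
# Hypothetical sketch of one request shape routed to many models.
# API_BASE is a placeholder, not Concentrate's real endpoint.
import json

API_BASE = "https://api.concentrate.example/v1"

def build_chat_request(model: str, prompt: str, api_key: str) -> dict:
    """Assemble the URL, headers, and JSON body for a chat completion."""
    return {
        "url": f"{API_BASE}/chat/completions",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }),
    }

# The same call shape reaches any provider; only the model id changes.
for model in ("claude-sonnet-4-6", "gpt-5-mini", "gemini-2.5-flash"):
    req = build_chat_request(model, "Summarize this ticket.", "sk-demo")
```

Because only the `model` field varies, switching providers becomes a one-line change instead of a new SDK integration.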


Frequently Asked Questions

What is the Concentrate.ai Model Fortress?

The Concentrate.ai Model Fortress is a live catalog of every LLM accessible through the Concentrate.ai API. It shows provider-specific pricing (input and output cost per million tokens), context window sizes, and capabilities like function calling, vision, streaming, and JSON mode for each model across every available provider.
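Since rates are quoted per million tokens, the cost of a single request follows from simple arithmetic: tokens divided by one million, times the rate, summed over input and output. A quick sketch (generic math, not a Concentrate feature):

```python
# Translate per-million-token rates into the cost of one request.
def request_cost(input_tokens: int, output_tokens: int,
                 in_per_m: float, out_per_m: float) -> float:
    """Dollar cost of a request given $/M input and $/M output rates."""
    return input_tokens / 1e6 * in_per_m + output_tokens / 1e6 * out_per_m

# e.g. a model priced at $1.00/M in and $5.00/M out,
# on a 12K-token prompt with a 2K-token reply:
cost = request_cost(12_000, 2_000, 1.00, 5.00)  # 0.012 + 0.010 = $0.022
```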

How often is the Model Fortress updated?

The Model Fortress pulls live data from the Concentrate.ai model catalog API and revalidates every hour. Pricing and availability reflect the current state of each provider.

Which AI providers are included?

The Model Fortress includes models from OpenAI, Anthropic, Google, Meta (Llama), DeepSeek, Mistral, Cohere, xAI, and many more. Each model shows all providers that offer it, so you can compare the same model across different providers and pick the cheapest or best-fit option.

Contact

130 E 59th St
17th floor
New York, NY 10022
1201 N. Market Street
Suite 200
Wilmington, DE 19801


© 2026 Concentrate AI. All rights reserved.
