Claude Opus 4.6 · $11.000/M
Claude Opus 4.5 · $33.000/M
Claude Sonnet 3.7 · $6.600/M
Claude Opus 3 · $33.000/M
Claude 2.1 · $12.800/M
Claude 2 · $12.800/M
GPT-5 · $3.875/M
GPT-4.5 · $97.500/M
GPT-4 Turbo Preview · $16.000/M
GPT-4 · $39.000/M
GPT-4-32k · $78.000/M
o3 · $19.000/M
o3-mini · $2.090/M
o4-mini · $2.090/M
o1 · $28.500/M
o1-mini · $5.700/M
o1-preview · $28.500/M
Gemini 2.5 Pro · $3.875/M
Gemini 1.5 Pro · $2.375/M
Gemini 1.0 Ultra · $12.000/M
Gemini 1.0 Pro · $0.800/M
PaLM 2 Bison · $0.500/M
PaLM 2 Unicorn · $5.000/M
Gemma 3 27B · $0.270/M
Grok 3 · $6.600/M
Grok 2 · $4.400/M
Grok 1.5 · $8.000/M
DeepSeek-V3 · $0.519/M
DeepSeek-V3-0324 · $0.519/M
DeepSeek-R1 · $1.042/M
AI model pricing index
DeepInfra
GPU serverless for popular open-weights models at aggressive prices. OpenAI-compatible API surface.
Founded: 2022
HQ: Menlo Park, USA
Models tracked: 11 (3 categories)
Flagship: Qwen2.5-72B (DI), frontier
Min input: $0.060/M (cheapest model in family)
Avg blended: $0.459/M (across 11 priced models)
Max context: 128K (largest window in family)
Frontier (1 model)
Llama 3.1 405B (DI) · frontier · 128K ctx · in $0.800/M · out $0.800/M · 405B serverless

Reasoning (1 model)
DeepSeek-R1 (DI) · reasoning · 64K ctx · in $0.550/M · out $2.190/M · R1 managed

Efficient (9 models)
Llama 3.3 70B (DI) · efficient · 128K ctx · in $0.230/M · out $0.400/M · Cost-effective
Llama 3.1 70B (DI) · efficient · 128K ctx · in $0.350/M · out $0.400/M · 70B managed
Mistral 7B (DI) · efficient · 32K ctx · in $0.070/M · out $0.070/M · Cheapest Mistral
Mixtral 8x7B (DI) · efficient · 32K ctx · in $0.240/M · out $0.240/M · MoE managed
Qwen2.5-72B (DI) · frontier · 128K ctx · in $0.350/M · out $0.400/M · Qwen on DeepInfra
Gemma 2 9B (DI) · efficient · 8K ctx · in $0.060/M · out $0.060/M · Cheapest Gemma
WizardLM-2 8x22B (DI) · frontier · 64K ctx · in $0.630/M · out $0.630/M · Instruction tuned
Phind-CodeLlama-34B (DI) · efficient · 16K ctx · in $0.600/M · out $0.600/M · Code model
Yi-34B-Chat (DI) · frontier · 4K ctx · in $0.600/M · out $0.600/M · Yi on DeepInfra
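
The summary figures can be re-derived from the per-model prices listed above. The page does not say how a "blended" price is computed. The sketch below assumes a 70/30 input/output weighting, which is a guess that happens to reproduce the $0.459/M average and the $1.042/M DeepSeek-R1 figure. The names `MODELS`, `INPUT_WEIGHT`, and `blended` are illustrative and not part of any DeepInfra API.

```python
# (name, input $/M tokens, output $/M tokens, context window in K tokens),
# copied from the model listing on this page.
MODELS = [
    ("Llama 3.1 405B", 0.800, 0.800, 128),
    ("DeepSeek-R1", 0.550, 2.190, 64),
    ("Llama 3.3 70B", 0.230, 0.400, 128),
    ("Llama 3.1 70B", 0.350, 0.400, 128),
    ("Mistral 7B", 0.070, 0.070, 32),
    ("Mixtral 8x7B", 0.240, 0.240, 32),
    ("Qwen2.5-72B", 0.350, 0.400, 128),
    ("Gemma 2 9B", 0.060, 0.060, 8),
    ("WizardLM-2 8x22B", 0.630, 0.630, 64),
    ("Phind-CodeLlama-34B", 0.600, 0.600, 16),
    ("Yi-34B-Chat", 0.600, 0.600, 4),
]

# Assumption: the page does not state its weighting; 0.7 is inferred
# only because it matches the published numbers.
INPUT_WEIGHT = 0.7


def blended(price_in: float, price_out: float, w: float = INPUT_WEIGHT) -> float:
    """Weighted mix of input and output price, in $ per million tokens."""
    return w * price_in + (1.0 - w) * price_out


# Cheapest input price across the family.
min_input = min(p_in for _, p_in, _, _ in MODELS)

# Mean of the per-model blended prices.
avg_blended = sum(blended(p_in, p_out) for _, p_in, p_out, _ in MODELS) / len(MODELS)

# Largest context window across the family, in K tokens.
max_context = max(ctx for _, _, _, ctx in MODELS)

print(f"min input:   ${min_input:.3f}/M")
print(f"avg blended: ${avg_blended:.3f}/M")
print(f"max context: {max_context}K")
```

If the index uses a different weighting, only `INPUT_WEIGHT` needs to change. The minimum input price and maximum context window do not depend on it.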