DeepInfra

DeepInfra offers 52 models through Lava’s AI Gateway via the Chat Completions endpoint. Requests authenticate with an `Authorization: Bearer` header. See the DeepInfra API docs for provider-specific parameters.

DeepInfra supports both managed mode (Lava’s API keys) and unmanaged mode (bring your own credentials).

Quick Start

// `forwardToken` is assumed to be defined elsewhere and to hold your Lava forward token.
// The `u` query parameter is the URL-encoded DeepInfra target URL.
const response = await fetch('https://api.lava.so/v1/forward?u=https%3A%2F%2Fapi.deepinfra.com%2Fv1%2Fopenai%2Fchat%2Fcompletions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    Authorization: `Bearer ${forwardToken}`,
  },
  body: JSON.stringify({
    model: 'zai-org/GLM-5.2',
    messages: [{ role: 'user', content: 'Hello!' }],
  }),
});
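
The request above stops at the `fetch` call. As a minimal sketch of reading the reply, the helpers below assume the response body follows the OpenAI-style Chat Completions shape (`choices[0].message.content`), which the `/v1/openai/` path suggests but this page does not confirm. `extractReply` and `readReply` are hypothetical names introduced for illustration.

```javascript
// Assumption: the response JSON uses the OpenAI-style Chat Completions shape.
// Both function names are hypothetical helpers, not part of any SDK.
function extractReply(data) {
  const content = data?.choices?.[0]?.message?.content;
  if (typeof content !== 'string') {
    throw new Error('Unexpected response shape: no choices[0].message.content');
  }
  return content;
}

// Takes the Response returned by fetch, checks the HTTP status,
// then returns the assistant's reply text.
async function readReply(response) {
  if (!response.ok) {
    throw new Error(`Request failed: ${response.status} ${await response.text()}`);
  }
  return extractReply(await response.json());
}
```

With the `response` from the Quick Start request, `console.log(await readReply(response))` would print the model's reply if the assumed shape holds.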

Chat Completions

Target URL: https://api.deepinfra.com/v1/openai/chat/completions
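
The Quick Start request passes this target URL to Lava's forward endpoint as the percent-encoded `u` query parameter. A small sketch of building that URL with the standard `encodeURIComponent` function, where `buildForwardUrl` is a hypothetical helper name:

```javascript
// Base of the forward endpoint, taken from the Quick Start request.
const LAVA_FORWARD_BASE = 'https://api.lava.so/v1/forward';

// Hypothetical helper: wraps a provider URL in a Lava forward URL.
// encodeURIComponent percent-encodes ':' and '/' so the whole target
// URL survives as a single query-string value.
function buildForwardUrl(targetUrl) {
  return `${LAVA_FORWARD_BASE}?u=${encodeURIComponent(targetUrl)}`;
}

const forwardUrl = buildForwardUrl('https://api.deepinfra.com/v1/openai/chat/completions');
console.log(forwardUrl);
```

Building the URL this way avoids hand-encoding mistakes when the target path changes.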


Content Type	`application/json`
Streaming	Yes (set `stream: true` in request body)
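
A rough sketch of consuming a streamed response follows. It assumes the stream uses OpenAI-style server-sent events (lines of the form `data: {json}` ending with `data: [DONE]`) and that each chunk carries text in `choices[0].delta.content`. That format is an assumption, not something this page states. `parseSseLines` and `streamChat` are hypothetical helper names.

```javascript
// Assumption: OpenAI-style SSE lines ("data: {...}", terminated by "data: [DONE]").
// Parses complete lines and returns the concatenated text plus a done flag.
function parseSseLines(lines) {
  let text = '';
  let done = false;
  for (const line of lines) {
    const trimmed = line.trim();
    if (!trimmed.startsWith('data:')) continue; // skip blank lines and comments
    const payload = trimmed.slice('data:'.length).trim();
    if (payload === '[DONE]') {
      done = true;
      break;
    }
    const chunk = JSON.parse(payload);
    text += chunk.choices?.[0]?.delta?.content ?? '';
  }
  return { text, done };
}

// Sends a streaming request through the forward endpoint and calls
// onText with each piece of generated text as it arrives.
async function streamChat(forwardToken, body, onText) {
  const url = 'https://api.lava.so/v1/forward?u=' +
    encodeURIComponent('https://api.deepinfra.com/v1/openai/chat/completions');
  const response = await fetch(url, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${forwardToken}`,
    },
    body: JSON.stringify({ ...body, stream: true }),
  });
  if (!response.ok) throw new Error(`Request failed with status ${response.status}`);

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop(); // keep any partial trailing line for the next read
    const parsed = parseSseLines(lines);
    if (parsed.text) onText(parsed.text);
    if (parsed.done) return;
  }
}
```

A call might look like `await streamChat(forwardToken, { model: 'zai-org/GLM-5.2', messages: [{ role: 'user', content: 'Hello!' }] }, (t) => process.stdout.write(t))` in Node.js 18 or later.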

Model	Input / 1M tokens	Output / 1M tokens
zai-org/GLM-5.2	$1.40	$4.40
deepseek-ai/DeepSeek-R1-Turbo	$1.00	$3.00
deepseek-ai/DeepSeek-R1-0528-Turbo	$1.00	$3.00
deepseek-ai/DeepSeek-V3-0324-Turbo	$1.00	$3.00
meta-llama/Meta-Llama-3.1-405B-Instruct	$0.80	$0.80
deepseek-ai/DeepSeek-R1	$0.70	$2.40
deepseek-ai/DeepSeek-R1-0528	$0.50	$2.15
deepseek-ai/DeepSeek-Prover-V2-671B	$0.50	$2.18
meta-llama/Llama-4-Maverick-17B-128E-Instruct-Turbo	$0.50	$0.50
microsoft/WizardLM-2-8x22B	$0.48	$0.48
Qwen/Qwen3-Coder-480B-A35B-Instruct	$0.40	$1.60
Gryphe/MythoMax-L2-13b	$0.40	$0.40
Qwen/Qwen2.5-72B-Instruct	$0.36	$0.40
meta-llama/Llama-3.2-11B-Vision-Instruct	$0.345	$0.345
deepseek-ai/DeepSeek-V3	$0.32	$0.89
Qwen/Qwen3-Coder-480B-A35B-Instruct-Turbo	$0.30	$1.00
meta-llama/Meta-Llama-3-70B-Instruct	$0.30	$0.40
MiniMaxAI/MiniMax-M2.5	$0.30	$1.20
deepseek-ai/DeepSeek-V3.1-Terminus	$0.27	$0.95
deepseek-ai/DeepSeek-R1-Distill-Qwen-32B	$0.27	$0.27
MiniMaxAI/MiniMax-M2.1	$0.27	$1.15
MiniMaxAI/MiniMax-M2	$0.27	$1.15
Qwen/Qwen3-235B-A22B-Thinking-2507	$0.23	$2.30
meta-llama/Llama-3.3-70B-Instruct	$0.23	$0.40
deepseek-ai/DeepSeek-V3.1	$0.21	$0.79
deepseek-ai/DeepSeek-V3-0324	$0.20	$0.77
deepseek-ai/DeepSeek-R1-Distill-Llama-70B	$0.20	$0.60
meta-llama/Llama-Guard-4-12B	$0.18	$0.18
meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8	$0.15	$0.60
Qwen/Qwen3-30B-A3B	$0.12	$0.50
Qwen/Qwen3-14B	$0.12	$0.24
nvidia/Llama-3.3-Nemotron-Super-49B-v1.5	$0.10	$0.40
meta-llama/Llama-3.3-70B-Instruct-Turbo	$0.10	$0.32
meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo	$0.10	$0.28
Qwen/Qwen3-Next-80B-A3B-Instruct	$0.09	$1.10
Qwen/Qwen3-235B-A22B-Instruct-2507	$0.09	$0.10
meta-llama/Llama-4-Scout-17B-16E-Instruct	$0.08	$0.30
Qwen/Qwen3-32B	$0.08	$0.28
google/gemma-3-27b-it	$0.08	$0.16
microsoft/phi-4	$0.07	$0.14
microsoft/phi-4-reasoning-plus	$0.07	$0.35
Qwen/Qwen2.5-Coder-32B-Instruct	$0.06	$0.15
google/gemma-3-12b-it	$0.05	$0.10
google/gemma-3-4b-it	$0.05	$0.10
microsoft/Phi-4-multimodal-instruct	$0.05	$0.10
nvidia/NVIDIA-Nemotron-Nano-9B-v2	$0.04	$0.16
Qwen/Qwen2.5-7B-Instruct	$0.04	$0.10
meta-llama/Meta-Llama-3.1-8B-Instruct	$0.03	$0.05
meta-llama/Meta-Llama-3-8B-Instruct	$0.03	$0.06
mistralai/Mistral-7B-Instruct-v0.3	$0.028	$0.054
meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo	$0.02	$0.03
meta-llama/Llama-3.2-1B-Instruct	$0.005	$0.01
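
To illustrate how the per-million-token prices translate into a per-request cost, here is a small sketch. It copies two rows from the table above and assumes cost is simply tokens times the listed rate; any gateway fees or rounding rules are not covered here. `PRICES_PER_MILLION` and `estimateCostUsd` are hypothetical names.

```javascript
// Prices in USD per 1M tokens, copied from two rows of the table above.
const PRICES_PER_MILLION = {
  'zai-org/GLM-5.2': { input: 1.40, output: 4.40 },
  'meta-llama/Meta-Llama-3.1-8B-Instruct': { input: 0.03, output: 0.05 },
};

// Hypothetical helper: estimates request cost in USD from token counts.
// Assumes a simple linear rate with no additional fees.
function estimateCostUsd(model, inputTokens, outputTokens) {
  const price = PRICES_PER_MILLION[model];
  if (!price) throw new Error(`No price listed for model: ${model}`);
  return (inputTokens * price.input + outputTokens * price.output) / 1_000_000;
}

// Example: 2,000 input tokens and 500 output tokens on GLM-5.2.
console.log(estimateCostUsd('zai-org/GLM-5.2', 2000, 500).toFixed(4));
```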
