DeepInfra offers 47 models through Lava’s AI Gateway, all served through the Chat Completions endpoint. Requests authenticate with an Authorization: Bearer header. See the DeepInfra API docs for provider-specific parameters.
Both managed API keys (issued by Lava) and bring-your-own-key (BYOK) mode are supported.

Quick Start

// forwardToken is your Lava forward token; it is assumed to be defined elsewhere.
// The DeepInfra target URL is passed URL-encoded in the `u` query parameter.
const response = await fetch('https://api.lavapayments.com/v1/forward?u=https%3A%2F%2Fapi.deepinfra.com%2Fv1%2Fopenai%2Fchat%2Fcompletions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    Authorization: `Bearer ${forwardToken}`,
  },
  body: JSON.stringify({
    model: 'deepseek-ai/DeepSeek-R1',
    messages: [{ role: 'user', content: 'Hello!' }],
  }),
});
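
The forward URL in the request above is the Lava forward endpoint with the DeepInfra target URL percent-encoded into the `u` query parameter. As a minimal sketch, the URL could be built rather than hand-encoded. The helper name `buildForwardUrl` and the constant `LAVA_FORWARD_ENDPOINT` are hypothetical and used only for illustration; they are not part of any Lava SDK.

```javascript
// Base forward endpoint, taken from the Quick Start request above.
const LAVA_FORWARD_ENDPOINT = 'https://api.lavapayments.com/v1/forward';

// Hypothetical helper: percent-encode the provider URL and attach it as `u`.
function buildForwardUrl(targetUrl) {
  return `${LAVA_FORWARD_ENDPOINT}?u=${encodeURIComponent(targetUrl)}`;
}

const forwardUrl = buildForwardUrl(
  'https://api.deepinfra.com/v1/openai/chat/completions'
);
console.log(forwardUrl);
```

Building the URL this way avoids encoding mistakes when the target URL changes.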

Chat Completions

Target URL: https://api.deepinfra.com/v1/openai/chat/completions
Content Type: application/json
Streaming: Yes (set stream: true in request body)
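
With stream: true set, the response body arrives incrementally instead of as one JSON document. The sketch below assumes the stream uses the OpenAI-style server-sent events format (lines of the form `data: <json>`, ending with `data: [DONE]`), which is common for OpenAI-compatible endpoints but is not confirmed by this page. The function name `parseSseChunk` is hypothetical.

```javascript
// Assumption: events are "data: <json>" lines and the stream ends with "data: [DONE]".
// Extracts the text deltas from a string that contains only complete lines.
function parseSseChunk(chunk) {
  const deltas = [];
  let done = false;
  for (const rawLine of chunk.split('\n')) {
    const line = rawLine.trim();
    if (!line.startsWith('data:')) continue; // skip blank lines and non-data fields
    const payload = line.slice('data:'.length).trim();
    if (payload === '[DONE]') {
      done = true;
      break;
    }
    const event = JSON.parse(payload);
    const text = event.choices?.[0]?.delta?.content;
    if (typeof text === 'string') deltas.push(text);
  }
  return { deltas, done };
}

// Example with a hand-written sample in the assumed format.
const sample =
  'data: {"choices":[{"delta":{"content":"Hel"}}]}\n\n' +
  'data: {"choices":[{"delta":{"content":"lo"}}]}\n\n' +
  'data: [DONE]\n';
const parsed = parseSseChunk(sample);
console.log(parsed.deltas.join(''), parsed.done);
```

Network chunks can end in the middle of a line, so a real reader would buffer any trailing partial line and only pass complete lines to a parser like this one.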
| Model | Input / 1M tokens | Output / 1M tokens |
| --- | --- | --- |
| deepseek-ai/DeepSeek-R1-Turbo | $1.00 | $3.00 |
| deepseek-ai/DeepSeek-R1-0528-Turbo | $1.00 | $3.00 |
| deepseek-ai/DeepSeek-V3-0324-Turbo | $1.00 | $3.00 |
| meta-llama/Meta-Llama-3.1-405B-Instruct | $0.80 | $0.80 |
| deepseek-ai/DeepSeek-R1 | $0.70 | $2.40 |
| deepseek-ai/DeepSeek-R1-0528 | $0.50 | $2.15 |
| deepseek-ai/DeepSeek-Prover-V2-671B | $0.50 | $2.18 |
| meta-llama/Llama-4-Maverick-17B-128E-Instruct-Turbo | $0.50 | $0.50 |
| microsoft/WizardLM-2-8x22B | $0.48 | $0.48 |
| deepseek-ai/DeepSeek-V3 | $0.38 | $0.89 |
| meta-llama/Meta-Llama-3-70B-Instruct | $0.30 | $0.40 |
| MiniMaxAI/MiniMax-M2.5 | $0.30 | $1.20 |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-32B | $0.27 | $0.27 |
| MiniMaxAI/MiniMax-M2.1 | $0.27 | $1.15 |
| MiniMaxAI/MiniMax-M2 | $0.27 | $1.15 |
| deepseek-ai/DeepSeek-V3-0324 | $0.25 | $0.88 |
| meta-llama/Llama-3.3-70B-Instruct | $0.23 | $0.40 |
| meta-llama/Meta-Llama-3.1-70B-Instruct | $0.23 | $0.40 |
| deepseek-ai/DeepSeek-R1-Distill-Llama-70B | $0.20 | $0.60 |
| meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 | $0.15 | $0.60 |
| Qwen/Qwen3-235B-A22B | $0.13 | $0.60 |
| Qwen/Qwen2.5-72B-Instruct | $0.12 | $0.39 |
| meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo | $0.10 | $0.28 |
| Qwen/Qwen3-32B | $0.10 | $0.30 |
| google/gemma-3-27b-it | $0.10 | $0.19 |
| meta-llama/Llama-4-Scout-17B-16E-Instruct | $0.08 | $0.30 |
| Qwen/Qwen3-30B-A3B | $0.08 | $0.29 |
| mistralai/Mixtral-8x7B-Instruct-v0.1 | $0.08 | $0.24 |
| Qwen/QwQ-32B | $0.07 | $0.15 |
| microsoft/phi-4 | $0.07 | $0.14 |
| microsoft/phi-4-reasoning-plus | $0.07 | $0.35 |
| Gryphe/MythoMax-L2-13b | $0.07 | $0.07 |
| Qwen/Qwen3-14B | $0.06 | $0.24 |
| Qwen/Qwen2.5-Coder-32B-Instruct | $0.06 | $0.15 |
| meta-llama/Llama-Guard-4-12B | $0.05 | $0.05 |
| meta-llama/Llama-3.3-70B-Instruct-Turbo | $0.05 | $0.17 |
| google/gemma-3-12b-it | $0.05 | $0.10 |
| microsoft/Phi-4-multimodal-instruct | $0.05 | $0.10 |
| meta-llama/Llama-3.2-11B-Vision-Instruct | $0.05 | $0.05 |
| Qwen/Qwen2.5-7B-Instruct | $0.04 | $0.10 |
| meta-llama/Meta-Llama-3.1-8B-Instruct | $0.03 | $0.05 |
| meta-llama/Meta-Llama-3-8B-Instruct | $0.03 | $0.06 |
| mistralai/Mistral-7B-Instruct-v0.3 | $0.03 | $0.05 |
| google/gemma-3-4b-it | $0.02 | $0.04 |
| meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo | $0.02 | $0.02 |
| meta-llama/Llama-3.2-1B-Instruct | $0.0050 | $0.01 |
| meta-llama/Llama-3.2-3B-Instruct | $0.0030 | $0.0060 |
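
Because prices are quoted per 1M tokens, the cost of a single request is the token count divided by one million, multiplied by the listed rate, summed over input and output. The sketch below shows that arithmetic for three models whose prices are copied from the table above. The names `PRICES` and `estimateCostUsd` are hypothetical, and the result is only an estimate from the listed rates; it does not account for any fees or adjustments not shown on this page.

```javascript
// USD per 1M tokens, copied from the pricing table for a few models.
const PRICES = {
  'deepseek-ai/DeepSeek-R1': { input: 0.70, output: 2.40 },
  'meta-llama/Llama-3.3-70B-Instruct': { input: 0.23, output: 0.40 },
  'meta-llama/Llama-3.2-3B-Instruct': { input: 0.0030, output: 0.0060 },
};

// Hypothetical helper: estimate the listed-price cost of one request in USD.
function estimateCostUsd(model, inputTokens, outputTokens) {
  const price = PRICES[model];
  if (!price) throw new Error(`No price listed for model: ${model}`);
  return (inputTokens * price.input + outputTokens * price.output) / 1_000_000;
}

// 2,000 input tokens and 500 output tokens on DeepSeek-R1.
console.log(estimateCostUsd('deepseek-ai/DeepSeek-R1', 2000, 500).toFixed(4));
```

For that example the estimate is (2,000 × $0.70 + 500 × $2.40) / 1,000,000, or about $0.0026.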
