DeepInfra offers 56 models through Lava’s AI Gateway, all available through the Chat Completions endpoint. Requests authenticate with an Authorization: Bearer header. See the DeepInfra API docs for provider-specific parameters.
Both managed (Lava’s API keys) and unmanaged (bring your own credentials) modes are supported.

Quick Start

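// forwardToken is assumed to hold your Lava forward token.
// The u query parameter is the URL-encoded DeepInfra endpoint.
const response = await fetch('https://api.lava.so/v1/forward?u=https%3A%2F%2Fapi.deepinfra.com%2Fv1%2Fopenai%2Fchat%2Fcompletions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    Authorization: `Bearer ${forwardToken}`,
  },
  body: JSON.stringify({
    model: 'deepseek-ai/DeepSeek-V3.1',
    messages: [{ role: 'user', content: 'Hello!' }],
  }),
});

// Parse the JSON response body.
const data = await response.json();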
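
The forward URL in the Quick Start is the Lava forward endpoint with the provider URL percent-encoded into the u query parameter. A minimal sketch of building it programmatically follows; the helper name buildForwardUrl and the constant LAVA_FORWARD_BASE are hypothetical and only illustrate the encoding step.

```javascript
// Base URL of Lava's forward endpoint, taken from the Quick Start request.
const LAVA_FORWARD_BASE = 'https://api.lava.so/v1/forward';

// Hypothetical helper (not part of any Lava SDK): percent-encode the
// provider URL and pass it as the `u` query parameter.
function buildForwardUrl(targetUrl) {
  return `${LAVA_FORWARD_BASE}?u=${encodeURIComponent(targetUrl)}`;
}

const url = buildForwardUrl('https://api.deepinfra.com/v1/openai/chat/completions');
console.log(url);
// https://api.lava.so/v1/forward?u=https%3A%2F%2Fapi.deepinfra.com%2Fv1%2Fopenai%2Fchat%2Fcompletions
```

Using encodeURIComponent rather than hand-writing the escaped string avoids mistakes when the target URL changes.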

Chat Completions

Target URL: https://api.deepinfra.com/v1/openai/chat/completions
Content Type: application/json
Streaming: Yes (set stream: true in the request body)
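
As a rough illustration of the streaming option, the sketch below sets stream: true and reads the response incrementally. It assumes the stream arrives as OpenAI-style server-sent events (lines of the form data: {json}, ending with data: [DONE]); that format is an assumption here, so check the DeepInfra API docs before relying on it. The names extractDeltas and streamChat are hypothetical.

```javascript
// Extract text deltas from a block of server-sent event lines.
// Assumes OpenAI-style events: `data: {json}` and a final `data: [DONE]`.
function extractDeltas(sseText) {
  const deltas = [];
  for (const line of sseText.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed.startsWith('data:')) continue;
    const payload = trimmed.slice('data:'.length).trim();
    if (payload === '' || payload === '[DONE]') continue;
    const event = JSON.parse(payload);
    const text = event.choices?.[0]?.delta?.content;
    if (text) deltas.push(text);
  }
  return deltas;
}

// Send a streaming chat request through the forward URL and return the
// concatenated text. forwardToken is your Lava forward token.
async function streamChat(forwardToken, prompt) {
  const target = 'https://api.deepinfra.com/v1/openai/chat/completions';
  const response = await fetch(
    `https://api.lava.so/v1/forward?u=${encodeURIComponent(target)}`,
    {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${forwardToken}`,
      },
      body: JSON.stringify({
        model: 'deepseek-ai/DeepSeek-V3.1',
        messages: [{ role: 'user', content: prompt }],
        stream: true,
      }),
    },
  );

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  let full = '';
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    // Only parse up to the last complete line; keep the remainder buffered.
    const lastNewline = buffer.lastIndexOf('\n');
    if (lastNewline === -1) continue;
    full += extractDeltas(buffer.slice(0, lastNewline)).join('');
    buffer = buffer.slice(lastNewline + 1);
  }
  // Flush whatever is left once the stream has ended.
  full += extractDeltas(buffer).join('');
  return full;
}
```

Buffering up to the last newline keeps a JSON payload that is split across network chunks from being parsed prematurely.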
| Model | Input / 1M tokens | Output / 1M tokens |
|---|---|---|
| deepseek-ai/DeepSeek-R1-Turbo | $1.00 | $3.00 |
| deepseek-ai/DeepSeek-R1-0528-Turbo | $1.00 | $3.00 |
| deepseek-ai/DeepSeek-V3-0324-Turbo | $1.00 | $3.00 |
| meta-llama/Meta-Llama-3.1-405B-Instruct | $0.80 | $0.80 |
| deepseek-ai/DeepSeek-R1 | $0.70 | $2.40 |
| deepseek-ai/DeepSeek-R1-0528 | $0.50 | $2.15 |
| deepseek-ai/DeepSeek-Prover-V2-671B | $0.50 | $2.18 |
| meta-llama/Llama-4-Maverick-17B-128E-Instruct-Turbo | $0.50 | $0.50 |
| microsoft/WizardLM-2-8x22B | $0.48 | $0.48 |
| Qwen/Qwen3-Coder-480B-A35B-Instruct | $0.40 | $1.60 |
| deepseek-ai/DeepSeek-V3 | $0.38 | $0.89 |
| meta-llama/Meta-Llama-3-70B-Instruct | $0.30 | $0.40 |
| MiniMaxAI/MiniMax-M2.5 | $0.30 | $1.20 |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-32B | $0.27 | $0.27 |
| MiniMaxAI/MiniMax-M2.1 | $0.27 | $1.15 |
| MiniMaxAI/MiniMax-M2 | $0.27 | $1.15 |
| deepseek-ai/DeepSeek-V3-0324 | $0.25 | $0.88 |
| Qwen/Qwen3-235B-A22B-Thinking-2507 | $0.23 | $2.30 |
| meta-llama/Llama-3.3-70B-Instruct | $0.23 | $0.40 |
| meta-llama/Meta-Llama-3.1-70B-Instruct | $0.23 | $0.40 |
| Qwen/Qwen3-Coder-480B-A35B-Instruct-Turbo | $0.22 | $1.00 |
| deepseek-ai/DeepSeek-V3.1 | $0.21 | $0.79 |
| deepseek-ai/DeepSeek-V3.1-Terminus | $0.21 | $0.79 |
| deepseek-ai/DeepSeek-R1-Distill-Llama-70B | $0.20 | $0.60 |
| meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 | $0.15 | $0.60 |
| Qwen/Qwen3-235B-A22B | $0.13 | $0.60 |
| Qwen/Qwen2.5-72B-Instruct | $0.12 | $0.39 |
| nvidia/Llama-3.3-Nemotron-Super-49B-v1.5 | $0.10 | $0.40 |
| meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo | $0.10 | $0.28 |
| Qwen/Qwen3-32B | $0.10 | $0.30 |
| google/gemma-3-27b-it | $0.10 | $0.19 |
| Qwen/Qwen3-Next-80B-A3B-Instruct | $0.09 | $1.10 |
| meta-llama/Llama-4-Scout-17B-16E-Instruct | $0.08 | $0.30 |
| Qwen/Qwen3-30B-A3B | $0.08 | $0.29 |
| mistralai/Mixtral-8x7B-Instruct-v0.1 | $0.08 | $0.24 |
| Qwen/QwQ-32B | $0.075 | $0.15 |
| Qwen/Qwen3-235B-A22B-Instruct-2507 | $0.071 | $0.10 |
| microsoft/phi-4 | $0.07 | $0.14 |
| microsoft/phi-4-reasoning-plus | $0.07 | $0.35 |
| Gryphe/MythoMax-L2-13b | $0.065 | $0.065 |
| Qwen/Qwen3-14B | $0.06 | $0.24 |
| Qwen/Qwen2.5-Coder-32B-Instruct | $0.06 | $0.15 |
| meta-llama/Llama-Guard-4-12B | $0.05 | $0.05 |
| meta-llama/Llama-3.3-70B-Instruct-Turbo | $0.05 | $0.17 |
| google/gemma-3-12b-it | $0.05 | $0.10 |
| microsoft/Phi-4-multimodal-instruct | $0.05 | $0.10 |
| meta-llama/Llama-3.2-11B-Vision-Instruct | $0.049 | $0.049 |
| nvidia/NVIDIA-Nemotron-Nano-9B-v2 | $0.04 | $0.16 |
| Qwen/Qwen2.5-7B-Instruct | $0.04 | $0.10 |
| meta-llama/Meta-Llama-3.1-8B-Instruct | $0.03 | $0.05 |
| meta-llama/Meta-Llama-3-8B-Instruct | $0.03 | $0.06 |
| mistralai/Mistral-7B-Instruct-v0.3 | $0.028 | $0.054 |
| google/gemma-3-4b-it | $0.02 | $0.04 |
| meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo | $0.016 | $0.021 |
| meta-llama/Llama-3.2-1B-Instruct | $0.005 | $0.01 |
| meta-llama/Llama-3.2-3B-Instruct | $0.003 | $0.006 |
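
Because prices are quoted per 1M tokens, a request's token cost is (input tokens × input price + output tokens × output price) / 1,000,000. The sketch below works this out for two models whose prices are copied from the table above. The PRICES map and the estimateCostUsd function are hypothetical, and the result covers only the listed token prices, not any other charges that may apply.

```javascript
// USD per 1M tokens, copied from the pricing table for two example models.
const PRICES = {
  'deepseek-ai/DeepSeek-V3.1': { input: 0.21, output: 0.79 },
  'meta-llama/Meta-Llama-3.1-8B-Instruct': { input: 0.03, output: 0.05 },
};

// Hypothetical helper: estimate the token cost of one request in USD.
function estimateCostUsd(model, inputTokens, outputTokens) {
  const price = PRICES[model];
  if (!price) throw new Error(`No price listed for model: ${model}`);
  return (inputTokens * price.input + outputTokens * price.output) / 1_000_000;
}

// 2,000 input tokens and 500 output tokens on DeepSeek-V3.1.
const cost = estimateCostUsd('deepseek-ai/DeepSeek-V3.1', 2000, 500);
console.log(cost.toFixed(6));
```

For this example the estimate is (2,000 × 0.21 + 500 × 0.79) / 1,000,000, which is a little under a tenth of a cent.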

Next Steps

All Providers

Browse all supported AI providers

Forward Proxy

Learn how to construct proxy URLs and authenticate requests