# DeepInfra

> DeepInfra hosts open-source models on managed infrastructure with pay-per-token pricing and no minimum commitments.

Lava's AI Gateway exposes 52 DeepInfra models through the Chat Completions endpoint. Requests authenticate with an `Authorization: Bearer <token>` header. See the [DeepInfra API docs](https://deepinfra.com/docs) for provider-specific parameters.

<Info>Supports both **managed** (Lava's API keys) and **unmanaged** (bring your own credentials) modes.</Info>

## Quick Start

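```typescript
// `forwardToken` is assumed to hold your Lava forward token (see the Forward Proxy guide).
// The `u` query parameter is the URL-encoded DeepInfra target URL.
const response = await fetch('https://api.lava.so/v1/forward?u=https%3A%2F%2Fapi.deepinfra.com%2Fv1%2Fopenai%2Fchat%2Fcompletions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    Authorization: `Bearer ${forwardToken}`,
  },
  body: JSON.stringify({
    model: 'zai-org/GLM-5.2',
    messages: [{ role: 'user', content: 'Hello!' }],
  }),
});

// Parse the JSON response body.
const data = await response.json();
```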
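
The `u` query parameter in the Quick Start URL is the percent-encoded target URL. Rather than encoding it by hand, a small helper can build it. This is a minimal sketch; `buildForwardUrl` is a hypothetical name, not part of any Lava SDK.

```typescript
// Hypothetical helper: wraps a provider URL in Lava's forward endpoint
// by percent-encoding it into the `u` query parameter.
function buildForwardUrl(targetUrl: string): string {
  return `https://api.lava.so/v1/forward?u=${encodeURIComponent(targetUrl)}`;
}

const forwardUrl: string = buildForwardUrl(
  'https://api.deepinfra.com/v1/openai/chat/completions',
);
```

The resulting `forwardUrl` is the same string that is hard-coded in the Quick Start request.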

## Chat Completions

**Target URL:** `https://api.deepinfra.com/v1/openai/chat/completions`

| Property         | Value                                    |
| ---------------- | ---------------------------------------- |
| **Content Type** | `application/json`                       |
| **Streaming**    | Yes (set `stream: true` in request body) |
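
With `stream: true`, the response arrives incrementally instead of as one JSON body. The sketch below assumes the stream uses the OpenAI-style server-sent events format (`data: {json}` lines terminated by `data: [DONE]`), which this page does not confirm. `extractDeltas` and `streamChat` are hypothetical helper names, and `forwardToken` is your own token.

```typescript
// Assumption: each event line looks like `data: {json}` and the text lives at
// choices[0].delta.content, as in OpenAI-style streaming responses.
function extractDeltas(sseText: string): string[] {
  const deltas: string[] = [];
  for (const line of sseText.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed.startsWith('data:')) continue;
    const payload = trimmed.slice('data:'.length).trim();
    if (payload === '' || payload === '[DONE]') continue;
    const parsed = JSON.parse(payload);
    const content = parsed?.choices?.[0]?.delta?.content;
    if (typeof content === 'string') deltas.push(content);
  }
  return deltas;
}

// Hypothetical wrapper: sends a streaming request and forwards each text delta.
async function streamChat(
  forwardToken: string,
  prompt: string,
  onText: (text: string) => void,
): Promise<void> {
  const target = 'https://api.deepinfra.com/v1/openai/chat/completions';
  const response = await fetch(
    `https://api.lava.so/v1/forward?u=${encodeURIComponent(target)}`,
    {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${forwardToken}`,
      },
      body: JSON.stringify({
        model: 'zai-org/GLM-5.2',
        messages: [{ role: 'user', content: prompt }],
        stream: true,
      }),
    },
  );
  if (!response.ok || !response.body) {
    throw new Error(`Request failed with status ${response.status}`);
  }

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    // Only parse complete lines; keep any partial line for the next chunk.
    const lastNewline = buffer.lastIndexOf('\n');
    if (lastNewline === -1) continue;
    for (const text of extractDeltas(buffer.slice(0, lastNewline))) onText(text);
    buffer = buffer.slice(lastNewline + 1);
  }
  // Flush whatever remains once the stream has ended.
  for (const text of extractDeltas(buffer)) onText(text);
}
```

A caller could print tokens as they arrive with `await streamChat(forwardToken, 'Hello!', (text) => process.stdout.write(text));`.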

| Model                                               | Input / 1M tokens | Output / 1M tokens |
| --------------------------------------------------- | ----------------- | ------------------ |
| zai-org/GLM-5.2                                     | \$1.40            | \$4.40             |
| deepseek-ai/DeepSeek-R1-Turbo                       | \$1.00            | \$3.00             |
| deepseek-ai/DeepSeek-R1-0528-Turbo                  | \$1.00            | \$3.00             |
| deepseek-ai/DeepSeek-V3-0324-Turbo                  | \$1.00            | \$3.00             |
| meta-llama/Meta-Llama-3.1-405B-Instruct             | \$0.80            | \$0.80             |
| deepseek-ai/DeepSeek-R1                             | \$0.70            | \$2.40             |
| deepseek-ai/DeepSeek-R1-0528                        | \$0.50            | \$2.15             |
| deepseek-ai/DeepSeek-Prover-V2-671B                 | \$0.50            | \$2.18             |
| meta-llama/Llama-4-Maverick-17B-128E-Instruct-Turbo | \$0.50            | \$0.50             |
| microsoft/WizardLM-2-8x22B                          | \$0.48            | \$0.48             |
| Qwen/Qwen3-Coder-480B-A35B-Instruct                 | \$0.40            | \$1.60             |
| Gryphe/MythoMax-L2-13b                              | \$0.40            | \$0.40             |
| Qwen/Qwen2.5-72B-Instruct                           | \$0.36            | \$0.40             |
| meta-llama/Llama-3.2-11B-Vision-Instruct            | \$0.345           | \$0.345            |
| deepseek-ai/DeepSeek-V3                             | \$0.32            | \$0.89             |
| Qwen/Qwen3-Coder-480B-A35B-Instruct-Turbo           | \$0.30            | \$1.00             |
| meta-llama/Meta-Llama-3-70B-Instruct                | \$0.30            | \$0.40             |
| MiniMaxAI/MiniMax-M2.5                              | \$0.30            | \$1.20             |
| deepseek-ai/DeepSeek-V3.1-Terminus                  | \$0.27            | \$0.95             |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-32B            | \$0.27            | \$0.27             |
| MiniMaxAI/MiniMax-M2.1                              | \$0.27            | \$1.15             |
| MiniMaxAI/MiniMax-M2                                | \$0.27            | \$1.15             |
| Qwen/Qwen3-235B-A22B-Thinking-2507                  | \$0.23            | \$2.30             |
| meta-llama/Llama-3.3-70B-Instruct                   | \$0.23            | \$0.40             |
| deepseek-ai/DeepSeek-V3.1                           | \$0.21            | \$0.79             |
| deepseek-ai/DeepSeek-V3-0324                        | \$0.20            | \$0.77             |
| deepseek-ai/DeepSeek-R1-Distill-Llama-70B           | \$0.20            | \$0.60             |
| meta-llama/Llama-Guard-4-12B                        | \$0.18            | \$0.18             |
| meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8   | \$0.15            | \$0.60             |
| Qwen/Qwen3-30B-A3B                                  | \$0.12            | \$0.50             |
| Qwen/Qwen3-14B                                      | \$0.12            | \$0.24             |
| nvidia/Llama-3.3-Nemotron-Super-49B-v1.5            | \$0.10            | \$0.40             |
| meta-llama/Llama-3.3-70B-Instruct-Turbo             | \$0.10            | \$0.32             |
| meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo        | \$0.10            | \$0.28             |
| Qwen/Qwen3-Next-80B-A3B-Instruct                    | \$0.09            | \$1.10             |
| Qwen/Qwen3-235B-A22B-Instruct-2507                  | \$0.09            | \$0.10             |
| meta-llama/Llama-4-Scout-17B-16E-Instruct           | \$0.08            | \$0.30             |
| Qwen/Qwen3-32B                                      | \$0.08            | \$0.28             |
| google/gemma-3-27b-it                               | \$0.08            | \$0.16             |
| microsoft/phi-4                                     | \$0.07            | \$0.14             |
| microsoft/phi-4-reasoning-plus                      | \$0.07            | \$0.35             |
| Qwen/Qwen2.5-Coder-32B-Instruct                     | \$0.06            | \$0.15             |
| google/gemma-3-12b-it                               | \$0.05            | \$0.10             |
| google/gemma-3-4b-it                                | \$0.05            | \$0.10             |
| microsoft/Phi-4-multimodal-instruct                 | \$0.05            | \$0.10             |
| nvidia/NVIDIA-Nemotron-Nano-9B-v2                   | \$0.04            | \$0.16             |
| Qwen/Qwen2.5-7B-Instruct                            | \$0.04            | \$0.10             |
| meta-llama/Meta-Llama-3.1-8B-Instruct               | \$0.03            | \$0.05             |
| meta-llama/Meta-Llama-3-8B-Instruct                 | \$0.03            | \$0.06             |
| mistralai/Mistral-7B-Instruct-v0.3                  | \$0.028           | \$0.054            |
| meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo         | \$0.02            | \$0.03             |
| meta-llama/Llama-3.2-1B-Instruct                    | \$0.005           | \$0.01             |
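
Because prices are quoted per 1M tokens, the cost of a single request is the token count divided by 1,000,000, multiplied by the listed rate, summed over input and output. The sketch below copies three rows from the table; `PRICING` and `estimateCostUsd` are hypothetical names, and the estimate covers only the listed per-token rates.

```typescript
interface Pricing {
  inputPerMillion: number; // USD per 1M input tokens
  outputPerMillion: number; // USD per 1M output tokens
}

// Rates copied from the pricing table above (a subset, for illustration).
const PRICING: Record<string, Pricing> = {
  'zai-org/GLM-5.2': { inputPerMillion: 1.4, outputPerMillion: 4.4 },
  'deepseek-ai/DeepSeek-V3.1': { inputPerMillion: 0.21, outputPerMillion: 0.79 },
  'meta-llama/Meta-Llama-3.1-8B-Instruct': { inputPerMillion: 0.03, outputPerMillion: 0.05 },
};

function estimateCostUsd(model: string, inputTokens: number, outputTokens: number): number {
  const price = PRICING[model];
  if (!price) throw new Error(`No pricing entry for model: ${model}`);
  return (
    (inputTokens / 1_000_000) * price.inputPerMillion +
    (outputTokens / 1_000_000) * price.outputPerMillion
  );
}

// 2,000 input tokens and 500 output tokens on zai-org/GLM-5.2:
// 0.002 * 1.40 + 0.0005 * 4.40 = 0.0028 + 0.0022, roughly 0.005 USD.
const cost: number = estimateCostUsd('zai-org/GLM-5.2', 2000, 500);
```

Floating-point arithmetic means the result may differ from 0.005 in the last few digits, so round before displaying it.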

## Next Steps

<CardGroup cols={2}>
  <Card title="All Providers" icon="grid" href="/gateway/supported-providers">
    Browse all supported AI providers
  </Card>

  <Card title="Forward Proxy" icon="route" href="/gateway/forward-proxy">
    Learn how to construct proxy URLs and authenticate requests
  </Card>
</CardGroup>