# Nebius

> Nebius Token Factory provides cost-effective inference for open-source models from European GPU infrastructure.

Lava's AI Gateway exposes 33 Nebius models through the Chat Completions endpoint. Requests authenticate with an `Authorization: Bearer <token>` header. See the [Nebius API docs](https://docs.tokenfactory.nebius.com/api-reference/introduction) for provider-specific parameters.

<Info>Supports both **managed** (Lava's API keys) and **unmanaged** (bring your own credentials) mode.</Info>

## Quick Start

```typescript
// `forwardToken` is assumed to hold the token you use to authenticate with Lava.
// The `u` query parameter is the URL-encoded Nebius endpoint to forward to.
const response = await fetch('https://api.lava.so/v1/forward?u=https%3A%2F%2Fapi.tokenfactory.nebius.com%2Fv1%2Fchat%2Fcompletions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    Authorization: `Bearer ${forwardToken}`,
  },
  body: JSON.stringify({
    model: 'nvidia/Llama-3_1-Nemotron-Ultra-253B-v1',
    messages: [{ role: 'user', content: 'Hello!' }],
  }),
});
```
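The forward URL in the request above is the Lava forward endpoint with the provider URL percent-encoded into the `u` query parameter. A minimal sketch of building it programmatically follows; `buildForwardUrl` and `LAVA_FORWARD_BASE` are hypothetical names used for illustration, not part of any Lava SDK.

```typescript
// Base forward endpoint, taken from the Quick Start request.
const LAVA_FORWARD_BASE = 'https://api.lava.so/v1/forward';

// Hypothetical helper: wraps a provider URL in the `u` query parameter,
// percent-encoding it so it survives as a single query value.
function buildForwardUrl(targetUrl: string): string {
  return `${LAVA_FORWARD_BASE}?u=${encodeURIComponent(targetUrl)}`;
}

const nebiusChatUrl = buildForwardUrl(
  'https://api.tokenfactory.nebius.com/v1/chat/completions',
);
console.log(nebiusChatUrl);
```

Using `encodeURIComponent` rather than hand-writing the escaped string avoids mistakes when the target URL changes.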

## Chat Completions

**Target URL:** `https://api.tokenfactory.nebius.com/v1/chat/completions`

| Property         | Value                                    |
| ---------------- | ---------------------------------------- |
| **Content Type** | `application/json`                       |
| **Streaming**    | Yes (set `stream: true` in request body) |
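With `stream: true`, the response body arrives incrementally instead of as one JSON document. The sketch below shows one way to consume it. It assumes the stream uses OpenAI-style server-sent events (`data: {json}` lines ending with `data: [DONE]`), which this page does not confirm, so check the Nebius API docs before relying on it. `extractDeltas` and `streamChat` are hypothetical helper names.

```typescript
// Assumption: chunks are OpenAI-style SSE lines such as
//   data: {"choices":[{"delta":{"content":"Hi"}}]}
// terminated by `data: [DONE]`.

// Pulls the text deltas out of one or more complete SSE lines.
function extractDeltas(sseText: string): string[] {
  const deltas: string[] = [];
  for (const line of sseText.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed.startsWith('data:')) continue;
    const payload = trimmed.slice('data:'.length).trim();
    if (payload === '' || payload === '[DONE]') continue;
    const chunk = JSON.parse(payload);
    const text = chunk?.choices?.[0]?.delta?.content;
    if (typeof text === 'string') deltas.push(text);
  }
  return deltas;
}

// Sends a streaming request through the forward URL and returns the full text.
async function streamChat(forwardToken: string, prompt: string): Promise<string> {
  const target = 'https://api.tokenfactory.nebius.com/v1/chat/completions';
  const response = await fetch(
    `https://api.lava.so/v1/forward?u=${encodeURIComponent(target)}`,
    {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${forwardToken}`,
      },
      body: JSON.stringify({
        model: 'nvidia/Llama-3_1-Nemotron-Ultra-253B-v1',
        messages: [{ role: 'user', content: prompt }],
        stream: true,
      }),
    },
  );
  if (!response.ok || !response.body) {
    throw new Error(`Request failed with status ${response.status}`);
  }

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffered = '';
  let fullText = '';
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffered += decoder.decode(value, { stream: true });
    // Only parse up to the last newline; a trailing partial line stays buffered.
    const lastNewline = buffered.lastIndexOf('\n');
    if (lastNewline === -1) continue;
    fullText += extractDeltas(buffered.slice(0, lastNewline)).join('');
    buffered = buffered.slice(lastNewline + 1);
  }
  // Flush whatever remains once the stream has closed.
  fullText += extractDeltas(buffered).join('');
  return fullText;
}
```

Keeping the parsing in a separate pure function makes it easy to test without a network call, and buffering up to the last newline guards against a JSON payload being split across two network reads.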

| Model                                    | Input / 1M tokens | Output / 1M tokens |
| ---------------------------------------- | ----------------- | ------------------ |
| deepseek-ai/DeepSeek-V4-Pro              | \$1.75            | \$3.50             |
| zai-org/GLM-5.1                          | \$1.40            | \$4.40             |
| NousResearch/Hermes-4-405B               | \$1.00            | \$3.00             |
| zai-org/GLM-5                            | \$1.00            | \$3.20             |
| nvidia/Nemotron-3-Ultra-550b-a55b        | \$1.00            | \$3.00             |
| moonshotai/Kimi-K2.6                     | \$0.95            | \$4.00             |
| openbmb/MiniCPM-V-4\_5                   | \$0.658           | \$1.11             |
| nvidia/Llama-3\_1-Nemotron-Ultra-253B-v1 | \$0.60            | \$1.80             |
| Qwen/Qwen3.5-397B-A17B-fast              | \$0.60            | \$3.60             |
| Qwen/Qwen3.5-397B-A17B                   | \$0.60            | \$3.60             |
| Qwen/Qwen3-235B-A22B-Thinking-2507-fast  | \$0.50            | \$2.00             |
| moonshotai/Kimi-K2.5-fast                | \$0.50            | \$2.50             |
| moonshotai/Kimi-K2.5                     | \$0.50            | \$2.50             |
| deepseek-ai/DeepSeek-V3.2-fast           | \$0.40            | \$2.00             |
| MiniMaxAI/MiniMax-M2.5-fast              | \$0.30            | \$1.20             |
| deepseek-ai/DeepSeek-V3.2                | \$0.30            | \$0.45             |
| nvidia/nemotron-3-super-120b-a12b        | \$0.30            | \$0.90             |
| MiniMaxAI/MiniMax-M2.5                   | \$0.30            | \$1.20             |
| Qwen/Qwen2.5-VL-72B-Instruct             | \$0.25            | \$0.75             |
| Qwen/Qwen3-235B-A22B-Instruct-2507       | \$0.20            | \$0.60             |
| PrimeIntellect/INTELLECT-3               | \$0.20            | \$1.10             |
| openai/gpt-oss-120b                      | \$0.15            | \$0.60             |
| Qwen/Qwen3-Next-80B-A3B-Thinking         | \$0.15            | \$1.20             |
| Qwen/Qwen3-Next-80B-A3B-Thinking-fast    | \$0.15            | \$1.20             |
| meta-llama/Llama-3.3-70B-Instruct        | \$0.13            | \$0.40             |
| NousResearch/Hermes-4-70B                | \$0.13            | \$0.40             |
| Qwen/Qwen3-32B                           | \$0.10            | \$0.30             |
| google/gemma-3-27b-it                    | \$0.10            | \$0.30             |
| Qwen/Qwen3-30B-A3B-Instruct-2507         | \$0.10            | \$0.30             |
| nvidia/Cosmos3-Super-Reasoner            | \$0.10            | \$0.30             |
| openai/gpt-oss-120b-fast                 | \$0.10            | \$0.50             |
| nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B    | \$0.06            | \$0.24             |
| nvidia/Nemotron-3-Nano-Omni              | \$0.06            | \$0.24             |
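Because rates are quoted per 1M tokens, the cost of a request is `(inputTokens × inputRate + outputTokens × outputRate) / 1,000,000`. The sketch below applies that arithmetic to a few rows copied from the table. `PRICES` and `estimateCost` are hypothetical names, and the estimate assumes the listed rates are the only charge, ignoring any additional fees that may apply.

```typescript
// Rates in dollars per 1M tokens, copied from the pricing table.
interface ModelPrice {
  inputPerMillion: number;
  outputPerMillion: number;
}

const PRICES: Record<string, ModelPrice> = {
  'deepseek-ai/DeepSeek-V3.2': { inputPerMillion: 0.30, outputPerMillion: 0.45 },
  'openai/gpt-oss-120b': { inputPerMillion: 0.15, outputPerMillion: 0.60 },
  'meta-llama/Llama-3.3-70B-Instruct': { inputPerMillion: 0.13, outputPerMillion: 0.40 },
};

// Hypothetical helper: estimates the token cost of one request in dollars.
function estimateCost(model: string, inputTokens: number, outputTokens: number): number {
  const price = PRICES[model];
  if (!price) {
    throw new Error(`No price listed for model: ${model}`);
  }
  return (
    (inputTokens * price.inputPerMillion + outputTokens * price.outputPerMillion) /
    1_000_000
  );
}

// 200,000 input tokens and 50,000 output tokens on openai/gpt-oss-120b.
const cost = estimateCost('openai/gpt-oss-120b', 200_000, 50_000);
console.log(cost.toFixed(4));
```

For the example call, the input side contributes 200,000 × 0.15 / 1,000,000 = \$0.03 and the output side 50,000 × 0.60 / 1,000,000 = \$0.03, for roughly \$0.06 in total.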

## Next Steps

<CardGroup cols={2}>
  <Card title="All Providers" icon="grid" href="/gateway/supported-providers">
    Browse all supported AI providers
  </Card>

  <Card title="Forward Proxy" icon="route" href="/gateway/forward-proxy">
    Learn how to construct proxy URLs and authenticate requests
  </Card>
</CardGroup>
