Authorization: Bearer. See the Nebius API docs for provider-specific parameters.
Supports both managed (Lava’s API keys) and unmanaged (bring your own credentials) mode.
Quick Start
Chat Completions
Target URL:https://api.tokenfactory.nebius.com/v1/chat/completions
| Content Type | application/json |
| Streaming | Yes (set stream: true in request body) |
| Model | Input / 1M tokens | Output / 1M tokens |
|---|---|---|
| deepseek-ai/DeepSeek-V3-0324-fast | $2.00 | $6.00 |
| deepseek-ai/DeepSeek-R1-fast | $2.00 | $6.00 |
| meta-llama/Meta-Llama-3.1-405B-Instruct | $1.00 | $3.00 |
| NousResearch/Hermes-3-Llama-405B | $1.00 | $3.00 |
| deepseek-ai/DeepSeek-R1-0528 | $0.80 | $2.40 |
| deepseek-ai/DeepSeek-R1 | $0.80 | $2.40 |
| nvidia/Llama-3_1-Nemotron-Ultra-253B-v1 | $0.60 | $1.80 |
| deepseek-ai/DeepSeek-V3-0324 | $0.50 | $1.50 |
| deepseek-ai/DeepSeek-V3 | $0.50 | $1.50 |
| Qwen/QwQ-32B-fast | $0.50 | $1.50 |
| Qwen/Qwen3-30B-A3B-fast | $0.30 | $0.90 |
| meta-llama/Llama-3.3-70B-Instruct-fast | $0.25 | $0.75 |
| Qwen/Qwen2.5-72B-Instruct-fast | $0.25 | $0.75 |
| deepseek-ai/DeepSeek-R1-Distill-Llama-70B | $0.25 | $0.75 |
| Qwen/Qwen3-235B-A22B | $0.20 | $0.60 |
| Qwen/Qwen3-32B-fast | $0.20 | $0.60 |
| Qwen/QwQ-32B | $0.15 | $0.45 |
| meta-llama/Llama-3.3-70B-Instruct | $0.13 | $0.40 |
| meta-llama/Meta-Llama-3.1-70B-Instruct | $0.13 | $0.40 |
| Qwen/Qwen2.5-32B-Instruct-fast | $0.13 | $0.40 |
| Qwen/Qwen2.5-72B-Instruct | $0.13 | $0.40 |
| aaditya/Llama3-OpenBioLLM-70B | $0.13 | $0.40 |
| nvidia/Llama-3_3-Nemotron-Super-49B-v1 | $0.13 | $0.40 |
| Qwen/Qwen3-30B-A3B | $0.10 | $0.30 |
| Qwen/Qwen3-32B | $0.10 | $0.30 |
| Qwen/Qwen2.5-Coder-32B-Instruct-fast | $0.10 | $0.30 |
| microsoft/phi-4 | $0.10 | $0.30 |
| Qwen/Qwen3-14B | $0.08 | $0.24 |
| Qwen/Qwen3-4B-fast | $0.08 | $0.24 |
| mistralai/Devstral-Small-2505 | $0.08 | $0.24 |
| Qwen/Qwen2.5-Coder-32B-Instruct | $0.06 | $0.18 |
| Qwen/Qwen2.5-32B-Instruct | $0.06 | $0.20 |
| mistralai/Mistral-Nemo-Instruct-2407 | $0.04 | $0.12 |
| meta-llama/Meta-Llama-3.1-8B-Instruct-fast | $0.03 | $0.09 |
| Qwen/Qwen2.5-Coder-7B-fast | $0.03 | $0.09 |
| google/gemma-2-9b-it-fast | $0.03 | $0.09 |
| meta-llama/Meta-Llama-3.1-8B-Instruct | $0.02 | $0.06 |
| google/gemma-2-2b-it | $0.02 | $0.06 |
| Qwen/Qwen2.5-Coder-7B | $0.01 | $0.03 |
Next Steps
All Providers
Browse all supported AI providers
Forward Proxy
Learn how to construct proxy URLs and authenticate requests