Inference.net offers 15 models through Lava’s AI Gateway, supporting Chat Completions. Requests authenticate with an Authorization: Bearer header. See the Inference.net API docs for provider-specific parameters.
The provider supports both managed API keys (from Lava) and bring-your-own-key (BYOK) mode.

Quick Start

// forwardToken must hold your Lava forward token before this runs.
// The provider endpoint is URL-encoded into the `u` query parameter.
const response = await fetch('https://api.lavapayments.com/v1/forward?u=https%3A%2F%2Fapi.inference.net%2Fv1%2Fchat%2Fcompletions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    Authorization: `Bearer ${forwardToken}`,
  },
  body: JSON.stringify({
    model: 'deepseek/deepseek-r1/fp-8',
    messages: [{ role: 'user', content: 'Hello!' }],
  }),
});
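The request above hard-codes the encoded target URL. As a minimal sketch, the same call can be wrapped in two small helpers. `buildForwardUrl` and `chat` are hypothetical names, not part of any Lava SDK, and the response parsing assumes the provider returns the usual Chat Completions shape (`choices[0].message.content`), which this page does not spell out.

```javascript
// Base forward endpoint, taken from the Quick Start request.
const LAVA_FORWARD_URL = 'https://api.lavapayments.com/v1/forward';

// Hypothetical helper: wraps a provider URL in the `u` query parameter.
function buildForwardUrl(targetUrl) {
  return `${LAVA_FORWARD_URL}?u=${encodeURIComponent(targetUrl)}`;
}

// Hypothetical helper: sends one user message and returns the reply text.
// Assumes an OpenAI-style Chat Completions response body.
async function chat(forwardToken, model, content) {
  const url = buildForwardUrl('https://api.inference.net/v1/chat/completions');
  const response = await fetch(url, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${forwardToken}`,
    },
    body: JSON.stringify({ model, messages: [{ role: 'user', content }] }),
  });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  const data = await response.json();
  return data.choices?.[0]?.message?.content;
}
```

A call such as `await chat(forwardToken, 'deepseek/deepseek-r1/fp-8', 'Hello!')` would then send the same request as the Quick Start.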

Chat Completions

Target URL: https://api.inference.net/v1/chat/completions
Content Type: application/json
Streaming: Yes (set stream: true in request body)
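With stream: true set, the response body arrives incrementally. The sketch below shows one way to read it, assuming the stream uses OpenAI-style server-sent events (lines of `data: {json}` carrying `choices[0].delta.content`, terminated by `data: [DONE]`). That wire format is an assumption, not something this page states. `createDeltaParser` and `readStream` are hypothetical helper names.

```javascript
// Hypothetical helper: incrementally parses SSE text into content deltas.
// Assumes OpenAI-style events: `data: {json}` lines ending with `data: [DONE]`.
function createDeltaParser() {
  let buffer = '';
  return function feed(chunk) {
    buffer += chunk;
    const lines = buffer.split('\n');
    buffer = lines.pop(); // keep the trailing partial line for the next chunk
    const deltas = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (!trimmed.startsWith('data:')) continue;
      const payload = trimmed.slice(5).trim();
      if (payload === '' || payload === '[DONE]') continue;
      const text = JSON.parse(payload).choices?.[0]?.delta?.content;
      if (text) deltas.push(text);
    }
    return deltas;
  };
}

// Reads a streaming fetch response and calls onText for each content delta.
async function readStream(response, onText) {
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  const feed = createDeltaParser();
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    for (const text of feed(decoder.decode(value, { stream: true }))) {
      onText(text);
    }
  }
}
```

The parser buffers the last incomplete line because a network chunk can end in the middle of an event.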
| Model | Input / 1M tokens | Output / 1M tokens |
| --- | --- | --- |
| deepseek/deepseek-r1-0528/fp-8 | $0.50 | $3.00 |
| deepseek/deepseek-r1/fp-8 | $0.45 | $2.70 |
| deepseek/deepseek-v3-0324/fp-8 | $0.45 | $1.45 |
| meta-llama/llama-3.1-70b-instruct/fp-16 | $0.30 | $0.40 |
| meta-llama/llama-3.3-70b-instruct/fp-16 | $0.30 | $0.40 |
| google/gemma-3-27b-instruct/bf-16 | $0.30 | $0.40 |
| qwen/qwen2.5-7b-instruct/bf-16 | $0.20 | $0.20 |
| deepseek/r1-distill-llama-70b/fp-8 | $0.10 | $0.40 |
| qwen/qwen3-30b-a3b/fp8 | $0.08 | $0.29 |
| meta-llama/llama-3.2-11b-instruct/fp-16 | $0.06 | $0.06 |
| mistralai/mistral-nemo-12b-instruct/fp-8 | $0.04 | $0.10 |
| meta-llama/llama-3.1-8b-instruct/fp-8 | $0.03 | $0.03 |
| meta-llama/llama-3.1-8b-instruct/fp-16 | $0.02 | $0.03 |
| meta-llama/llama-3.2-3b-instruct/fp-16 | $0.02 | $0.02 |
| meta-llama/llama-3.2-1b-instruct/fp-16 | $0.01 | $0.01 |
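Because prices are quoted per 1M tokens, the cost of a single request is (input tokens × input price + output tokens × output price) / 1,000,000. The sketch below applies that arithmetic to three rows copied from the pricing table above. `PRICES` and `estimateCostUsd` are hypothetical names, and the result covers only the listed token prices.

```javascript
// Prices in USD per 1M tokens, copied from the pricing table for three models.
const PRICES = {
  'deepseek/deepseek-r1/fp-8': { input: 0.45, output: 2.70 },
  'meta-llama/llama-3.3-70b-instruct/fp-16': { input: 0.30, output: 0.40 },
  'meta-llama/llama-3.2-1b-instruct/fp-16': { input: 0.01, output: 0.01 },
};

// Hypothetical helper: estimates the token cost of one request in USD.
function estimateCostUsd(model, inputTokens, outputTokens) {
  const price = PRICES[model];
  if (!price) throw new Error(`No price listed for model: ${model}`);
  return (inputTokens * price.input + outputTokens * price.output) / 1_000_000;
}
```

For example, a deepseek/deepseek-r1/fp-8 request with 2,000 input tokens and 500 output tokens comes to (2,000 × 0.45 + 500 × 2.70) / 1,000,000 = $0.00225.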

Next Steps