
Quick Reference

  • Base URL: https://generativelanguage.googleapis.com/v1beta
  • Authentication: Bearer token (OpenAI endpoint) OR x-goog-api-key (native endpoint)
  • API Format: Dual support - OpenAI-compatible AND native Google format
  • Usage Tracking: data.usage (OpenAI endpoint) OR data.usageMetadata (native)
  • BYOK Support: ✓ Supported

Current Models (October 2025)

Gemini 2.5 Series (Latest)

  • gemini-2.5-pro - State-of-the-art thinking model with 1M token context
  • gemini-2.5-flash - Best price/performance balance, 1M context
  • gemini-2.5-flash-lite - Fastest, most cost-efficient option
  • gemini-2.5-flash-image - Image generation support
  • gemini-2.5-pro-preview-tts - Text-to-speech capabilities
  • gemini-2.5-flash-preview-tts - TTS (faster variant)

Gemini 2.0 Series

  • gemini-2.0-flash - Second-generation workhorse model
  • gemini-2.0-flash-lite - Small, efficient workhorse
  • gemini-2.0-flash-preview-image-generation - Image generation

Integration Example

Google Gemini supports two API formats: OpenAI-compatible (recommended for simplicity) and native Google format (for advanced features).

Prerequisites

  1. Get your Lava forward token from the Lava dashboard
  2. Set up environment variables:
.env.local
LAVA_BASE_URL=https://api.lavapayments.com/v1
LAVA_FORWARD_TOKEN=your_forward_token_from_dashboard
  3. Run from a backend server (CORS blocks frontend requests for security)

Option 1: OpenAI-Compatible Format (Recommended)
/**
 * Google Gemini Chat Completion via Lava (OpenAI-Compatible)
 *
 * Benefits: Standard OpenAI format, easier integration
 * Endpoint: /v1beta/openai/chat/completions
 * Auth: Bearer token
 */

require('dotenv').config({ path: '.env.local' });

async function callGeminiViaLava() {
  // 1. Define the Google OpenAI-compatible endpoint
  const PROVIDER_ENDPOINT = 'https://generativelanguage.googleapis.com/v1beta/openai/chat/completions';

  // 2. Build the Lava forward proxy URL
  const url = `${process.env.LAVA_BASE_URL}/forward?u=${PROVIDER_ENDPOINT}`;

  // 3. Set up authentication headers
  const headers = {
    'Content-Type': 'application/json',
    'Authorization': `Bearer ${process.env.LAVA_FORWARD_TOKEN}`
  };

  // 4. Define the request body (standard OpenAI format)
  const requestBody = {
    model: 'gemini-2.5-flash',  // Use any Gemini model
    messages: [
      { role: 'system', content: 'You are a helpful assistant.' },
      { role: 'user', content: 'Explain quantum computing in one sentence.' }
    ],
    temperature: 0.7,
    max_tokens: 1024
  };

  // 5. Make the request
  try {
    const response = await fetch(url, {
      method: 'POST',
      headers: headers,
      body: JSON.stringify(requestBody)
    });

    // 6. Parse the response
    const data = await response.json();

    // 7. Extract usage data (standard OpenAI format)
    const usage = data.usage;
    console.log('\nUsage Tracking:');
    console.log(`  Prompt tokens: ${usage.prompt_tokens}`);
    console.log(`  Completion tokens: ${usage.completion_tokens}`);
    console.log(`  Total tokens: ${usage.total_tokens}`);

    // 8. Extract request ID (from response header)
    const requestId = response.headers.get('x-lava-request-id');
    console.log(`\nLava Request ID: ${requestId}`);
    console.log('  (Use this ID to find the request in your dashboard)');

    // 9. Display the AI response
    console.log('\nAI Response:');
    console.log(data.choices[0].message.content);

    return data;
  } catch (error) {
    console.error('Error calling Gemini via Lava:', error.message);
    throw error;
  }
}

// Run the example
callGeminiViaLava();
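
The example above parses the JSON body without first checking the HTTP status, so a billing or auth failure would surface as a confusing `undefined` usage object. A minimal guard (a sketch, not part of any Lava SDK) might look like:

```javascript
/**
 * Throw a descriptive error before JSON parsing when the HTTP
 * status indicates failure, so usage extraction never runs on
 * an error payload. Error bodies may not be JSON, so read text.
 */
async function assertOk(response) {
  if (!response.ok) {
    const body = await response.text();
    throw new Error(`Request failed (${response.status}): ${body}`);
  }
  return response;
}
```

Usage: call `await assertOk(response);` right after the `fetch` and before `response.json()`.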

Option 2: Native Google Format

/**
 * Google Gemini Native API via Lava
 *
 * Benefits: Access to advanced Google-specific features
 * Endpoint: /v1beta/models/{model}:generateContent
 * Auth: x-goog-api-key header
 */

require('dotenv').config({ path: '.env.local' });

async function callGeminiNativeViaLava() {
  // 1. Define the Google native endpoint (model in URL path)
  const model = 'gemini-2.5-flash';
  const PROVIDER_ENDPOINT = `https://generativelanguage.googleapis.com/v1beta/models/${model}:generateContent`;

  // 2. Build the Lava forward proxy URL
  const url = `${process.env.LAVA_BASE_URL}/forward?u=${PROVIDER_ENDPOINT}`;

  // 3. Set up authentication headers (Google-specific)
  const headers = {
    'Content-Type': 'application/json',
    'x-goog-api-key': process.env.LAVA_FORWARD_TOKEN  // Note: x-goog-api-key header
  };

  // 4. Define the request body (Google native format)
  const requestBody = {
    contents: [{
      role: 'user',
      parts: [{ text: 'Explain quantum computing in one sentence.' }]
    }],
    generationConfig: {
      temperature: 0.7,
      maxOutputTokens: 1024
    }
  };

  // 5. Make the request
  try {
    const response = await fetch(url, {
      method: 'POST',
      headers: headers,
      body: JSON.stringify(requestBody)
    });

    // 6. Parse the response
    const data = await response.json();

    // 7. Extract usage data (Google native format)
    const usage = data.usageMetadata;
    console.log('\nUsage Tracking:');
    console.log(`  Prompt tokens: ${usage.promptTokenCount}`);
    console.log(`  Candidates tokens: ${usage.candidatesTokenCount}`);
    console.log(`  Total tokens: ${usage.totalTokenCount}`);

    // 8. Extract request ID (from response header)
    const requestId = response.headers.get('x-lava-request-id');
    console.log(`\nLava Request ID: ${requestId}`);

    // 9. Display the AI response (Google native structure)
    console.log('\nAI Response:');
    console.log(data.candidates[0].content.parts[0].text);

    return data;
  } catch (error) {
    console.error('Error calling Gemini native API via Lava:', error.message);
    throw error;
  }
}

// Run the example
callGeminiNativeViaLava();
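
The native example reads only `parts[0].text`, but a candidate can carry several parts (and, with thinking mode, parts flagged as thoughts). A hedged helper that joins the visible text parts — the `thought` flag is an assumption based on the thinking-mode response shape — could be:

```javascript
/**
 * Extract the full visible text from a native generateContent
 * response: join all text parts of the first candidate, skipping
 * any part flagged as a thought. Returns '' if the shape is missing.
 */
function extractText(data) {
  const parts = data?.candidates?.[0]?.content?.parts ?? [];
  return parts
    .filter((p) => typeof p.text === 'string' && !p.thought)
    .map((p) => p.text)
    .join('');
}
```
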

Request/Response Formats

OpenAI-Compatible Format

Request:
{
  "model": "gemini-2.5-flash",
  "messages": [
    { "role": "system", "content": "You are helpful." },
    { "role": "user", "content": "Hello!" }
  ],
  "temperature": 0.7,
  "max_tokens": 1024
}
Response:
{
  "id": "chatcmpl-xxx",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "gemini-2.5-flash",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! How can I help you?"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 8,
    "total_tokens": 18
  }
}

Native Google Format

Request:
{
  "contents": [{
    "role": "user",
    "parts": [{ "text": "Hello, Gemini!" }]
  }],
  "generationConfig": {
    "temperature": 0.7,
    "maxOutputTokens": 1024
  }
}
Response:
{
  "candidates": [{
    "content": {
      "parts": [{ "text": "Hello! How can I help you?" }],
      "role": "model"
    },
    "finishReason": "STOP"
  }],
  "usageMetadata": {
    "promptTokenCount": 5,
    "candidatesTokenCount": 9,
    "totalTokenCount": 14,
    "thoughtsTokenCount": 0
  }
}
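
If you need to move between the two formats, the mapping is mechanical. A sketch of a converter from OpenAI-style messages to the native request shape shown above — with two mapping assumptions: `assistant` becomes the native role `model`, and a `system` message becomes the top-level `systemInstruction` field:

```javascript
/**
 * Convert an OpenAI-style messages array into a native Gemini
 * request body. System messages map to systemInstruction;
 * assistant messages map to the native role 'model'.
 */
function toNativeRequest(messages, generationConfig = {}) {
  const body = { contents: [], generationConfig };
  for (const m of messages) {
    if (m.role === 'system') {
      body.systemInstruction = { parts: [{ text: m.content }] };
    } else {
      body.contents.push({
        role: m.role === 'assistant' ? 'model' : 'user',
        parts: [{ text: m.content }]
      });
    }
  }
  return body;
}
```
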

Key Features

Multi-Modal Capabilities

  • Text, images, video, audio: Send multiple content types in single request
  • 1M token context: Process extremely long documents (Gemini 2.5 Pro/Flash)
  • Vision understanding: Analyze images and diagrams
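
In the native format, mixed content types are expressed as multiple `parts` in one message. A sketch of a text-plus-image request body — the prompt and mime type here are placeholders, and `inlineData` expects base64-encoded bytes:

```javascript
/**
 * Build a multi-modal native request body: one text part plus
 * one inline base64-encoded image in the same user message.
 */
function imagePromptBody(promptText, base64Png) {
  return {
    contents: [{
      role: 'user',
      parts: [
        { text: promptText },
        { inlineData: { mimeType: 'image/png', data: base64Png } }
      ]
    }]
  };
}
```
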

Advanced Features

  • Thinking mode: Extended reasoning for complex tasks
  • Code execution: Built-in code interpreter for mathematical computations
  • Google Search grounding: Real-time web search integration
  • Google Maps grounding: Location-based context and information
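
These features are enabled through extra fields on the native request body. A sketch enabling two of them — the field names (`googleSearch`, `thinkingConfig`, `thinkingBudget`) follow Google's v1beta REST shapes but should be verified against the current Gemini API reference before use:

```javascript
/**
 * Native request body with Google Search grounding enabled and
 * a token budget for thinking mode (field names per v1beta REST;
 * verify against the current Gemini API reference).
 */
const advancedBody = {
  contents: [{
    role: 'user',
    parts: [{ text: 'What changed in the news today?' }]
  }],
  tools: [{ googleSearch: {} }],             // Google Search grounding
  generationConfig: {
    thinkingConfig: { thinkingBudget: 1024 } // thinking-mode token budget
  }
};
```
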

Context and Limits

  • Input: Up to 1,048,576 tokens (1M context window)
  • Output: 8,192 to 65,536 tokens (model-dependent)

Usage Tracking

OpenAI-Compatible Endpoint

Usage data is available in the response body at data.usage:
{
  "usage": {
    "prompt_tokens": 100,
    "completion_tokens": 50,
    "total_tokens": 150
  }
}
Access in code:
const usage = data.usage;
console.log(`Total tokens: ${usage.total_tokens}`);

Native Endpoint

Usage data is available at data.usageMetadata:
{
  "usageMetadata": {
    "promptTokenCount": 100,
    "candidatesTokenCount": 50,
    "totalTokenCount": 150,
    "thoughtsTokenCount": 0  // Reasoning tokens (if using thinking mode)
  }
}
Access in code:
const usage = data.usageMetadata;
console.log(`Total tokens: ${usage.totalTokenCount}`);
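
If your application hits both endpoints, it can be convenient to normalize the two usage shapes into one. The field names below match the two response formats documented above; the output shape is this sketch's own convention:

```javascript
/**
 * Normalize usage from either endpoint into a single shape:
 * { input, output, total }. Reads data.usage (OpenAI-compatible)
 * when present, otherwise data.usageMetadata (native).
 */
function normalizeUsage(data) {
  if (data.usage) {                    // OpenAI-compatible endpoint
    const u = data.usage;
    return {
      input: u.prompt_tokens,
      output: u.completion_tokens,
      total: u.total_tokens
    };
  }
  const u = data.usageMetadata ?? {};  // native endpoint
  return {
    input: u.promptTokenCount ?? 0,
    output: u.candidatesTokenCount ?? 0,
    total: u.totalTokenCount ?? 0
  };
}
```
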

BYOK Support

Google Gemini fully supports Bring Your Own Key (BYOK) mode. Your forward token format:
${LAVA_SECRET_KEY}.${CONNECTION_SECRET}.${PRODUCT_SECRET}.${YOUR_GOOGLE_API_KEY}
Note: When using BYOK, Lava meters usage but does not charge your Lava wallet. Costs are billed directly to your Google Cloud account. For detailed BYOK setup, see the BYOK guide.
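
Assembling the dot-joined token can be sketched as below. The environment variable names are placeholders (your own naming may differ); the four-part format is the one documented above:

```javascript
/**
 * Build a BYOK forward token from its four dot-joined components.
 * Environment variable names here are illustrative placeholders.
 */
function buildByokToken(env = process.env) {
  const parts = [
    env.LAVA_SECRET_KEY,
    env.CONNECTION_SECRET,
    env.PRODUCT_SECRET,
    env.GOOGLE_API_KEY   // your own Google API key (BYOK)
  ];
  if (parts.some((p) => !p)) throw new Error('Missing BYOK token component');
  return parts.join('.');
}
```
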

Official Documentation