
Remote Providers

You don’t need a local GPU to participate. Any OpenAI-compatible cloud API can be used as your LLM endpoint — OpenRouter, Together AI, Groq, Fireworks, or even the OpenAI API itself.

When you join a room with a remote provider, the hub forwards requests from other participants through your cloud endpoint. Two protocols are supported:

| Protocol | Endpoint | Best For |
| --- | --- | --- |
| Responses API | `/v1/responses` | OpenAI, providers that support the Responses API |
| Chat Completions | `/v1/chat/completions` | Most providers (Ollama, LM Studio, OpenRouter, etc.) |

The hub auto-detects which protocol(s) your endpoint supports when you join.
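
If you want to see what that detection will find, you can exercise the same routes by hand. A rough sketch against OpenRouter (the hub's actual probing logic may differ):

```bash
# Manually probing the routes the hub checks on join (sketch).
export OPENROUTER_AUTH="Bearer sk-or-..."

# List the models the endpoint exposes
curl -s https://openrouter.ai/api/v1/models \
  -H "Authorization: ${OPENROUTER_AUTH}"

# Minimal Chat Completions request to confirm protocol support
curl -s https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: ${OPENROUTER_AUTH}" \
  -H "Content-Type: application/json" \
  -d '{"model": "meta-llama/llama-3.1-8b-instruct:free",
       "messages": [{"role": "user", "content": "ping"}]}'
```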

To join through a cloud provider, pass your API key as an auth header. For automation, prefer `--header-env` so the secret doesn't end up in your shell history:

```bash
export OPENROUTER_AUTH="Bearer sk-or-..."
gambi join --code ABC123 \
  --endpoint https://openrouter.ai/api \
  --model meta-llama/llama-3.1-8b-instruct:free \
  --nickname my-openrouter \
  --header-env Authorization=OPENROUTER_AUTH
```

You can also send extra provider-specific headers:

```bash
gambi join --code ABC123 \
  --endpoint https://openrouter.ai/api \
  --model meta-llama/llama-3.1-8b-instruct:free \
  --nickname my-openrouter \
  --header-env Authorization=OPENROUTER_AUTH \
  --header "HTTP-Referer=https://my-app.example"
```

If you prefer interactive mode, `gambi join` prompts for auth headers after you choose the endpoint; header values are collected via hidden prompts, so secrets are never echoed to the terminal.

The TUI keeps remote auth secure by referencing environment variables instead of storing raw secret values in the UI state.

  1. Export the header value you want to use:

```bash
export OPENAI_AUTH="Bearer sk-..."
```

  2. Open the TUI join flow.
  3. Expand Advanced options.
  4. Add an auth header entry such as:
    • Header name: `Authorization`
    • Env var: `OPENAI_AUTH`

The TUI resolves the environment variable locally before probing models and joining the room. The raw secret is never persisted to Gambi config files.
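
Conceptually, that local resolution plus probe is roughly equivalent to the following shell sketch (the TUI does this in-process, not via curl):

```bash
# Sketch: resolve the env var locally and use it to probe the
# endpoint's model list before joining.
curl -s https://api.openai.com/v1/models \
  -H "Authorization: ${OPENAI_AUTH}"
```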

`POST /rooms/:code/join` accepts an optional `authHeaders` object. Those headers are stored only in hub memory and are used for:

  • endpoint probing (`/v1/models`, `/v1/responses`, `/v1/chat/completions`)
  • proxied inference requests
  • Responses lifecycle routes

They are not returned by `GET /rooms/:code/participants`, `GET /rooms/:code/v1/models`, or join responses.
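
For illustration, a join request with auth headers might look like this; `authHeaders` is the documented field, while the other body fields are assumptions inferred from the CLI flags above:

```bash
# Hypothetical join request; replace hub.example with your hub.
curl -s -X POST https://hub.example/rooms/ABC123/join \
  -H "Content-Type: application/json" \
  -d '{
    "nickname": "my-openrouter",
    "endpoint": "https://openrouter.ai/api",
    "model": "meta-llama/llama-3.1-8b-instruct:free",
    "authHeaders": {"Authorization": "Bearer sk-or-..."}
  }'
```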

| Provider | Base URL | Free Models? |
| --- | --- | --- |
| OpenRouter | https://openrouter.ai/api | Yes (`:free` suffix) |
| Together AI | https://api.together.xyz | Free tier |
| Groq | https://api.groq.com/openai | Free tier |
| Fireworks | https://api.fireworks.ai/inference | Free tier |
| OpenAI | https://api.openai.com | No |
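
Any of these can be swapped into the join command from earlier. For example, with Groq (the model name is illustrative; check your provider's model list):

```bash
export GROQ_AUTH="Bearer gsk_..."
gambi join --code ABC123 \
  --endpoint https://api.groq.com/openai \
  --model llama-3.1-8b-instant \
  --nickname my-groq \
  --header-env Authorization=GROQ_AUTH
```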

When you join with a cloud provider:

  • Your API key is used for every request routed to you
  • You pay for the tokens consumed
  • Other participants don’t see your API key — the hub keeps headers only in memory and does not expose them in participant listings

Choose a model you’re comfortable paying for, or use free-tier models for experimentation.

A few notes on where credentials travel:

  • Gambi sends `authHeaders` from the joining client to the hub, then stores them in memory for as long as that participant is registered.
  • In trusted local networks, that's usually enough.
  • If your hub is reachable outside your LAN, put it behind HTTPS or a reverse proxy before sending provider credentials through it (a sketch follows).
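
If you need that TLS layer quickly, Caddy's built-in reverse proxy is one option; this sketch assumes the hub listens on localhost:3000 (adjust to your deployment):

```bash
# Terminate HTTPS in front of the hub (assumed port).
caddy reverse-proxy --from hub.example.com --to localhost:3000
```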
As for choosing between the two protocols:

  • Responses API: newer, simpler input format (`"input": "text"`), supports multi-turn via `previous_response_id`. Use it if your provider supports it.
  • Chat Completions: widely supported, with the familiar `messages` array format. Works with virtually every provider.
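
Side by side, the minimal request bodies look like this (sketches against the OpenAI API; other providers accept the same shapes at their own base URLs):

```bash
# Responses API: single "input" field
curl -s https://api.openai.com/v1/responses \
  -H "Authorization: ${OPENAI_AUTH}" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o-mini", "input": "Hello"}'

# Chat Completions: "messages" array
curl -s https://api.openai.com/v1/chat/completions \
  -H "Authorization: ${OPENAI_AUTH}" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o-mini",
       "messages": [{"role": "user", "content": "Hello"}]}'
```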

The hub handles both transparently. When a request comes in via one protocol and the participant only supports the other, the hub adapts automatically.