RelayFreeLLM: Unified Open-Source Gateway for Free AI Model Providers with Automatic Failover
March 31, 2026
RelayFreeLLM is an open-source gateway that aggregates free AI model providers—Gemini, Groq, Mistral, Cerebras, Ollama—into a single OpenAI-compatible API to maximize free inference and provide automatic failover.
Usage examples demonstrate both automatic routing via a meta-model and direct routing to a specific provider or model, with sample commands for Python, curl, and LangChain integration.
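The two routing modes can be sketched as OpenAI-style request payloads. This is a minimal illustration only: the meta-model name `auto` and the `provider/model` addressing syntax are assumptions, not confirmed identifiers from the project.

```python
import json

# Local RelayFreeLLM endpoint (OpenAI-compatible chat completions route).
GATEWAY_URL = "http://localhost:8000/v1/chat/completions"

def chat_payload(model: str, prompt: str) -> dict:
    """Build an OpenAI-compatible chat completion payload for the gateway."""
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}

# Automatic routing: a meta-model name lets the gateway pick the provider.
auto_request = chat_payload("auto", "Summarize this text.")  # "auto" is an assumed name

# Direct routing: address one specific provider/model explicitly.
direct_request = chat_payload("groq/llama-3.1-8b-instant", "Summarize this text.")

# Either payload would be POSTed to GATEWAY_URL with any HTTP client
# (httpx, curl, or an OpenAI SDK pointed at the gateway).
body = json.dumps(auto_request)
```

The same payload shape works from curl or LangChain, since all three speak the OpenAI wire format.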
Core components include a router, a dispatcher with retries, a selector for quota tracking, a provider registry, and provider-specific API clients such as gemini_client, groq_client, and mistral_client.
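How the selector and registry might cooperate can be sketched as follows; the field names and per-provider limits are hypothetical, chosen only to show quota tracking, not taken from the project's source.

```python
from dataclasses import dataclass

@dataclass
class ProviderState:
    """Tracks one provider's free-tier quota (hypothetical fields for illustration)."""
    name: str
    daily_limit: int
    used: int = 0

class Selector:
    """Pick the first registered provider that still has free-tier quota left."""

    def __init__(self, providers):
        self.providers = providers  # acts as a minimal provider registry

    def pick(self) -> ProviderState:
        for p in self.providers:
            if p.used < p.daily_limit:
                return p
        raise RuntimeError("all free-tier quotas exhausted")

    def record(self, provider: ProviderState) -> None:
        provider.used += 1  # count one request against the provider's quota

registry = [ProviderState("gemini", daily_limit=2), ProviderState("groq", daily_limit=3)]
selector = Selector(registry)
first = selector.pick()                     # gemini, while its quota remains
selector.record(first); selector.record(first)
second = selector.pick()                    # gemini exhausted, so groq is chosen
```

In the real gateway the dispatcher would sit above this, retrying the selector's choice and falling back on errors.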
Quick-start steps guide users to clone the repository, install dependencies, add free API keys for Gemini, Groq, Mistral, Cerebras, and Ollama, verify connectivity, start the server, and access via standard OpenAI SDKs or curl.
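The quick-start steps above might look like the following shell session; the repository URL, environment-variable names, and server entry point are placeholders and assumptions, not confirmed from the project.

```shell
# Clone and install (repository URL left as a placeholder).
git clone <repo-url> relayfreellm && cd relayfreellm
pip install -r requirements.txt

# Add free API keys (exact variable names may differ; Ollama runs locally, no key needed).
export GEMINI_API_KEY=...
export GROQ_API_KEY=...
export MISTRAL_API_KEY=...
export CEREBRAS_API_KEY=...

# Start the server (entry-point module is an assumption), then verify connectivity.
uvicorn main:app --port 8000
curl http://localhost:8000/v1/models
```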
Key benefits include automatic failover between providers, circuit breakers to quarantine faulty providers, quota management, real-time SSE streaming, and the option to mix cloud free tiers with local Ollama models for higher throughput.
Automatic failover behavior routes requests to other providers when one hits a rate limit, until a successful response is obtained (e.g., shifting from a rate-limited provider to Groq, Gemini, or Mistral).
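The failover loop described above can be simulated with fake provider clients; the exception name and provider ordering are assumptions for illustration.

```python
class RateLimited(Exception):
    """Raised by a provider client when its free-tier rate limit is hit."""

def fake_client(name: str, rate_limited: bool):
    """Build a stand-in provider client that either answers or raises."""
    def call(prompt: str) -> str:
        if rate_limited:
            raise RateLimited(name)
        return f"{name}: ok"
    return call

def dispatch(prompt: str, providers) -> str:
    """Try providers in order, failing over on rate limits until one answers."""
    for name, call in providers:
        try:
            return call(prompt)
        except RateLimited:
            continue  # quota hit: fall through to the next provider
    raise RuntimeError("every provider is rate-limited")

providers = [
    ("cerebras", fake_client("cerebras", rate_limited=True)),  # simulated 429
    ("groq", fake_client("groq", rate_limited=False)),
    ("gemini", fake_client("gemini", rate_limited=False)),
]
answer = dispatch("hello", providers)   # → "groq: ok"
```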
Integration requires no code changes; existing OpenAI SDK users can point to RelayFreeLLM’s endpoint at http://localhost:8000/v1 when running locally.
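Because the API is OpenAI-compatible, pointing a client at the local endpoint is the only change needed. A minimal standard-library sketch of the request an OpenAI SDK would send (the bearer token is arbitrary, since local use needs no real key):

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # RelayFreeLLM running locally

def chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build the request an OpenAI SDK would send, aimed at the gateway instead."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer unused",  # no real key needed locally
        },
    )

# Send with urllib.request.urlopen(req) once the server is running.
req = chat_request("auto", "hi")
```

With the official OpenAI SDK the equivalent change is setting `base_url` to `http://localhost:8000/v1`; no other code changes.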
The project is built with FastAPI, Pydantic, and httpx, emphasizing self-hosted, free-tier AI access with optional integration of local Ollama models and plans to support more providers in the future.
The system offers an OpenAI-compatible API with routes for chat completions, models, and usage, plus routing logic (intent-based selection and automatic failover) to pick the best available provider or model.
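Intent-based selection can be sketched as a preference table consulted against the set of currently healthy providers; the intent categories and orderings below are hypothetical, chosen only to show the mechanism.

```python
# Hypothetical intent → provider preference table; names are illustrative only.
ROUTING_TABLE = {
    "fast": ["groq", "cerebras", "gemini"],
    "long-context": ["gemini", "mistral", "groq"],
    "local": ["ollama"],
}

def select_route(intent: str, healthy: set) -> str:
    """Return the first healthy provider in the intent's preference list."""
    for provider in ROUTING_TABLE.get(intent, ROUTING_TABLE["fast"]):
        if provider in healthy:
            return provider          # best available provider for this intent
    raise RuntimeError(f"no healthy provider for intent {intent!r}")

# With groq down, a 'fast' request fails over to cerebras automatically.
choice = select_route("fast", healthy={"cerebras", "gemini", "mistral"})
```

Combining this table with the circuit-breaker's health set gives both halves of the routing logic: intent picks the preference order, failover walks down it.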
A detailed architecture diagram outlines the gateway's components, and an accompanying directory structure highlights the core source files and tests.
