
LLM Interactive Proxy


Turn any compatible AI client into a safer, smarter, multi-provider agent platform.

LLM Interactive Proxy is a universal translation, routing, and control layer for modern AI clients. Point OpenAI-compatible apps, Anthropic tools, Gemini integrations, and agentic coding workflows at one local or shared endpoint, then gain routing, failover, built-in security, automated steering, session intelligence, observability, and cross-provider flexibility without rewriting your client.

If your current setup feels fragile, expensive, opaque, or locked to one vendor, this project is designed to change that.

It is at once a compatibility layer, a security layer, a traffic control plane, a debugging surface, and a workflow upgrade for serious agentic use.

  • Keep your existing clients - Change the endpoint, not the app.
  • Mix providers freely - Route across APIs, plans, OAuth accounts, model families, and protocol styles.
  • Control agents in production - Add guardrails, rewrites, diagnostics, and policy at the proxy layer.
  • Debug with evidence - Inspect exact wire traffic instead of guessing from symptoms.
Without the proxy | With LLM Interactive Proxy
--- | ---
Each client is tied to one provider stack | One endpoint can serve many clients and many backend families
Provider switching often means code or config churn | Change routing instead of rewriting integrations
Agent safety is scattered across tools | Centralize redaction, tool controls, sandboxing, and command protection
Debugging depends on incomplete logs | Inspect exact wire traffic with captures and diagnostics
Token costs grow with long sessions | Use intelligent context compression and smarter routing to reduce spend
Protocol mismatch blocks experimentation | Use cross-protocol conversion to bridge Anthropic, OpenAI, Gemini, and more

At a glance

Beyond basic forwarding, the proxy adds cross-protocol translation, tool safety, routing and failover, session-oriented features (including B2BUA-style handling), byte-precise CBOR captures, and usage tracking. Longer narratives, use-case lists, and feature tours live in the User Guide.

Quick Start

1. Clone and install

git clone https://github.com/matdev83/llm-interactive-proxy.git
cd llm-interactive-proxy
python -m venv .venv

# Windows
.venv\Scripts\activate

# Linux/macOS
source .venv/bin/activate

python -m pip install -e .[dev]

If you want the optional OAuth-oriented connectors, install the oauth extra:

python -m pip install -e .[dev,oauth]

2. Export at least one provider credential

# Example: OpenAI
export OPENAI_API_KEY="your-key-here"
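
On Windows (PowerShell), set the same variable with:

$env:OPENAI_API_KEY = "your-key-here"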

3. Start the proxy

python -m src.core.cli --default-backend openai:gpt-4o

The proxy listens on http://localhost:8000 by default.
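
To confirm the proxy is reachable, you can list the models it exposes. A minimal check in Python, assuming Single User Mode where no client key is enforced on localhost:

import json
import urllib.request

# A 200 response with a model list confirms the proxy is up.
with urllib.request.urlopen("http://localhost:8000/v1/models") as resp:
    print(json.dumps(json.load(resp), indent=2))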

4. Point your client at the proxy instead of the vendor

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="dummy-key",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)

print(response.choices[0].message.content)
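
Streaming goes through the same endpoint. A minimal sketch, assuming the proxy relays the backend's streamed chunks as OpenAI-compatible endpoints normally do:

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Tell me a short story."}],
    stream=True,  # receive tokens as they are generated
)
for chunk in stream:
    # Each chunk carries an incremental delta; skip empty keep-alive chunks.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)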

See the full Quick Start Guide for additional setup, auth, and backend examples.

Supported Frontend Interfaces

The proxy exposes standard API surfaces, so existing clients can often connect with little or no code change.

  • OpenAI Chat Completions - /v1/chat/completions
  • OpenAI Responses - /v1/responses
  • OpenAI Models - /v1/models
  • Anthropic Messages - /anthropic/v1/messages
  • Dedicated Anthropic server - http://host:8001/v1/messages
  • Google Gemini v1beta - /v1beta/models and :generateContent
  • Diagnostics endpoint - /v1/diagnostics
  • Backend reactivation endpoint - /v1/diagnostics/backends/{backend_instance}/reactivate

See Frontend API documentation for protocol details and compatibility notes.
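
For example, an Anthropic SDK client can talk to the proxy even when the configured backend is OpenAI, because the proxy translates between protocols. A sketch, assuming the SDK's base_url option and the /anthropic prefix listed above (the SDK appends /v1/messages itself):

from anthropic import Anthropic

client = Anthropic(
    base_url="http://localhost:8000/anthropic",  # resolves to /anthropic/v1/messages
    api_key="dummy-key",
)

message = client.messages.create(
    model="claude-3-5-sonnet-latest",  # any model name your routing resolves
    max_tokens=256,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(message.content[0].text)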

Supported Backends

The backend catalog keeps growing. Documented backends include OpenAI, Anthropic, Gemini, and OpenRouter, among others.

See the full Backends Overview for configuration and provider-specific notes.

Routing Selector Semantics

  • backend:model selects an explicit backend family.
  • backend-instance:model such as openai.1:gpt-4o targets a concrete backend instance.
  • model and vendor/model are model-only selectors.
  • vendor/model:variant remains model-only unless : appears before the first /.
  • URI-style parameters in selectors such as model?temperature=0.5 are parsed and propagated through routing metadata.
  • Explicit-backend configuration and command surfaces such as --static-route, replacement targets, and one-off routing require strict backend:model format.
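
Applied mechanically, the rules above amount to: a colon names a backend only when it appears before the first slash, and anything after ? becomes routing metadata. A minimal sketch of that logic (illustrative only, not the proxy's actual parser):

from urllib.parse import parse_qsl

def parse_selector(selector: str) -> dict:
    # Split off URI-style parameters first, e.g. model?temperature=0.5.
    base, _, query = selector.partition("?")
    params = dict(parse_qsl(query)) if query else {}

    colon, slash = base.find(":"), base.find("/")
    # A ':' names a backend only when it comes before the first '/'.
    if colon != -1 and (slash == -1 or colon < slash):
        backend, model = base.split(":", 1)
        return {"backend": backend, "model": model, "params": params}
    # Otherwise the selector is model-only (model, vendor/model, vendor/model:variant).
    return {"backend": None, "model": base, "params": params}

print(parse_selector("openai.1:gpt-4o"))
# -> {'backend': 'openai.1', 'model': 'gpt-4o', 'params': {}}
print(parse_selector("vendor/model:variant?temperature=0.5"))
# -> {'backend': None, 'model': 'vendor/model:variant', 'params': {'temperature': '0.5'}}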

Access Modes

The proxy supports two operational modes with different security assumptions:

  • Single User Mode - Default local-development mode with localhost-first behavior and support for OAuth connectors.
  • Multi User Mode - Shared or production mode with stronger authentication expectations and tighter connector rules.

Quick examples:

# Single User Mode
python -m src.core.cli

# Multi User Mode
python -m src.core.cli --multi-user-mode --host=0.0.0.0 --api-keys key1,key2
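
From a client's point of view, Multi User Mode just means presenting one of the configured keys. A sketch with the OpenAI SDK, assuming the proxy reads the key from the standard Authorization header:

from openai import OpenAI

client = OpenAI(
    base_url="http://your-proxy-host:8000/v1",  # hypothetical shared deployment
    api_key="key1",  # must match one of the values passed to --api-keys
)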

See Access Modes for the security model and deployment guidance.

Architecture

graph TD
    subgraph "Clients"
        A[OpenAI Client]
        B[OpenAI Responses Client]
        C[Anthropic Client]
        D[Gemini Client]
        E[Any LLM App or Agent]
    end

    subgraph "LLM Interactive Proxy"
        FE[Frontend APIs]
        Core[Routing Translation Safety Observability]
        BE[Backend Connectors]
        FE --> Core --> BE
    end

    subgraph "Providers"
        P1[OpenAI]
        P2[Anthropic]
        P3[Gemini]
        P4[OpenRouter]
        P5[Other Backends]
    end

    A --> FE
    B --> FE
    C --> FE
    D --> FE
    E --> FE
    BE --> P1
    BE --> P2
    BE --> P3
    BE --> P4
    BE --> P5

The proxy sits between the client and the provider, which is exactly why it can translate protocols, enforce policy, capture traffic, and route requests without forcing your app to change its calling pattern.

Development

# Run the test suite
python -m pytest

# Lint and auto-fix
python -m ruff check --fix .

# Format
python -m black .

See the Development Guide for architecture, contribution workflow, and extra dev scripts.

Support

Questions, bug reports, and feature requests are welcome through GitHub Issues and Discussions.

License

This project is licensed under the GNU AGPL v3.0 or later.