feat: auto-launching local proxy server for eversale LLM routing#219
Draft
codegen-sh[bot] wants to merge 2 commits intodevelopfrom
Draft
feat: auto-launching local proxy server for eversale LLM routing#219codegen-sh[bot] wants to merge 2 commits intodevelopfrom
codegen-sh[bot] wants to merge 2 commits intodevelopfrom
Conversation
Modified 7 files from eversale-cli-2.1.216 to run locally with custom API key: 1. engine/config/config.yaml - mode=local, all models=glm-5, endpoints=Z.AI 2. engine/agent/gpu_llm_client.py - URL=ANTHROPIC_BASE_URL, auth=ANTHROPIC_API_KEY 3. engine/agent/llm_fallback_chain.py - defaults to env vars 4. engine/agent/kimi_k2_client.py - added anthropic provider (auto-detect first) 5. bin/eversale.js - license check bypassed for local dev 6. engine/agent/license_validator.py - validate functions return True 7. engine/agent/config_loader.py - ANTHROPIC_BASE_URL in env chain Verification: 30/30 tests pass including live API call to Z.AI with GLM-5.
- proxy_server.py: aiohttp gateway with 4 backend adapters (Anthropic, OpenAI, Ollama, Custom/Z.AI) Endpoints: /v1/chat/completions, /v1/models, /api/chat, /health Full SSE streaming support for all backends - proxy_launcher.py: subprocess lifecycle management Idempotent startup with health-check gating PID file management, graceful shutdown, CLI interface Cross-platform: Unix (SIGTERM) + Windows (TASKKILL) - proxy_config.py: centralized configuration Backend selection via LLM_BACKEND env var Model name mapping (glm-5 → actual model per backend) API key routing from ANTHROPIC_API_KEY, OPENAI_API_KEY, etc. - gpu_llm_client.py: default URL to localhost:8765 proxy Added _is_local_proxy() detection Proxy handles auth internally (no token from client) - llm_fallback_chain.py: BUG FIX Added missing 'import os' (NameError at import time) Added missing _looks_like_ollama_endpoint() method - config.yaml: proxy section, URLs point to localhost:8765 - .env.example: all proxy-related env vars documented - README.md: architecture diagram, quick-start guide Co-authored-by: Zeeeepa <zeeeepa@gmail.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds a local proxy server that auto-launches when eversale starts and routes all LLM calls to your configured backend. This eliminates the need to modify eversale's core logic for different LLM providers.
Architecture
New Files
proxy_server.pyproxy_launcher.pyproxy_config.pyProxy Endpoints
/v1/chat/completions/v1/models/api/chat/healthBug Fixes (PR #213)
llm_fallback_chain.py: Added missingimport os(was causingNameErrorat import time)llm_fallback_chain.py: Added missing_looks_like_ollama_endpoint()method (was causingAttributeErrorat runtime)Modified Files
gpu_llm_client.py: Default URL →localhost:8765, added_is_local_proxy(), proxy handles authconfig.yaml: Added proxy section, URLs point to proxy.env.example: Full documentation of proxy env varsREADME.md: Architecture diagram, quick-start for all 4 backendsQuick Start
Based on PR #213 (
codegen-bot/eversale-local-api-integration-a7f3c2).💻 View my work • 👤 Initiated by @Zeeeepa • About Codegen
⛔ Remove Codegen from PR • 🚫 Ban action checks
Summary by cubic
Adds an auto-launching local proxy that routes all LLM calls to your chosen backend (Anthropic/OpenAI/Ollama/custom), removing provider-specific logic from
eversaleand simplifying setup. Default LLM requests now go throughhttp://127.0.0.1:8765with streaming and model mapping handled by the proxy.New Features
proxy_server.py: OpenAI-compatible/v1/chat/completions,/v1/models, and Ollama-compatible/api/chatwith SSE streaming; adapters for Anthropic, OpenAI, Ollama, and custom.proxy_launcher.py: idempotent background start, health-check gating, PID management; binds to127.0.0.1:8765.gpu_llm_client.py: defaults to the local proxy; auth handled by proxy; supports proxy detection.proxy_config.py,config.yaml,.env.example, andREADME.md: backend selection viaLLM_BACKEND, model mapping, and quick start.Bug Fixes
llm_fallback_chain.py: added missingimport osand_looks_like_ollama_endpoint()to fix NameError/AttributeError.Written for commit bcd8193. Summary will update on new commits.