LCORE-1285: update llama stack to 0.5.2 #1112
jrobertboos wants to merge 127 commits into lightspeed-core:main
Conversation
…api; adjust constants and tests accordingly
Caution: Review failed. The pull request is closed.
ℹ️ Recent review info
⚙️ Run configuration: Path: .coderabbit.yaml | Review profile: CHILL | Plan: Pro | Run ID:
⛔ Files ignored due to path filters (4)
📒 Files selected for processing (134)
Walkthrough
Introduces RAG strategy support (BYOK and OKP), upgrades llama-stack to 0.5.2, adds Jinja2 prompt templating and caching, changes vector-search and RAG context construction (build_rag_context), updates configuration models to Rag/Okp, expands MCP auth probing, and adds broad test and documentation updates.
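A minimal sketch of the template-caching idea mentioned in the walkthrough. The PR uses Jinja2; here the stdlib `string.Template` stands in so the sketch stays dependency-free, and the template name and store are illustrative only:

```python
from functools import lru_cache
from string import Template

# Hypothetical in-memory template store standing in for template files on
# disk; "rag_context" is an assumed name, not taken from the PR.
_TEMPLATES = {
    "rag_context": "Use the following context:\n$context\n\nQuestion: $question",
}

@lru_cache(maxsize=32)
def load_template(name: str) -> Template:
    """Parse a template once; later calls reuse the cached object."""
    return Template(_TEMPLATES[name])

def render_prompt(name: str, **values: str) -> str:
    return load_template(name).substitute(**values)

prompt = render_prompt("rag_context", context="chunk-1", question="What is BYOK?")
```

With Jinja2 the same effect comes for free from `jinja2.Environment`, which caches compiled templates per loader.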
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant Lightspeed as Lightspeed Core
    participant Llama as Llama Stack
    participant BYOK as BYOK Vector Stores
    participant OKP as OKP Provider
    participant MCP as MCP Server
    Client->>Lightspeed: POST /query (question, optional rag/tool ids, headers)
    Lightspeed->>Lightspeed: build inline_rag_context (concurrently)
    Lightspeed->>BYOK: fetch BYOK chunks (if BYOK inline)
    Lightspeed->>OKP: fetch OKP chunks (if OKP inline)
    BYOK-->>Lightspeed: BYOK rag_chunks + referenced_documents
    OKP-->>Lightspeed: OKP rag_chunks + referenced_documents
    Lightspeed->>Lightspeed: merge rag_chunks, apply score_multiplier, build context_text
    Lightspeed->>Llama: prepare ResponsesApiParams (context_text / tools / MCP headers)
    Lightspeed->>MCP: check_mcp_auth (if MCP tools present)
    Llama-->>Lightspeed: Responses API stream / final response
    Lightspeed->>Lightspeed: run moderation using moderation_input
    Lightspeed-->>Client: aggregated response with referenced_documents and request_id
```

Notes: rectangles denote components; BYOK and OKP fetches run in parallel; MCP auth is checked before including MCP tools.

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs
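The parallel BYOK/OKP fetch and score-weighted merge shown in the diagram can be sketched as follows. The fetcher bodies, chunk shape, and the `score_multiplier` field are assumptions based on the walkthrough, not the PR's actual implementation:

```python
import asyncio

# Hypothetical stand-ins for the real BYOK/OKP providers.
async def fetch_byok_chunks(question: str) -> list[dict]:
    return [{"text": "byok chunk", "score": 0.8, "score_multiplier": 1.0}]

async def fetch_okp_chunks(question: str) -> list[dict]:
    return [{"text": "okp chunk", "score": 0.6, "score_multiplier": 1.5}]

async def build_inline_rag_context(question: str) -> str:
    # Both fetches run concurrently, as in the sequence diagram.
    byok, okp = await asyncio.gather(
        fetch_byok_chunks(question), fetch_okp_chunks(question)
    )
    chunks = byok + okp
    # Apply the per-source multiplier, then order best-first before joining.
    chunks.sort(key=lambda c: c["score"] * c["score_multiplier"], reverse=True)
    return "\n\n".join(c["text"] for c in chunks)

context_text = asyncio.run(build_inline_rag_context("What is BYOK?"))
```

Here the OKP chunk wins (0.6 × 1.5 = 0.9 vs 0.8 × 1.0), so it leads the merged context.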
Suggested reviewers
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@pyproject.toml`:
- Around line 31-33: Update the mismatched dependency for llama-stack-api:
replace "llama-stack-api==0.5.0" with the latest published version
"llama-stack-api==0.4.3" (or align all three to a consistent, released version)
so installations won't fail; locate the dependency entry for llama-stack-api in
the pyproject.toml dependency list and change the version string accordingly.
…points to verify 401 responses with WWW-Authenticate when MCP OAuth is required. Add mock fixtures for Llama Stack client interactions in each test file.
…es out in query, streaming_query, and tools endpoints. Each test verifies that a 401 status is returned without a WWW-Authenticate header upon timeout.
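The two auth-probe behaviours described above (a 401 with `WWW-Authenticate` when the MCP server advertises OAuth, and a bare 401 on timeout) can be sketched like this. `check_mcp_auth` and `McpTimeout` are illustrative names, not the PR's actual symbols:

```python
class McpTimeout(Exception):
    """Raised when the MCP server does not answer the auth probe in time."""

def check_mcp_auth(probe):
    """Return (status, headers) describing the outcome of an MCP auth probe."""
    try:
        challenge = probe()
    except McpTimeout:
        # On timeout we never saw the server's challenge, so the 401
        # carries no WWW-Authenticate header.
        return 401, {}
    return 401, {"WWW-Authenticate": challenge}

def oauth_probe():
    return 'Bearer realm="mcp"'

def timeout_probe():
    raise McpTimeout()

status, headers = check_mcp_auth(oauth_probe)
timeout_status, timeout_headers = check_mcp_auth(timeout_probe)
```

The tests in the PR assert exactly this split: header present for an OAuth challenge, absent on timeout.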
We do not want to accept unbounded amounts of input. Use base 2 numbers because they are cool and nerdy.
The project configures pytest asyncio mode to auto, so it is unnecessary to mark individual tests as async.
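The auto mode referred to above is typically enabled with a pytest-asyncio setting like this in pyproject.toml (a standard pytest-asyncio option, shown here as a generic example rather than this repo's exact config):

```toml
[tool.pytest.ini_options]
asyncio_mode = "auto"
```

With this in place, every `async def test_*` is collected and run without a per-test `@pytest.mark.asyncio` decorator.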
Need to call extract_token_usage() in order to increment the metrics counter.
Get the provider and model in order to pass that to _record_inference_failure. Add model and provider labels to the Counter.
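The labelled-counter change above can be sketched with a minimal stand-in for a Prometheus-style counter. The project presumably uses `prometheus_client`; only `_record_inference_failure` is a name taken from the commit message, the rest is illustrative:

```python
from collections import defaultdict

class LabeledCounter:
    """Tiny stand-in for a Prometheus Counter with (provider, model) labels."""

    def __init__(self) -> None:
        self._values: dict[tuple[str, str], int] = defaultdict(int)

    def inc(self, provider: str, model: str) -> None:
        self._values[(provider, model)] += 1

    def value(self, provider: str, model: str) -> int:
        return self._values[(provider, model)]

inference_failures = LabeledCounter()

def _record_inference_failure(provider: str, model: str) -> None:
    # Provider and model are passed through so failures can be broken
    # down per backend on a metrics dashboard.
    inference_failures.inc(provider, model)

_record_inference_failure("openai", "gpt-4o")
_record_inference_failure("openai", "gpt-4o")
```

With `prometheus_client` the equivalent would be `Counter(..., labelnames=["provider", "model"]).labels(provider, model).inc()`.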
Signed-off-by: red-hat-konflux-kflux-prd-rh02 <190377777+red-hat-konflux-kflux-prd-rh02[bot]@users.noreply.github.com>
Set maximum length for fields
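A sketch of the bounded-input idea from the commits above, using power-of-two limits. Field names and the exact bounds are illustrative, not the PR's values:

```python
from dataclasses import dataclass

# Illustrative base-2 bounds; the real limits in the PR may differ.
MAX_QUESTION_LENGTH = 2**13  # 8192 characters

@dataclass
class QueryRequest:
    question: str

    def __post_init__(self) -> None:
        # Reject unbounded input up front rather than deep in the pipeline.
        if len(self.question) > MAX_QUESTION_LENGTH:
            raise ValueError(f"question exceeds {MAX_QUESTION_LENGTH} characters")

ok = QueryRequest(question="What is BYOK?")
```

In a Pydantic model the same constraint would be expressed declaratively, e.g. `Field(max_length=8192)`.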
Properly increment metrics for /v1/infer
[LCORE-1393] chore: add agents.md and symlinked claude.md
Signed-off-by: Major Hayden <major@redhat.com>
Signed-off-by: Major Hayden <major@redhat.com>
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pyproject.toml`:
- Around line 31-33: The pyproject dependency list pins "llama-stack-api==0.5.2"
which does not exist on PyPI; update the package spec to a valid version (e.g.,
change "llama-stack-api==0.5.2" to "llama-stack-api==0.5.1") and decide whether
to align "llama-stack==0.5.2" and "llama-stack-client==0.5.2" to 0.5.1 for
compatibility—ensure all three package entries ("llama-stack",
"llama-stack-client", "llama-stack-api") reference existing matching versions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 3c97ce73-2d75-4d10-94cb-06ad82568e92
⛔ Files ignored due to path filters (1)
uv.lock is excluded by !**/*.lock
📒 Files selected for processing (3)
- pyproject.toml
- src/constants.py
- tests/e2e/features/info.feature
🚧 Files skipped from review as they are similar to previous changes (1)
- src/constants.py
"llama-stack==0.5.2",
"llama-stack-client==0.5.2",
"llama-stack-api==0.5.2",
🧩 Analysis chain
🌐 Web query:
llama-stack-api 0.5.2 PyPI package availability
💡 Result:
As of March 6, 2026, llama-stack-api==0.5.2 is not available on PyPI. The PyPI release history for llama-stack-api shows the latest published version as 0.5.1 (Feb 19, 2026), with earlier releases like 0.5.0, 0.4.5, etc., and no 0.5.2 entry. [1]
What you can do
- Install the latest available: pip install llama-stack-api==0.5.1 (or omit the version to get the latest). [1]
- If you specifically need 0.5.2, it likely hasn’t been published to PyPI yet (you’d need to install from source/tag instead). [1]
Sources
[1] PyPI project page / release history for llama-stack-api (shows latest as 0.5.1, no 0.5.2) (pypi.org)
Fix version pinning: llama-stack-api==0.5.2 does not exist on PyPI.
llama-stack-api==0.5.2 is not available on PyPI. The latest published version is 0.5.1 (Feb 19, 2026). Update line 33 to use llama-stack-api==0.5.1 or check if the other llama-stack packages (llama-stack and llama-stack-client) should also be downgraded to 0.5.1 for compatibility.
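Applied to the dependency list quoted above, the suggested fix would look like the fragment below (assuming 0.5.1 of the API package is compatible with the 0.5.2 core packages; if not, all three pins should move to 0.5.1 together):

```toml
[project]
dependencies = [
    "llama-stack==0.5.2",
    "llama-stack-client==0.5.2",
    # llama-stack-api 0.5.2 is not yet on PyPI; pin the latest published release
    "llama-stack-api==0.5.1",
]
```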
Signed-off-by: red-hat-konflux-kflux-prd-rh02 <190377777+red-hat-konflux-kflux-prd-rh02[bot]@users.noreply.github.com>
…d-dependencies LCORE-1326: Updated dependencies
…new-linter-errors LCORE-1430: fixed new linter errors
…new-linter-errors LCORE-1431: fixed new linter errors
…references/main chore(deps): update konflux references
LCORE-1420: Fixing MCP Authorization
[RHIDP-12426] Fix Regression on /v2/conversations for referenced_documents caching
asimurka left a comment
Please squash the commits into a single one and rebase on the latest changes.
…ing-for-0.4.2-release LCORE-1215: Preparing for 0.4.2 release
…linter-issue LCORE-1433: fixed linter issue + enable new linter rule
…linter-issue LCORE-1434: fixed linter issue
…api; adjust constants and tests accordingly
Updated `test.containerfile` to rhoai-3.4
Update base image in test.containerfile to use upstream Red Hat UBI
fixed type error
addressed comments - updated from 0.5.0 -> 0.5.2
fixed mypy
Description
Updated Llama Stack to 0.5.2 in order to enable the network configuration on providers so that TLS and proxy support can be added.
Type of change
Tools used to create PR
Identify any AI code assistants used in this PR (for transparency and review context)
Related Tickets & Documents
Checklist before requesting a review
Testing
Summary by CodeRabbit