LCORE-780: REST API endpoint to return info about selected RAG by tisnik · Pull Request #810 · lightspeed-core/lightspeed-stack

tisnik · 2025-11-19T08:03:50Z

Description

LCORE-780: REST API endpoint to return info about selected RAG

Type of change

Related Tickets & Documents

Related Issue #LCORE-780

Summary by CodeRabbit

New Features
- Added endpoint to retrieve detailed info for individual RAG instances by ID (id, name, creation/last-active times, usage, expiry, status) with authorization and improved error handling.
- Introduced a standardized response schema for RAG details and enhanced retrieval logging.
Tests
- Added unit tests covering successful retrieval and multiple error scenarios (not-found, connection errors, configuration issues).

coderabbitai · 2025-11-19T08:04:11Z

Warning

Rate limit exceeded

@tisnik has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 9 minutes and 47 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between c085b10 and c49dca8.

📒 Files selected for processing (4)

docs/openapi.json (2 hunks)
docs/openapi.md (2 hunks)
docs/output.md (2 hunks)
tests/unit/app/endpoints/test_rags.py (2 hunks)

Walkthrough

A new GET endpoint /rags/{rag_id} was added to retrieve a single RAG's details with authorization. A RAGInfoResponse model was introduced. The handler initializes configuration and a Llama Stack client, calls vector_stores.retrieve(rag_id), maps result to RAGInfoResponse, and implements specific error handling and logging.

Changes

Cohort / File(s)	Summary
Response Model Definition `src/models/responses.py`	Added `RAGInfoResponse` model with fields: `id`, `name`, `created_at`, `last_active_at`, `usage_bytes`, `expires_at`, `object`, `status`.
RAG Retrieval Endpoint `src/app/endpoints/rags.py`	Added `get_rag_endpoint_handler` (GET `/rags/{rag_id}`) and `rag_responses` OpenAPI variants. Handler loads config, initializes Llama Stack client, calls `vector_stores.retrieve(rag_id)`, maps result to `RAGInfoResponse`, logs activity, and handles `HTTPException`, `APIConnectionError`, and general exceptions (returning 500 with details).
Tests `tests/unit/app/endpoints/test_rags.py`	Added async tests for the new handler: config-not-loaded, 404 propagation, `APIConnectionError` with detailed response handling, and successful retrieval asserting `RAGInfoResponse` fields.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Endpoint as GET /rags/{rag_id}
    participant Auth as Auth Dependency
    participant Config as Configuration Loader
    participant Llama as Llama Stack Client
    participant Response as RAGInfoResponse

    Client->>Endpoint: GET /rags/{rag_id} + auth
    Endpoint->>Auth: validate (Depends)
    Auth-->>Endpoint: AuthTuple
    Endpoint->>Config: load()
    Config-->>Endpoint: config
    Endpoint->>Llama: init client (config)
    Endpoint->>Llama: vector_stores.retrieve(rag_id)
    alt success
        Llama-->>Endpoint: rag_info
        Endpoint->>Response: map rag_info -> RAGInfoResponse
        Response-->>Client: 200 OK + body
    else APIConnectionError
        Llama-->>Endpoint: APIConnectionError
        Endpoint-->>Client: 500 Internal Server Error (detailed)
    else other exception
        Llama-->>Endpoint: Exception
        Endpoint-->>Client: 500 Internal Server Error (error + cause)
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Verify RAGInfoResponse field mapping matches actual vector_stores.retrieve response.
Confirm APIConnectionError handling and returned payload shape follow project conventions.
Check auth dependency injection and rag_responses OpenAPI definitions.
Review logging for potential sensitive data exposure.

Possibly related PRs

LCORE-780: REST API endpoint to list all configured RAGs #805 — Adds the list-RAG endpoint (GET /rags) and touches the same src/app/endpoints/rags.py module and vector_stores usage; likely related.

Suggested reviewers

asamal4
VladimirKadlec
are-ces

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately describes the main change: adding a REST API endpoint to retrieve information about a selected RAG, which aligns with the changeset additions in rags.py, responses.py, and tests.
Docstring Coverage	✅ Passed	Docstring coverage is 88.89% which is sufficient. The required threshold is 80.00%.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 4

🧹 Nitpick comments (1)

src/app/endpoints/rags.py (1)
132-135: Consider extracting common configuration loading logic.

The configuration loading and validation pattern (lines 132-135) is duplicated from rags_endpoint_handler (lines 70-74). Similarly, the error handling patterns for Llama Stack connection (lines 156-164) duplicate lines 87-95.

Consider extracting this into a helper function to reduce duplication:
async def get_llama_stack_client() -> AsyncLlamaStackClient:
    """Get initialized Llama Stack client with configuration validation."""
    check_configuration_loaded(configuration)
    llama_stack_configuration = configuration.llama_stack_configuration
    logger.info("Llama stack config: %s", llama_stack_configuration)
    return AsyncLlamaStackClientHolder().get_client()
Then both endpoints can simply call:
try:
    client = await get_llama_stack_client()
    # ... endpoint-specific logic
except APIConnectionError as e:
    # ... common error handling

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7042ac9 and 07c482d.

📒 Files selected for processing (2)

src/app/endpoints/rags.py (3 hunks)
src/models/responses.py (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

src/app/endpoints/rags.py (6)

src/models/responses.py (2)

RAGListResponse (133-144)

RAGInfoResponse (90-130)

src/authorization/middleware.py (1)

authorize (111-122)

src/models/config.py (1)

Action (329-377)

src/utils/endpoints.py (1)

check_configuration_loaded (112-124)

src/configuration.py (2)

configuration (73-77)

llama_stack_configuration (87-91)

src/client.py (2)

AsyncLlamaStackClientHolder (18-55)

get_client (49-55)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: build-pr
GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
GitHub Check: e2e_tests (ci)
GitHub Check: e2e_tests (azure)

coderabbitai · 2025-11-19T08:07:43Z

src/app/endpoints/rags.py

+rag_responses: dict[int | str, dict[str, Any]] = {
+    200: {},
+    404: {"response": "RAG with given id not found"},
+    500: {
+        "response": "Unable to retrieve list of RAGs",
+        "cause": "Connection to Llama Stack is broken",
+    },
+}


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

404 response declared but not implemented.

The rag_responses dict declares a 404 response for when a RAG is not found, but the endpoint implementation (lines 137-174) doesn't have any code path that would raise a 404 error. All exceptions are currently treated as 500 errors.

Verify how the Llama Stack client behaves when retrieving a non-existent RAG:

🏁 Script executed:

#!/bin/bash # Search for how vector_stores.retrieve handles not-found cases rg -nP -C5 'vector_stores\.retrieve' --type=py # Look for exception types from llama_stack_client rg -nP -C3 'class.*Exception|class.*Error' --type=py -g 'llama_stack*' # Check if there are existing patterns for handling 404s rg -nP -C3 'HTTP_404_NOT_FOUND|status\.HTTP_404' --type=py

Length of output: 13258

🏁 Script executed:

#!/bin/bash # Get full context of get_rag_endpoint_handler function rg -nP -A40 'async def get_rag_endpoint_handler' --type=py src/app/endpoints/rags.py # Look for any except clauses in the handler rg -nP -A5 'except.*:' --type=py src/app/endpoints/rags.py | head -50 # Search for llama_stack imports and exceptions rg -nP 'from.*llama_stack|import.*llama_stack' --type=py | head -20 # Check for NotFoundError or similar exceptions from llama_stack_client rg -nP 'NotFound|not.found' --type=py -i | grep -i 'exception\|error\|raise' | head -20

Length of output: 4252

🏁 Script executed:

#!/bin/bash # Find rags.py file fd -t f 'rags\.py' --type f # Once found, get full endpoint handler fd -t f 'rags\.py' -x cat {} | head -200

Length of output: 6471

404 response declared but not implemented.

The rag_responses dict declares a 404 response for when a RAG is not found, but the endpoint implementation (lines 136-174) catches all exceptions generically and returns 500 errors. There's no separate handling for NotFoundError from the llama_stack_client library.

Compare to conversations.py which properly catches NotFoundError and raises HTTP_404_NOT_FOUND. The get_rag_endpoint_handler should:

Import NotFoundError from llama_stack_client

Add an explicit except NotFoundError handler before the generic Exception handler

Raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, ...) with the appropriate detail

Currently, if vector_stores.retrieve(rag_id) raises NotFoundError for a non-existent RAG, it will be caught by the generic except Exception block and returned as a 500 error, not 404.

🤖 Prompt for AI Agents

In src/app/endpoints/rags.py around lines 34 and 136-174, the rag_responses dict declares a 404 but the handler catches all exceptions and returns 500; import NotFoundError from llama_stack_client, add an explicit except NotFoundError block before the generic Exception handler in get_rag_endpoint_handler, and inside that block raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="RAG with given id not found") (or similar message consistent with rag_responses) so missing RAGs return 404 instead of 500.

src/app/endpoints/rags.py

coderabbitai · 2025-11-19T08:07:43Z

src/app/endpoints/rags.py

+    except HTTPException:
+        raise


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Clarify the purpose of catching HTTPException.

The code catches and re-raises HTTPException, but it's unclear what would raise it within the try block. The check_configuration_loaded() call at line 132 is outside the try block, and client.vector_stores.retrieve() likely raises llama_stack_client exceptions.

If this is defensive coding for future changes, consider adding a comment. Otherwise, this block can be removed to simplify the code.

Verify what exceptions vector_stores.retrieve() can raise:

🏁 Script executed:

#!/bin/bash # Find the vector_stores.retrieve implementation or documentation ast-grep --pattern $'class VectorStores { $$$ retrieve($$$) { $$$ } $$$ }' # Search for exception handling patterns with vector_stores rg -nP -C5 'vector_stores\.(retrieve|list)' --type=py

Length of output: 14524

--- 🏁 Script executed: ```shell #!/bin/bash # Get full context of the get_rag_info() function including all exception handlers sed -n '130,160p' src/app/endpoints/rags.py # Also get the full list_rags() function for comparison sed -n '70,90p' src/app/endpoints/rags.py

Length of output: 1898

--- 🏁 Script executed: ```shell #!/bin/bash # Find AsyncLlamaStackClientHolder implementation rg -n "class AsyncLlamaStackClientHolder" --type=py -A 10 # Also search for .get_client() method rg -n "def get_client" --type=py -B 2 -A 5

Length of output: 1203

Remove the unreachable HTTPException handler at lines 154-155.

The except HTTPException: raise block is dead code. The try block only calls get_client() (raises RuntimeError) and vector_stores.retrieve() (raises APIConnectionError), neither of which produces HTTPException. The HTTPException is only created within the APIConnectionError handler itself, which is outside the try block scope. The list_rags() function confirms this pattern works without the HTTPException handler.

🤖 Prompt for AI Agents

In src/app/endpoints/rags.py around lines 154-155, remove the unreachable "except HTTPException: raise" handler; the try block cannot raise HTTPException so delete those two lines and adjust surrounding indentation/flow accordingly so the RuntimeError and APIConnectionError handlers remain intact and behavior is unchanged.

src/models/responses.py

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

tests/unit/app/endpoints/test_rags.py (1)
176-197: Align test name and description with the generic “unable to retrieve info about RAG” path.

This test is exercising the catch‑all except Exception branch, which returns HTTP 500 with "Unable to retrieve info about RAG" in the response. The current function name still says _unable_to_retrieve_list, and the docstring mentions a connection failure, which can be confusing for the single‑RAG endpoint.

Consider renaming and adjusting the docstring:
-@pytest.mark.asyncio
-async def test_rag_info_endpoint_unable_to_retrieve_list(mocker: MockerFixture) -> None:
-    """Test that /rags/{rag_id} endpoint raises HTTP 500 if Llama Stack connection fails."""
+@pytest.mark.asyncio
+async def test_rag_info_endpoint_unable_to_retrieve_info(mocker: MockerFixture) -> None:
+    """Test that /rags/{rag_id} endpoint raises HTTP 500 on unexpected retrieval errors."""
The assertions below already match the handler’s actual behavior, so only naming/documentation needs tightening.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 07c482d and 37d8c0c.

📒 Files selected for processing (1)

tests/unit/app/endpoints/test_rags.py (2 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

tests/unit/app/endpoints/test_rags.py (2)

src/app/endpoints/rags.py (1)

get_rag_endpoint_handler (110-174)

src/client.py (1)

get_client (49-55)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: build-pr
GitHub Check: e2e_tests (azure)
GitHub Check: e2e_tests (ci)
GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request

🔇 Additional comments (2)

tests/unit/app/endpoints/test_rags.py (2)

10-13: Import of get_rag_endpoint_handler is correct and consistent.

The new import cleanly wires the tests to the single-RAG handler and keeps them colocated with existing /rags endpoint tests. No issues here.

115-239: Solid coverage of all major /rags/{rag_id} handler branches.

These tests nicely exercise configuration-not-loaded, 404 passthrough, connection error, generic failure, and the successful retrieval case, including field-by-field assertions on the RAGInfoResponse. This should give good regression coverage for get_rag_endpoint_handler.

tests/unit/app/endpoints/test_rags.py

LCORE-780: REST API endpoint to return info about selected RAG

07c482d

coderabbitai bot reviewed Nov 19, 2025

View reviewed changes

New unit tests for the new endpoint

37d8c0c

coderabbitai bot reviewed Nov 19, 2025

View reviewed changes

tests/unit/app/endpoints/test_rags.py Outdated Show resolved Hide resolved

tisnik added 2 commits November 19, 2025 09:19

Fixes in model

c085b10

Updated documentation accordingly

c49dca8

tisnik merged commit 6bc8850 into lightspeed-core:main Nov 19, 2025
21 of 23 checks passed

coderabbitai bot mentioned this pull request Feb 11, 2026

LCORE-1271: Use user-facing index names from LCORE config instead of llama-stack IDs #1135

Merged

19 tasks

coderabbitai bot mentioned this pull request Feb 19, 2026

LCORE-1346: minor doc update #1184

Merged

16 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

LCORE-780: REST API endpoint to return info about selected RAG#810

LCORE-780: REST API endpoint to return info about selected RAG#810
tisnik merged 4 commits intolightspeed-core:mainfrom
tisnik:lcore-780-info-about-selected-rag

tisnik commented Nov 19, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Nov 19, 2025 •

edited

Loading

Rate limit exceeded

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Nov 19, 2025

Uh oh!

Uh oh!

coderabbitai bot Nov 19, 2025

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

tisnik commented Nov 19, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Type of change

Related Tickets & Documents

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Nov 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Nov 19, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot Nov 19, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

tisnik commented Nov 19, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Nov 19, 2025 •

edited

Loading