LCORE-780: REST API endpoint to return info about selected RAG#810
Conversation
|
Warning Rate limit exceeded@tisnik has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 9 minutes and 47 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (4)
WalkthroughA new GET endpoint Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Endpoint as GET /rags/{rag_id}
participant Auth as Auth Dependency
participant Config as Configuration Loader
participant Llama as Llama Stack Client
participant Response as RAGInfoResponse
Client->>Endpoint: GET /rags/{rag_id} + auth
Endpoint->>Auth: validate (Depends)
Auth-->>Endpoint: AuthTuple
Endpoint->>Config: load()
Config-->>Endpoint: config
Endpoint->>Llama: init client (config)
Endpoint->>Llama: vector_stores.retrieve(rag_id)
alt success
Llama-->>Endpoint: rag_info
Endpoint->>Response: map rag_info -> RAGInfoResponse
Response-->>Client: 200 OK + body
else APIConnectionError
Llama-->>Endpoint: APIConnectionError
Endpoint-->>Client: 500 Internal Server Error (detailed)
else other exception
Llama-->>Endpoint: Exception
Endpoint-->>Client: 500 Internal Server Error (error + cause)
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Possibly related PRs
Suggested reviewers
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (1)
src/app/endpoints/rags.py (1)
132-135: Consider extracting common configuration loading logic.The configuration loading and validation pattern (lines 132-135) is duplicated from
rags_endpoint_handler(lines 70-74). Similarly, the error handling patterns for Llama Stack connection (lines 156-164) duplicate lines 87-95.Consider extracting this into a helper function to reduce duplication:
async def get_llama_stack_client() -> AsyncLlamaStackClient: """Get initialized Llama Stack client with configuration validation.""" check_configuration_loaded(configuration) llama_stack_configuration = configuration.llama_stack_configuration logger.info("Llama stack config: %s", llama_stack_configuration) return AsyncLlamaStackClientHolder().get_client()Then both endpoints can simply call:
try: client = await get_llama_stack_client() # ... endpoint-specific logic except APIConnectionError as e: # ... common error handling
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/app/endpoints/rags.py(3 hunks)src/models/responses.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/app/endpoints/rags.py (6)
src/models/responses.py (2)
RAGListResponse(133-144)RAGInfoResponse(90-130)src/authorization/middleware.py (1)
authorize(111-122)src/models/config.py (1)
Action(329-377)src/utils/endpoints.py (1)
check_configuration_loaded(112-124)src/configuration.py (2)
configuration(73-77)llama_stack_configuration(87-91)src/client.py (2)
AsyncLlamaStackClientHolder(18-55)get_client(49-55)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: build-pr
- GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
- GitHub Check: e2e_tests (ci)
- GitHub Check: e2e_tests (azure)
| rag_responses: dict[int | str, dict[str, Any]] = { | ||
| 200: {}, | ||
| 404: {"response": "RAG with given id not found"}, | ||
| 500: { | ||
| "response": "Unable to retrieve list of RAGs", | ||
| "cause": "Connection to Llama Stack is broken", | ||
| }, | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
404 response declared but not implemented.
The rag_responses dict declares a 404 response for when a RAG is not found, but the endpoint implementation (lines 137-174) doesn't have any code path that would raise a 404 error. All exceptions are currently treated as 500 errors.
Verify how the Llama Stack client behaves when retrieving a non-existent RAG:
🏁 Script executed:
#!/bin/bash
# Search for how vector_stores.retrieve handles not-found cases
rg -nP -C5 'vector_stores\.retrieve' --type=py
# Look for exception types from llama_stack_client
rg -nP -C3 'class.*Exception|class.*Error' --type=py -g 'llama_stack*'
# Check if there are existing patterns for handling 404s
rg -nP -C3 'HTTP_404_NOT_FOUND|status\.HTTP_404' --type=pyLength of output: 13258
🏁 Script executed:
#!/bin/bash
# Get full context of get_rag_endpoint_handler function
rg -nP -A40 'async def get_rag_endpoint_handler' --type=py src/app/endpoints/rags.py
# Look for any except clauses in the handler
rg -nP -A5 'except.*:' --type=py src/app/endpoints/rags.py | head -50
# Search for llama_stack imports and exceptions
rg -nP 'from.*llama_stack|import.*llama_stack' --type=py | head -20
# Check for NotFoundError or similar exceptions from llama_stack_client
rg -nP 'NotFound|not.found' --type=py -i | grep -i 'exception\|error\|raise' | head -20Length of output: 4252
🏁 Script executed:
#!/bin/bash
# Find rags.py file
fd -t f 'rags\.py' --type f
# Once found, get full endpoint handler
fd -t f 'rags\.py' -x cat {} | head -200Length of output: 6471
404 response declared but not implemented.
The rag_responses dict declares a 404 response for when a RAG is not found, but the endpoint implementation (lines 136-174) catches all exceptions generically and returns 500 errors. There's no separate handling for NotFoundError from the llama_stack_client library.
Compare to conversations.py which properly catches NotFoundError and raises HTTP_404_NOT_FOUND. The get_rag_endpoint_handler should:
- Import
NotFoundErrorfromllama_stack_client - Add an explicit
except NotFoundErrorhandler before the genericExceptionhandler - Raise
HTTPException(status_code=status.HTTP_404_NOT_FOUND, ...)with the appropriate detail
Currently, if vector_stores.retrieve(rag_id) raises NotFoundError for a non-existent RAG, it will be caught by the generic except Exception block and returned as a 500 error, not 404.
🤖 Prompt for AI Agents
In src/app/endpoints/rags.py around lines 34 and 136-174, the rag_responses dict
declares a 404 but the handler catches all exceptions and returns 500; import
NotFoundError from llama_stack_client, add an explicit except NotFoundError
block before the generic Exception handler in get_rag_endpoint_handler, and
inside that block raise HTTPException(status_code=status.HTTP_404_NOT_FOUND,
detail="RAG with given id not found") (or similar message consistent with
rag_responses) so missing RAGs return 404 instead of 500.
| except HTTPException: | ||
| raise |
There was a problem hiding this comment.
🧩 Analysis chain
Clarify the purpose of catching HTTPException.
The code catches and re-raises HTTPException, but it's unclear what would raise it within the try block. The check_configuration_loaded() call at line 132 is outside the try block, and client.vector_stores.retrieve() likely raises llama_stack_client exceptions.
If this is defensive coding for future changes, consider adding a comment. Otherwise, this block can be removed to simplify the code.
Verify what exceptions vector_stores.retrieve() can raise:
🏁 Script executed:
#!/bin/bash
# Find the vector_stores.retrieve implementation or documentation
ast-grep --pattern $'class VectorStores {
$$$
retrieve($$$) {
$$$
}
$$$
}'
# Search for exception handling patterns with vector_stores
rg -nP -C5 'vector_stores\.(retrieve|list)' --type=pyLength of output: 14524
---
🏁 Script executed:
```shell
#!/bin/bash
# Get full context of the get_rag_info() function including all exception handlers
sed -n '130,160p' src/app/endpoints/rags.py
# Also get the full list_rags() function for comparison
sed -n '70,90p' src/app/endpoints/rags.py
Length of output: 1898
---
🏁 Script executed:
```shell
#!/bin/bash
# Find AsyncLlamaStackClientHolder implementation
rg -n "class AsyncLlamaStackClientHolder" --type=py -A 10
# Also search for .get_client() method
rg -n "def get_client" --type=py -B 2 -A 5
Length of output: 1203
Remove the unreachable HTTPException handler at lines 154-155.
The except HTTPException: raise block is dead code. The try block only calls get_client() (raises RuntimeError) and vector_stores.retrieve() (raises APIConnectionError), neither of which produces HTTPException. The HTTPException is only created within the APIConnectionError handler itself, which is outside the try block scope. The list_rags() function confirms this pattern works without the HTTPException handler.
🤖 Prompt for AI Agents
In src/app/endpoints/rags.py around lines 154-155, remove the unreachable
"except HTTPException: raise" handler; the try block cannot raise HTTPException
so delete those two lines and adjust surrounding indentation/flow accordingly so
the RuntimeError and APIConnectionError handlers remain intact and behavior is
unchanged.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/unit/app/endpoints/test_rags.py (1)
176-197: Align test name and description with the generic “unable to retrieve info about RAG” path.This test is exercising the catch‑all
except Exceptionbranch, which returns HTTP 500 with"Unable to retrieve info about RAG"in the response. The current function name still says_unable_to_retrieve_list, and the docstring mentions a connection failure, which can be confusing for the single‑RAG endpoint.Consider renaming and adjusting the docstring:
-@pytest.mark.asyncio -async def test_rag_info_endpoint_unable_to_retrieve_list(mocker: MockerFixture) -> None: - """Test that /rags/{rag_id} endpoint raises HTTP 500 if Llama Stack connection fails.""" +@pytest.mark.asyncio +async def test_rag_info_endpoint_unable_to_retrieve_info(mocker: MockerFixture) -> None: + """Test that /rags/{rag_id} endpoint raises HTTP 500 on unexpected retrieval errors."""The assertions below already match the handler’s actual behavior, so only naming/documentation needs tightening.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tests/unit/app/endpoints/test_rags.py(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/unit/app/endpoints/test_rags.py (2)
src/app/endpoints/rags.py (1)
get_rag_endpoint_handler(110-174)src/client.py (1)
get_client(49-55)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: build-pr
- GitHub Check: e2e_tests (azure)
- GitHub Check: e2e_tests (ci)
- GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
🔇 Additional comments (2)
tests/unit/app/endpoints/test_rags.py (2)
10-13: Import ofget_rag_endpoint_handleris correct and consistent.The new import cleanly wires the tests to the single-RAG handler and keeps them colocated with existing
/ragsendpoint tests. No issues here.
115-239: Solid coverage of all major/rags/{rag_id}handler branches.These tests nicely exercise configuration-not-loaded, 404 passthrough, connection error, generic failure, and the successful retrieval case, including field-by-field assertions on the
RAGInfoResponse. This should give good regression coverage forget_rag_endpoint_handler.
Description
LCORE-780: REST API endpoint to return info about selected RAG
Type of change
Related Tickets & Documents
Summary by CodeRabbit
New Features
Tests