LCORE-780: REST API endpoint to return info about selected RAG #810
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -13,7 +13,7 @@ | |
| from client import AsyncLlamaStackClientHolder | ||
| from configuration import configuration | ||
| from models.config import Action | ||
| from models.responses import RAGListResponse | ||
| from models.responses import RAGListResponse, RAGInfoResponse | ||
| from utils.endpoints import check_configuration_loaded | ||
|
|
||
| logger = logging.getLogger(__name__) | ||
|
|
@@ -31,6 +31,15 @@ | |
| 500: {"description": "Connection to Llama Stack is broken"}, | ||
| } | ||
|
|
||
| rag_responses: dict[int | str, dict[str, Any]] = { | ||
| 200: {}, | ||
| 404: {"response": "RAG with given id not found"}, | ||
| 500: { | ||
| "response": "Unable to retrieve list of RAGs", | ||
| "cause": "Connection to Llama Stack is broken", | ||
| }, | ||
| } | ||
|
|
||
|
|
||
| @router.get("/rags", responses=rags_responses) | ||
| @authorize(Action.LIST_RAGS) | ||
|
|
@@ -94,3 +103,72 @@ async def rags_endpoint_handler( | |
| "cause": str(e), | ||
| }, | ||
| ) from e | ||
|
|
||
|
|
||
| @router.get("/rags/{rag_id}", responses=rag_responses) | ||
| @authorize(Action.GET_RAG) | ||
| async def get_rag_endpoint_handler( | ||
| request: Request, | ||
| rag_id: str, | ||
| auth: Annotated[AuthTuple, Depends(get_auth_dependency())], | ||
| ) -> RAGInfoResponse: | ||
| """Retrieve a single RAG by its unique ID. | ||
|
|
||
| Raises: | ||
| HTTPException: | ||
| - 404 if RAG with the given ID is not found, | ||
| - 500 if unable to connect to Llama Stack, | ||
| - 500 for any unexpected retrieval errors. | ||
|
|
||
| Returns: | ||
| RAGInfoResponse: A single RAG's details | ||
| """ | ||
| # Used only by the middleware | ||
| _ = auth | ||
|
|
||
| # Nothing interesting in the request | ||
| _ = request | ||
|
|
||
| check_configuration_loaded(configuration) | ||
|
|
||
| llama_stack_configuration = configuration.llama_stack_configuration | ||
| logger.info("Llama stack config: %s", llama_stack_configuration) | ||
|
|
||
| try: | ||
| # try to get Llama Stack client | ||
| client = AsyncLlamaStackClientHolder().get_client() | ||
| # retrieve info about RAG | ||
| rag_info = await client.vector_stores.retrieve(rag_id) | ||
| return RAGInfoResponse( | ||
| id=rag_info.id, | ||
| name=rag_info.name, | ||
| created_at=rag_info.created_at, | ||
| last_active_at=rag_info.last_active_at, | ||
| expires_at=rag_info.expires_at, | ||
| object=rag_info.object, | ||
| status=rag_info.status, | ||
| usage_bytes=rag_info.usage_bytes, | ||
| ) | ||
|
|
||
| # connection to Llama Stack server | ||
| except HTTPException: | ||
| raise | ||
|
Comment on lines +154 to +155 (Contributor)

Clarify the purpose of catching HTTPException.

The code catches and re-raises `HTTPException`. If this is defensive coding for future changes, consider adding a comment. Otherwise, this block can be removed to simplify the code.

Verify what exceptions `vector_stores.retrieve` can raise. Script executed:

    #!/bin/bash
    # Find the vector_stores.retrieve implementation or documentation
    ast-grep --pattern $'class VectorStores {
    $$$
    retrieve($$$) {
    $$$
    }
    $$$
    }'
    # Search for exception handling patterns with vector_stores
    rg -nP -C5 'vector_stores\.(retrieve|list)' --type=py

Remove the unreachable HTTPException handler at lines 154-155.
||
| except APIConnectionError as e: | ||
| logger.error("Unable to connect to Llama Stack: %s", e) | ||
| raise HTTPException( | ||
| status_code=status.HTTP_500_INTERNAL_SERVER_ERROR, | ||
| detail={ | ||
| "response": "Unable to connect to Llama Stack", | ||
| "cause": str(e), | ||
| }, | ||
| ) from e | ||
| # any other exception that can occur during RAG retrieval | ||
| except Exception as e: | ||
| logger.error("Unable to retrieve info about RAG: %s", e) | ||
| raise HTTPException( | ||
| status_code=status.HTTP_500_INTERNAL_SERVER_ERROR, | ||
| detail={ | ||
| "response": "Unable to retrieve info about RAG", | ||
| "cause": str(e), | ||
| }, | ||
| ) from e | ||
404 response declared but not implemented.

The `rag_responses` dict declares a 404 response for when a RAG is not found, but the endpoint implementation (lines 136-174) catches all exceptions generically and returns 500 errors. There is no separate handling for `NotFoundError` from the `llama_stack_client` library.

Compare to `conversations.py`, which properly catches `NotFoundError` and raises `HTTP_404_NOT_FOUND`. The `get_rag_endpoint_handler` should:

1. Import `NotFoundError` from `llama_stack_client`
2. Add an `except NotFoundError` handler before the generic `Exception` handler
3. Raise `HTTPException(status_code=status.HTTP_404_NOT_FOUND, ...)` with the appropriate detail

Currently, if `vector_stores.retrieve(rag_id)` raises `NotFoundError` for a non-existent RAG, it will be caught by the generic `except Exception` block and returned as a 500 error, not 404.
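The suggested handler ordering could look roughly like the sketch below. It is not the PR's final code: `NotFoundError`, `APIConnectionError`, and `HTTPException` are local stand-ins for the `llama_stack_client` and FastAPI classes, and `FakeVectorStores`, `get_rag`, and `status_for` are hypothetical names introduced only so the example runs on its own. The 404 detail text is taken from the `rag_responses` dict in the diff.

```python
import asyncio
from typing import Any


class NotFoundError(Exception):
    """Stand-in for llama_stack_client.NotFoundError (assumption)."""


class APIConnectionError(Exception):
    """Stand-in for llama_stack_client.APIConnectionError (assumption)."""


class HTTPException(Exception):
    """Stand-in for fastapi.HTTPException (assumption)."""

    def __init__(self, status_code: int, detail: dict[str, Any]) -> None:
        super().__init__(str(detail))
        self.status_code = status_code
        self.detail = detail


class FakeVectorStores:
    """Hypothetical in-memory replacement for client.vector_stores."""

    def __init__(self, stores: dict[str, dict[str, Any]]) -> None:
        self._stores = stores

    async def retrieve(self, rag_id: str) -> dict[str, Any]:
        if rag_id not in self._stores:
            raise NotFoundError(f"vector store {rag_id} not found")
        return self._stores[rag_id]


async def get_rag(vector_stores: FakeVectorStores, rag_id: str) -> dict[str, Any]:
    """Retrieve one RAG, mapping a missing ID to 404 instead of 500."""
    try:
        return await vector_stores.retrieve(rag_id)
    except NotFoundError as e:
        # Must come before the generic handler, or it would be reported as 500
        raise HTTPException(
            404, {"response": "RAG with given id not found", "cause": str(e)}
        ) from e
    except APIConnectionError as e:
        raise HTTPException(
            500, {"response": "Unable to connect to Llama Stack", "cause": str(e)}
        ) from e
    except Exception as e:
        raise HTTPException(
            500, {"response": "Unable to retrieve info about RAG", "cause": str(e)}
        ) from e


def status_for(vector_stores: FakeVectorStores, rag_id: str) -> int:
    """Run the handler and report the HTTP status it would produce."""
    try:
        asyncio.run(get_rag(vector_stores, rag_id))
    except HTTPException as exc:
        return exc.status_code
    return 200
```

With a store containing only `"vs_1"`, `status_for(stores, "vs_1")` yields 200 and `status_for(stores, "missing")` yields 404. In the real endpoint the same ordering would apply to the imported `llama_stack_client` exceptions.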