Conversation

Walkthrough

Introduces versioned conversations with transcript/memory versions, new processing-runs tracking, and reprocessing/activation endpoints keyed by conversation_id. Updates controllers, routers, database schema/repositories, the transcription flow, and the UI/API client to support version management. Adds enhanced ASR chunking with timestamp preservation. Documentation updated for architecture and APIs.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant UI as Web UI
    participant API as Conversations Router
    participant CTRL as Conversation Controller
    participant DB as Conversations/AudioChunks
    participant PR as ProcessingRuns Repo
    participant Q as Processing Queue
    UI->>API: POST /api/conversations/{conversation_id}/reprocess-transcript
    API->>CTRL: reprocess_transcript(conversation_id, user)
    CTRL->>DB: Validate ownership, fetch conversation/audio
    CTRL->>CTRL: Resolve audio path, compute config_hash
    CTRL->>PR: create_run(run_type="transcript", status=PENDING)
    CTRL->>DB: create_transcript_version(processing_run_id)
    CTRL->>Q: enqueue transcript processing (conversation_id, version_id)
    CTRL-->>UI: { run_id, version_id, config_hash, status: PENDING }
```
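The reprocess flow above computes a config_hash before creating the run, so identical settings map to the same processing fingerprint. One common way to derive such a hash (the function name and canonicalization choices here are illustrative, not the backend's actual implementation) is canonical JSON through SHA-256:

```python
import hashlib
import json

def config_hash(config: dict) -> str:
    """Hash a processing config deterministically: canonical JSON
    (sorted keys, fixed separators) fed through SHA-256, so the same
    settings always produce the same run fingerprint."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because keys are sorted before hashing, dict insertion order does not change the result, which is what makes the hash usable for deduplicating runs.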
```mermaid
sequenceDiagram
    autonumber
    participant UI as Web UI
    participant API as Conversations Router
    participant CTRL as Conversation Controller
    participant DB as Conversations Repo
    UI->>API: POST /api/conversations/{conversation_id}/activate-transcript/{version_id}
    API->>CTRL: activate_transcript_version(...)
    CTRL->>DB: Set active_transcript_version, sync legacy fields
    CTRL-->>UI: { activated: true, version_id }
```
```mermaid
sequenceDiagram
    autonumber
    participant UI as Web UI
    participant API as Conversations Router
    participant CTRL as Conversation Controller
    participant AC as AudioChunks Repo
    participant CV as Conversations Repo
    participant FS as File Storage
    UI->>API: DELETE /api/conversations/{audio_uuid}
    API->>CTRL: delete_conversation(audio_uuid, user)
    CTRL->>AC: Fetch audio_chunk, verify ownership
    CTRL->>AC: Delete audio_chunk
    alt conversation linked
        CTRL->>CV: Delete conversation (optional)
    end
    CTRL->>FS: Delete related audio files
    CTRL-->>UI: { deleted_audio_chunk: true, deleted_conversation: bool, files: [...] }
```
Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60–90 minutes

Possibly related PRs
Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
@coderabbitai review

✅ Actions performed: Review triggered.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
backends/advanced/src/advanced_omi_backend/transcription.py (1)
309-315: Double memory queueing: this reintroduces the duplicate memory bug

You call `_queue_memory_processing` here, while the controllers also close conversations to trigger memory processing. Remove this call, or guard it behind a config flag, to avoid duplicate memories.
```diff
-        # Queue memory processing now that transcription is complete
-        await self._queue_memory_processing()
+        # Memory processing is triggered by conversation closure; avoid duplicate queueing here.
+        # (Left intentionally empty)
```

backends/advanced/src/advanced_omi_backend/database.py (1)
270-274: Remove duplicate method definitions.

The methods `update_transcript` (line 270) and `store_raw_transcript_data` (line 185) are defined twice in the class, which causes the first definitions to be shadowed.

Apply this diff to remove the duplicate definitions:
```diff
-    async def update_transcript(self, audio_uuid, full_transcript):
-        """Update the entire transcript list (for compatibility)."""
-        await self.col.update_one(
-            {"audio_uuid": audio_uuid}, {"$set": {"transcript": full_transcript}}
-        )
     async def update_segment_timing(self, audio_uuid, segment_index, start_time, end_time):
```

And remove the first `store_raw_transcript_data` method at lines 185-198.

Also applies to: 454-500
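The shadowing described above is plain Python behavior: when a class body binds the same method name twice, the later `def` silently wins and the first becomes unreachable. A minimal illustration (toy class, not the actual repository code):

```python
class Repo:
    def update_transcript(self):  # first definition
        return "first"

    def update_transcript(self):  # later definition silently replaces the first
        return "second"

# Only the second definition survives in the class namespace.
print(Repo().update_transcript())  # prints "second"
```

No warning is emitted at runtime, which is why linters flag redefinitions instead.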
🧹 Nitpick comments (45)
extras/asr-services/pyproject.toml (1)
18-19: Pin the httpx upper bound to avoid surprise breakages; verify NumPy bounds consistency across groups.
- Add an upper bound for httpx to stay within the current major until you intentionally upgrade.
- In [dependency-groups.parakeet] NumPy is capped at <2.0 but the main deps are not; consider aligning to avoid resolver conflicts when both groups are used.
Apply this diff for httpx:
```diff
-    "easy-audio-interfaces>=0.7.1",
-    "httpx>=0.28.1",
+    "easy-audio-interfaces>=0.7.1",
+    "httpx>=0.28.1,<1.0",
```

Outside the shown hunk, consider updating the main NumPy dep to align with the parakeet group:
```toml
# in [project.dependencies]
"numpy>=1.26,<2.0"
```

extras/asr-services/.env.template (2)
9-10: Key order nit: place PARAKEET_CONTAINER_PORT before PARAKEET_HOST_PORT.

Keeps dotenv-linter happy and maintains consistent grouping.
Proposed snippet:
```dotenv
PARAKEET_CONTAINER_PORT=8765
PARAKEET_HOST_PORT=8767
```
11-27: Default values here diverge from docker-compose; align or document why.
- .env.template uses 120s/10s/120s while docker-compose defaults are 30s/5s/60s. This can confuse operators.
- The comment recommends 20–40s but sets 120s.
Suggested defaults to match docker-compose and the guidance:
```dotenv
CHUNKING_ENABLED=true
CHUNK_DURATION_SECONDS=30.0
OVERLAP_DURATION_SECONDS=5.0
MIN_AUDIO_FOR_CHUNKING=60.0
CONFIDENCE_THRESHOLD=0.8
```

Also ensure the file ends with a trailing newline for tooling compatibility.
backends/advanced/scripts/delete_all_conversations_api.py (6)
18-29: Make the base URL configurable via env; avoid hard-coding localhost.

Allows targeting non-local deployments and CI/staging.
Apply this diff:
```diff
+BASE_URL = os.getenv("BACKEND_BASE_URL", "http://localhost:8000")
@@
-    base_url = "http://localhost:8000"
+    base_url = BASE_URL
```
30-49: Add client timeouts to aiohttp sessions to prevent hangs; minor robustness.

Network calls can stall without bounds.
Apply this diff:
```diff
-    async with aiohttp.ClientSession() as session:
+    timeout = aiohttp.ClientTimeout(total=float(os.getenv("HTTP_TIMEOUT_SECONDS", "30")))
+    async with aiohttp.ClientSession(timeout=timeout) as session:
@@
         if response.status != 200:
             print(f"Failed to login: {response.status}")
             text = await response.text()
             print(f"Response: {text}")
             sys.exit(1)
```

Repeat the same timeout usage for the session in delete_all_conversations (see below).
54-70: Reuse BASE_URL and apply the same session timeout.

Consistent configuration and resiliency.
```diff
-    base_url = "http://localhost:8000"
+    base_url = BASE_URL
@@
-    async with aiohttp.ClientSession() as session:
+    timeout = aiohttp.ClientTimeout(total=float(os.getenv("HTTP_TIMEOUT_SECONDS", "30")))
+    async with aiohttp.ClientSession(timeout=timeout) as session:
```
82-86: Silence the unused loop variable warning and improve clarity.

Rename the unused key to underscore-prefixed.
```diff
-    for client_id, client_conversations in conversations_dict.items():
-        conversations.extend(client_conversations)
+    for _client_id, client_conversations in conversations_dict.items():
+        conversations.extend(client_conversations)
```
113-121: Treat 200/202/204 as success for DELETE; fix f-string nit.

APIs commonly return 204 No Content on deletion.
```diff
-                if response.status == 200:
+                if response.status in (200, 202, 204):
                     deleted_count += 1
                     print(f"Deleted conversation {audio_uuid} ({deleted_count}/{len(conversations)})")
                 else:
                     failed_count += 1
                     text = await response.text()
                     print(f"Failed to delete {audio_uuid}: {response.status} - {text}")
@@
-    print(f"\nDeletion complete:")
+    print("\nDeletion complete:")
```

Optional: batch deletions with bounded concurrency (e.g., asyncio.Semaphore) to speed up large purges without overloading the API.
91-97: Confirmation UX: accept common variants.

Consider accepting "y/yes" and defaulting to "no" on empty input.
```diff
-    response = input(f"Are you sure you want to delete ALL {len(conversations)} conversations? (yes/no): ")
-    if response.lower() != "yes":
+    response = input(f"Are you sure you want to delete ALL {len(conversations)} conversations? [y/N]: ").strip().lower()
+    if response not in ("y", "yes"):
         print("Deletion cancelled")
         return
```

extras/asr-services/docker-compose.yml (1)
24-29: Align defaults with .env.template (or explain the difference).

Current compose defaults (30/5/60) differ from .env.template (120/10/120). Pick one source of truth to avoid surprise behavior.
If aligning to the shorter chunks:
```diff
-      - CHUNKING_ENABLED=${CHUNKING_ENABLED:-true}
-      - CHUNK_DURATION_SECONDS=${CHUNK_DURATION_SECONDS:-30.0}
-      - OVERLAP_DURATION_SECONDS=${OVERLAP_DURATION_SECONDS:-5.0}
-      - MIN_AUDIO_FOR_CHUNKING=${MIN_AUDIO_FOR_CHUNKING:-60.0}
-      - CONFIDENCE_THRESHOLD=${CONFIDENCE_THRESHOLD:-0.8}
+      - CHUNKING_ENABLED=${CHUNKING_ENABLED:-true}
+      - CHUNK_DURATION_SECONDS=${CHUNK_DURATION_SECONDS:-30.0}
+      - OVERLAP_DURATION_SECONDS=${OVERLAP_DURATION_SECONDS:-5.0}
+      - MIN_AUDIO_FOR_CHUNKING=${MIN_AUDIO_FOR_CHUNKING:-60.0}
+      - CONFIDENCE_THRESHOLD=${CONFIDENCE_THRESHOLD:-0.8}
```

Optionally reference an env_file here to centralize configuration.

backends/advanced/webui/src/services/api.ts (1)
backends/advanced/webui/src/services/api.ts (1)
80-82: Type the version selector explicitly.
transcriptVersionIdoften uses the sentinel "active". Consider a union type to self-document this contract.- reprocessMemory: (audioUuid: string, transcriptVersionId: string) => api.post(`/api/conversations/${audioUuid}/reprocess-memory`, null, { + reprocessMemory: (audioUuid: string, transcriptVersionId: 'active' | string) => api.post(`/api/conversations/${audioUuid}/reprocess-memory`, null, {backends/advanced/webui/src/pages/Conversations.tsx (3)
76-81: Dropdown close handler may also close when interacting inside the menu.

Currently any document click closes the menu. If you want to allow clicks inside without closing, guard against clicks originating within the dropdown root.

```diff
-  useEffect(() => {
-    const handleClickOutside = () => setOpenDropdown(null)
-    document.addEventListener('click', handleClickOutside)
-    return () => document.removeEventListener('click', handleClickOutside)
-  }, [])
+  useEffect(() => {
+    const handleClickOutside = (e: MouseEvent) => {
+      const target = e.target as HTMLElement
+      if (!target.closest('[data-dropdown-root="conversation-actions"]')) {
+        setOpenDropdown(null)
+      }
+    }
+    document.addEventListener('click', handleClickOutside)
+    return () => document.removeEventListener('click', handleClickOutside)
+  }, [])
```

And add `data-dropdown-root="conversation-actions"` to the dropdown container.
296-337: Add a root marker to support click-outside guards.

If you keep the global click listener, tag the dropdown root for precise containment checks.

```diff
-  <div className="absolute right-0 top-8 w-48 ... py-2 z-10">
+  <div data-dropdown-root="conversation-actions" className="absolute right-0 top-8 w-48 ... py-2 z-10">
```
341-378: Clear segment timer on natural audio end.

Prevents a stray timeout from firing after the audio ends by itself.

```diff
-  audio.addEventListener('ended', () => {
-    setPlayingSegment(null);
-  });
+  audio.addEventListener('ended', () => {
+    if (segmentTimerRef.current) {
+      window.clearTimeout(segmentTimerRef.current);
+      segmentTimerRef.current = null;
+    }
+    setPlayingSegment(null);
+  });
```

backends/advanced/upload_files.py (3)
410-444: Prefer server-provided status_url and fail fast on missing files.

- Use `status_url` from the submission response if present to avoid client/route drift.
- File existence handling is good; keep the early exit.

```diff
-    logger.info(f"🔗 Status URL: {job_data.get('status_url', 'N/A')}")
+    status_url = job_data.get("status_url")
+    logger.info(f"🔗 Status URL: {status_url or 'N/A'}")
...
-    return poll_job_status(job_id, token, base_url, total_files)
+    return poll_job_status(job_id, token, base_url, total_files) if not status_url \
+        else poll_job_status(job_id, token, base_url, total_files)  # Backward compat if poller doesn't accept URL yet
```

If you’re open to touching the poller, pass `status_url` explicitly and only compute a fallback when it’s missing.
249-252: Avoid closing file handles twice.

You close in the normal path and again in `finally`. Keep only the `finally` to simplify and remove benign warnings.

```diff
-    # Close all file handles
-    for _, file_tuple in files_data:
-        file_tuple[1].close()
```
191-199: Comment and behavior diverge.

The comment says "filter out files over 20 minutes" but the filter is removed (intentionally per PR summary). Update the comment to avoid confusion.

```diff
-    # Check duration and filter out files over 20 minutes
+    # Check duration for logging; no duration-based filtering applied
```

CLAUDE.md (2)
662-671: Sanitize example tokens to stop secret scanners from flagging false positives.

Replace bearer tokens and JWT-like strings with obvious placeholders.

```diff
 # Step 2: Get auth token
 # Step 3: Use token in file upload
-curl -X POST \
-  -H "Authorization: Bearer YOUR_TOKEN_HERE" \
+curl -X POST \
+  -H "Authorization: Bearer <TOKEN>" \
   -F "files=@/path/to/audio.wav" \
   -F "device_name=test-upload" \
   http://localhost:8000/api/process-audio-files
```

Also replace all occurrences of `eyJ...` example JWTs with `<TOKEN>` and headers with `Authorization: Bearer <TOKEN>` across this section to appease Gitleaks.
741-746: Docs/routes mismatch for transcript activation.

Frontend calls `POST /api/conversations/{audio_uuid}/activate-transcript/{version_id}`; this doc shows a JSON body with `-d` and no path param. Align docs with the implemented route.

```diff
-  -d '{"transcript_version_id": "VERSION_ID"}' \
-  http://localhost:8000/api/conversations/YOUR_AUDIO_UUID/activate-transcript
+  http://localhost:8000/api/conversations/YOUR_AUDIO_UUID/activate-transcript/VERSION_ID
```

Do the same for memory activation for consistency.

extras/asr-services/enhanced_chunking.py (5)
extras/asr-services/enhanced_chunking.py (5)
62-69: Silence linter about unused parameter without changing parent signature.Match parent signature but explicitly mark
keep_logitsunused.- def _get_batch_preds(self, keep_logits=False): + def _get_batch_preds(self, keep_logits=False): + _ = keep_logits # unused, required by parent signature
146-149: Prefer logging.exception and bare raise for trace preservation.

Keep the original traceback and avoid shadowing the exception.

```diff
-        except Exception as e:
-            logger.error(f"Hypothesis joining FAILED: {e}")
-            raise e  # Don't silently fall back
+        except Exception:
+            logger.exception("Hypothesis joining FAILED")
+            raise
```
295-355: Timestamp extraction: fallbacks are OK; prefer the native utilities when available.This function duplicates logic already covered by
extract_timestamps_from_hypotheses_native. Consider consolidating to one path and removing the blindexcept Exceptionswallowing at the end.- except Exception as e: - logger.error(f"Native timestamp extraction FAILED: {e}") - raise e # Don't silently fall back + except Exception: + logger.exception("Native timestamp extraction FAILED") + raiseAnd either remove
extract_timestamps_from_hypothesesor make it a thin wrapper that calls the native variant and only falls back when strictly necessary.
489-512: Remove f-strings without placeholders.These trigger linters; they’re plain strings.
- logger.debug(f"Initializing NeMo chunked processor...") + logger.debug("Initializing NeMo chunked processor...") ... - logger.debug(f"Loading audio file into chunker...") + logger.debug("Loading audio file into chunker...")Repeat for the other similar lines later in this function.
589-599: Use logging.exception and re-raise.Avoid losing stack traces on failure.
```diff
-        logger.error(f"Enhanced chunking failed: {e}")
-        raise
+        logger.exception("Enhanced chunking failed")
+        raise
```

extras/asr-services/parakeet-offline.py (5)
137-162: Portability: avoid double-opening the same NamedTemporaryFile handle

You open `wave.open(tmpfile_name, 'wb')` while the NamedTemporaryFile handle is still open. Works on Linux, brittle on Windows. Close before `wave.open`, or don’t keep the handle.
```diff
-with tempfile.NamedTemporaryFile(suffix='.wav', delete=False) as tmpfile:
-    tmpfile_name = tmpfile.name
+with tempfile.NamedTemporaryFile(suffix='.wav', delete=False) as tmpfile:
+    tmpfile_name = tmpfile.name
+    pass  # only to reserve a unique path
+# Reopen separately to write WAV
 with wave.open(tmpfile_name, 'wb') as wav_file:
```
243-264: Harden word timestamp extraction

Directly indexing `result.timestamp['word']` can raise if the structure differs. Guard presence to avoid 500s.
```diff
-# Extract word-level timestamps - NeMo Parakeet format
-words = []
-for word_data in result.timestamp['word']:
+words = []
+ts = getattr(result, "timestamp", None)
+if isinstance(ts, dict) and "word" in ts and isinstance(ts["word"], list):
+    for word_data in ts["word"]:
         word_dict = {
             "word": word_data['word'],
             "start": word_data['start'],
             "end": word_data['end'],
             "confidence": 1.0,
         }
         words.append(word_dict)
```
286-290: Prefer bare raise and include traceback in logs

Use logger.exception for the stack trace and `raise` to preserve context.
```diff
-except Exception as e:
-    logger.error(f"Error during transcription: {e}")
-    # Re-raise the exception so HTTP endpoint can return proper error code
-    raise e
+except Exception:
+    logger.exception("Error during transcription")
+    # Re-raise the exception so HTTP endpoint can return proper error code
+    raise
```
557-561: Return richer trace on errors in the /transcribe handler

Same: use logger.exception for the stack trace.
```diff
-except Exception as e:
-    error_time = time.time() - request_start
-    logger.error(f"🕐 TIMING: Error occurred after {error_time:.3f}s - {e}")
-    raise HTTPException(status_code=500, detail=str(e))
+except Exception:
+    error_time = time.time() - request_start
+    logger.exception(f"🕐 TIMING: Error occurred after {error_time:.3f}s")
+    raise HTTPException(status_code=500, detail="Transcription failed")
```
273-281: Reduce transcript content logging to avoid PII leakage and log bloat

Logging the first 100 chars and full keys at INFO can leak user content and inflate logs. Gate behind DEBUG.
```diff
-logger.info(f"🔍 PARAKEET RESPONSE DEBUG:")
-logger.info(f"  - Text length: {len(response_data['text'])}")
-logger.info(f"  - First 100 chars: {repr(response_data['text'][:100])}")
-logger.info(f"  - Words count: {len(response_data['words'])}")
-logger.info(f"  - First 3 words: {response_data['words'][:3] if response_data['words'] else 'None'}")
-logger.info(f"  - Full response keys: {list(response_data.keys())}")
+logger.debug("🔍 PARAKEET RESPONSE DEBUG:")
+logger.debug(f"  - Text length: {len(response_data['text'])}")
+logger.debug(f"  - First 100 chars: {repr(response_data['text'][:100])}")
+logger.debug(f"  - Words count: {len(response_data['words'])}")
+logger.debug(f"  - First 3 words: {response_data['words'][:3] if response_data['words'] else 'None'}")
+logger.debug(f"  - Full response keys: {list(response_data.keys())}")
```

backends/advanced/src/advanced_omi_backend/transcription_providers.py (2)
555-558: Preserve traceback: use bare raise

Re-raising with `raise e` drops context.
```diff
-        except Exception as e:
-            logger.error(f"Error calling Parakeet service: {e}")
-            raise e
+        except Exception:
+            logger.exception("Error calling Parakeet service")
+            raise
```
529-546: Consider dynamic timeouts for long audio uploads

A hard‑coded 180s may be short for multi‑minute WAVs; you already use size‑based timeouts for Deepgram. Mirror that approach here.
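The size-based idea mentioned above amounts to deriving the read timeout from the payload size before building the client; the rate and cap in this sketch are illustrative assumptions, not values from the codebase:

```python
def upload_timeout_seconds(num_bytes: int,
                           base: float = 30.0,
                           bytes_per_second: float = 256_000.0,
                           cap: float = 900.0) -> float:
    """Fixed base for connect/processing overhead, plus the time to push
    the payload at a conservative assumed upload rate, capped so a huge
    file cannot request an unbounded timeout."""
    return min(cap, base + num_bytes / bytes_per_second)
```

The resulting number of seconds would then be fed into the HTTP client's read timeout when submitting the WAV.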
If desired, I can propose a size‑aware httpx.Timeout similar to Deepgram’s path.

backends/advanced/src/advanced_omi_backend/transcript_coordinator.py (1)
backends/advanced/src/advanced_omi_backend/transcript_coordinator.py (1)
87-106: Thread-safety/readability nitsSetting transcript_failures without lock is okay for CPython but wrap in lock for parity with event creation and to ease future maintenance.
- def signal_transcript_failed(self, audio_uuid: str, error_message: str): + async def signal_transcript_failed(self, audio_uuid: str, error_message: str): @@ - # Store the failure message - self.transcript_failures[audio_uuid] = error_message + async with self._lock: + self.transcript_failures[audio_uuid] = error_messageNote: adjust call sites to await.
backends/advanced/src/advanced_omi_backend/transcription.py (6)
101-118: Propagate full traceback and keep coordinator signalingUse logger.exception for context. This also aligns with TRY400.
- except Exception as e: - # Signal transcription failure - logger.error(f"Transcription failed for {self._current_audio_uuid}: {e}") + except Exception: + # Signal transcription failure + logger.exception(f"Transcription failed for {self._current_audio_uuid}") if self._current_audio_uuid: # Update database status to FAILED if self.chunk_repo: await self.chunk_repo.update_transcription_status( - self._current_audio_uuid, "FAILED", error_message=str(e) + self._current_audio_uuid, "FAILED", error_message="transcription failed" ) # Signal coordinator about failure coordinator = get_transcript_coordinator() - coordinator.signal_transcript_failed(self._current_audio_uuid, str(e)) + coordinator.signal_transcript_failed(self._current_audio_uuid, "transcription failed")
147-154: Preserve traceback on provider error

Re-raising with `raise e` loses context.
```diff
-        except Exception as e:
-            logger.error(f"Error getting transcript from {self.provider.name}: {e}")
+        except Exception:
+            logger.exception(f"Error getting transcript from {self.provider.name}")
             # Clean up buffer before re-raising
             self._audio_buffer.clear()
             self._audio_start_time = None
             self._collecting = False
-            raise e
+            raise
```
271-299: Segment storage: absolute_timestamp derivation is weak

Using time.time() + start yields non-deterministic absolute timestamps. Prefer deriving from the first chunk’s timestamp or the persisted conversation start.
If the chunk_repo exposes a base timestamp, I can wire it here. Otherwise, consider storing relative timestamps only.
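Anchoring to a persisted conversation start would make the derived absolute times reproducible across reprocesses; a sketch under that assumption (field names hypothetical):

```python
def absolute_segment_times(base_epoch: float, segments: list[dict]) -> list[dict]:
    """Derive absolute timestamps from the conversation's stored start
    time plus each segment's relative offsets, instead of time.time()
    at processing time (which changes on every reprocess)."""
    return [
        {**seg,
         "absolute_start": base_epoch + seg["start"],
         "absolute_end": base_epoch + seg["end"]}
        for seg in segments
    ]
```

Reprocessing the same audio with the same stored base then yields identical absolute timestamps, which versioned transcripts can rely on.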
59-62: Comment drift

The comment says memory processing is handled exclusively by conversation closure and the method was removed, but _queue_memory_processing still exists and is called later. Update comments/code to match.
559-567: Reconnect path: add jitter/backoff

On repeated provider errors, a fixed 2s wait can hot‑loop. Consider exponential backoff with a cap.
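Capped exponential backoff with jitter is the standard fix for that hot loop; a minimal generator sketch (parameter values illustrative):

```python
import random

def backoff_delays(base: float = 2.0, cap: float = 60.0, jitter: float = 0.5):
    """Yield reconnect delays that double on each attempt up to `cap`,
    plus a random jitter fraction so parallel clients desynchronize."""
    attempt = 0
    while True:
        delay = min(cap, base * (2 ** attempt))
        yield delay + random.uniform(0.0, jitter * delay)
        attempt += 1
```

The reconnect loop would sleep for `next(delays)` before each retry and recreate the generator after a successful connection so the delay resets.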
353-361: Normalization robustness

If the provider returns unexpected types, you return an empty dict silently. Consider logging at WARN and including the provider name to ease debugging.
backends/advanced/src/advanced_omi_backend/routers/modules/conversation_routes.py (1)
95-101: Consider making transcript_version_id optional with a sensible default.

Currently, `transcript_version_id` is required (`Query(...)`), but given that users might want to reprocess the currently active transcript version, consider making it optional and defaulting to the active version in the controller.

Apply this diff to make the parameter optional with a default:

```diff
 async def reprocess_memory(
     audio_uuid: str,
     current_user: User = Depends(current_active_user),
-    transcript_version_id: str = Query(...)
+    transcript_version_id: str = Query(None, description="Transcript version ID (defaults to active version)")
 ):
```

backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py (5)
340-383: Remove excessive debug logging in production code.

The extensive debug logging (lines 340-383) appears to be troubleshooting code that should not be in production. This level of logging could impact performance and fill up logs unnecessarily.
Apply this diff to remove the debug logging:
```diff
 async def delete_conversation(audio_uuid: str, user: User):
     """Delete a conversation and its associated audio file. Users can only delete their own conversations."""
     try:
-        # DEBUG: Log comprehensive details about the search
-        logger.info(f"DELETE DEBUG: Starting delete for audio_uuid='{audio_uuid}'")
-        logger.info(f"DELETE DEBUG: audio_uuid type={type(audio_uuid)}, length={len(audio_uuid)}")
-        logger.info(f"DELETE DEBUG: audio_uuid repr={repr(audio_uuid)}")
-        logger.info(f"DELETE DEBUG: audio_uuid bytes={audio_uuid.encode('utf-8').hex()}")
-        logger.info(f"DELETE DEBUG: Database collection={chunks_col.name}, database={chunks_col.database.name}")
-
-        # DEBUG: Test if ANY conversations exist in the collection
-        total_count = await chunks_col.count_documents({})
-        logger.info(f"DELETE DEBUG: Total conversations in collection: {total_count}")
-
-        # DEBUG: Try to find a few conversations to compare
-        sample_conversations = []
-        async for doc in chunks_col.find({}).limit(3):
-            sample_conversations.append({
-                "audio_uuid": doc.get("audio_uuid"),
-                "audio_uuid_type": type(doc.get("audio_uuid")),
-                "audio_uuid_repr": repr(doc.get("audio_uuid"))
-            })
-        logger.info(f"DELETE DEBUG: Sample conversations: {sample_conversations}")
-
-        # DEBUG: Execute the exact query we're about to use
-        query = {"audio_uuid": audio_uuid}
-        logger.info(f"DELETE DEBUG: Query being executed: {query}")
-
         # First, get the conversation to check ownership
-        conversation = await chunks_col.find_one(query)
-
-        # DEBUG: Log the result
-        logger.info(f"DELETE DEBUG: Query result: {conversation is not None}")
-        if conversation:
-            logger.info(f"DELETE DEBUG: Found conversation with client_id={conversation.get('client_id')}")
-        else:
-            # DEBUG: Try alternative queries to see what might work
-            logger.info("DELETE DEBUG: Trying alternative queries...")
-
-            # Try with regex (case insensitive)
-            regex_result = await chunks_col.find_one({"audio_uuid": {"$regex": f"^{audio_uuid}$", "$options": "i"}})
-            logger.info(f"DELETE DEBUG: Case-insensitive regex query result: {regex_result is not None}")
-
-            # Try to find any conversation containing this uuid as substring
-            contains_result = await chunks_col.find_one({"audio_uuid": {"$regex": audio_uuid}})
-            logger.info(f"DELETE DEBUG: Contains substring query result: {contains_result is not None}")
+        conversation = await chunks_col.find_one({"audio_uuid": audio_uuid})
```
420-429: Use logger.exception for better error tracking.

When catching exceptions during file deletion, use `logger.exception` instead of `logger.warning` to preserve the full stack trace.

Apply this diff to improve error logging:
```diff
         except Exception as e:
-            logger.warning(f"Failed to delete audio file {audio_path}: {e}")
+            logger.exception(f"Failed to delete audio file {audio_path}")
         if cropped_audio_path:
             try:
                 # Construct full path to cropped audio file
                 full_cropped_path = Path("/app/audio_chunks") / cropped_audio_path
                 if full_cropped_path.exists():
                     full_cropped_path.unlink()
                     deleted_files.append(str(full_cropped_path))
                     logger.info(f"Deleted cropped audio file: {full_cropped_path}")
             except Exception as e:
-                logger.warning(f"Failed to delete cropped audio file {cropped_audio_path}: {e}")
+                logger.exception(f"Failed to delete cropped audio file {cropped_audio_path}")
```

Also applies to: 431-439
503-509: Consider extracting configuration values to constants.

The hardcoded transcription provider "deepgram" and other configuration values should be extracted to configuration settings or constants for maintainability.
Consider creating a configuration module or using environment variables:
```python
# At the top of the file
DEFAULT_TRANSCRIPTION_PROVIDER = os.getenv("TRANSCRIPTION_PROVIDER", "deepgram")

# Then in the function:
config_data = {
    "audio_path": str(full_audio_path),
    "transcription_provider": DEFAULT_TRANSCRIPTION_PROVIDER,
    "trigger": "manual_reprocess"
}
```
550-633: Handle the optional transcript_version_id parameter.

Following the suggestion for the route, update the controller to handle an optional `transcript_version_id` parameter that defaults to the active version.

Apply this diff to handle the optional parameter:
```diff
-async def reprocess_memory(audio_uuid: str, transcript_version_id: str, user: User):
+async def reprocess_memory(audio_uuid: str, transcript_version_id: Optional[str], user: User):
     """Reprocess memory extraction for a specific transcript version. Users can only reprocess their own conversations."""
     try:
         # Find the conversation
         chunk = await chunks_col.find_one({"audio_uuid": audio_uuid})
         if not chunk:
             return JSONResponse(status_code=404, content={"error": "Conversation not found"})

         # Check ownership for non-admin users
         if not user.is_superuser:
             if not client_belongs_to_user(chunk["client_id"], user.user_id):
                 return JSONResponse(status_code=404, content={"error": "Conversation not found"})

+        # Default to active version if not specified
+        if not transcript_version_id:
+            transcript_version_id = chunk.get("active_transcript_version")
+            if not transcript_version_id:
+                return JSONResponse(
+                    status_code=404, content={"error": "No active transcript version found"}
+                )
+
         # Resolve transcript version ID
         transcript_versions = chunk.get("transcript_versions", [])

         # Handle special "active" version ID
         if transcript_version_id == "active":
             active_version_id = chunk.get("active_transcript_version")
             if not active_version_id:
                 return JSONResponse(
                     status_code=404, content={"error": "No active transcript version found"}
                 )
             transcript_version_id = active_version_id
```
531-533: Track TODO comments for future implementation.

There are TODO comments indicating incomplete integration with the ProcessorManager and the memory processor. These should be tracked as issues.
Would you like me to create GitHub issues to track these TODO items for integrating with the ProcessorManager and memory processor?
Also applies to: 616-618
backends/advanced/src/advanced_omi_backend/database.py (1)
326-327: Use explicit Optional type annotations.

PEP 484 prohibits implicit Optional. Parameters with default value `None` should use an `Optional[T]` type annotation.

Apply this diff to fix the type annotations:
```diff
 async def update_memory_processing_status(
-    self, audio_uuid: str, status: str, error_message: str = None
+    self, audio_uuid: str, status: str, error_message: Optional[str] = None
 ):

 async def update_transcription_status(
-    self, audio_uuid: str, status: str, provider: str = None, error_message: str = None
+    self, audio_uuid: str, status: str, provider: Optional[str] = None, error_message: Optional[str] = None
 ):

 async def create_transcript_version(
     self,
     audio_uuid: str,
-    segments: list = None,
-    processing_run_id: str = None,
-    provider: str = None,
-    raw_data: dict = None
+    segments: Optional[list] = None,
+    processing_run_id: Optional[str] = None,
+    provider: Optional[str] = None,
+    raw_data: Optional[dict] = None
 ) -> Optional[str]:

 async def create_memory_version(
     self,
     audio_uuid: str,
     transcript_version_id: str,
-    memories: list = None,
-    processing_run_id: str = None
+    memories: Optional[list] = None,
+    processing_run_id: Optional[str] = None
 ) -> Optional[str]:

 async def create_run(
     self,
     *,
     audio_uuid: str,
     run_type: str,  # 'transcript' or 'memory'
     user_id: str,
     trigger: str,  # 'manual_reprocess', 'initial_processing', etc.
-    config_hash: str = None
+    config_hash: Optional[str] = None
 ) -> str:

 async def update_run_status(
     self,
     run_id: str,
     status: str,
-    error_message: str = None,
-    result_version_id: str = None
+    error_message: Optional[str] = None,
+    result_version_id: Optional[str] = None
 ) -> bool:
```

Also applies to: 387-388, 538-541, 570-571, 692-692, 717-718
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
- backends/advanced/uv.lock is excluded by !**/*.lock
- extras/asr-services/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (18)
- CLAUDE.md (4 hunks)
- backends/advanced/scripts/delete_all_conversations_api.py (1 hunks)
- backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py (4 hunks)
- backends/advanced/src/advanced_omi_backend/controllers/system_controller.py (2 hunks)
- backends/advanced/src/advanced_omi_backend/database.py (6 hunks)
- backends/advanced/src/advanced_omi_backend/routers/modules/conversation_routes.py (2 hunks)
- backends/advanced/src/advanced_omi_backend/transcript_coordinator.py (5 hunks)
- backends/advanced/src/advanced_omi_backend/transcription.py (7 hunks)
- backends/advanced/src/advanced_omi_backend/transcription_providers.py (1 hunks)
- backends/advanced/upload_files.py (3 hunks)
- backends/advanced/webui/src/pages/Conversations.tsx (6 hunks)
- backends/advanced/webui/src/services/api.ts (1 hunks)
- extras/asr-services/.dockerignore (1 hunks)
- extras/asr-services/.env.template (1 hunks)
- extras/asr-services/docker-compose.yml (1 hunks)
- extras/asr-services/enhanced_chunking.py (1 hunks)
- extras/asr-services/parakeet-offline.py (7 hunks)
- extras/asr-services/pyproject.toml (1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-09-09T02:03:55.976Z
Learnt from: CR
PR: AnkushMalaker/friend-lite#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-09T02:03:55.976Z
Learning: Applies to **/.dockerignore : Docker projects use .dockerignore with an exclude pattern; explicitly include required files so they are sent to the Docker build context
Applied to files:
extras/asr-services/.dockerignore
📚 Learning: 2025-09-09T02:03:55.976Z
Learnt from: CR
PR: AnkushMalaker/friend-lite#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-09T02:03:55.976Z
Learning: Applies to backends/advanced/src/main.py : For /ws_pcm, implement full Wyoming protocol sessioning: only process audio chunks between audio-start and audio-stop events to define conversation segments
Applied to files:
backends/advanced/src/advanced_omi_backend/controllers/system_controller.py
📚 Learning: 2025-09-09T02:03:55.976Z
Learnt from: CR
PR: AnkushMalaker/friend-lite#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-09T02:03:55.976Z
Learning: Applies to backends/advanced/src/**/*.py : Store conversations in the MongoDB "audio_chunks" collection; transcripts must be arrays of segments with fields: text, speaker, start, end
Applied to files:
backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py
backends/advanced/src/advanced_omi_backend/database.py
backends/advanced/src/advanced_omi_backend/transcription.py
🪛 Ruff (0.12.2)
backends/advanced/src/advanced_omi_backend/transcription_providers.py
557-557: Use raise without specifying exception name
Remove exception name
(TRY201)
backends/advanced/scripts/delete_all_conversations_api.py
82-82: Loop control variable client_id not used within loop body
Rename unused client_id to _client_id
(B007)
121-121: f-string without any placeholders
Remove extraneous f prefix
(F541)
extras/asr-services/enhanced_chunking.py
62-62: Unused method argument: keep_logits
(ARG002)
145-145: Consider moving this statement to an else block
(TRY300)
147-147: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
148-148: Use raise without specifying exception name
Remove exception name
(TRY201)
296-296: Loop control variable i not used within loop body
Rename unused i to _i
(B007)
354-354: Consider moving this statement to an else block
(TRY300)
357-357: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
358-358: Use raise without specifying exception name
Remove exception name
(TRY201)
425-425: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
449-449: Do not catch blind exception: Exception
(BLE001)
450-450: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
462-462: Do not catch blind exception: Exception
(BLE001)
463-463: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
466-466: Consider moving this statement to an else block
(TRY300)
468-468: Do not catch blind exception: Exception
(BLE001)
469-469: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
499-499: f-string without any placeholders
Remove extraneous f prefix
(F541)
515-515: f-string without any placeholders
Remove extraneous f prefix
(F541)
530-530: f-string without any placeholders
Remove extraneous f prefix
(F541)
541-541: f-string without any placeholders
Remove extraneous f prefix
(F541)
551-551: f-string without any placeholders
Remove extraneous f prefix
(F541)
562-562: f-string without any placeholders
Remove extraneous f prefix
(F541)
595-595: Consider moving this statement to an else block
(TRY300)
598-598: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
backends/advanced/src/advanced_omi_backend/routers/modules/conversation_routes.py
88-88: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
97-97: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
108-108: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
118-118: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
126-126: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
134-134: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py
343-343: Use explicit conversion flag
Replace with conversion flag
(RUF010)
427-427: Do not catch blind exception: Exception
(BLE001)
438-438: Do not catch blind exception: Exception
(BLE001)
452-452: Do not catch blind exception: Exception
(BLE001)
453-453: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
456-456: Use explicit conversion flag
Replace with conversion flag
(RUF010)
545-545: Do not catch blind exception: Exception
(BLE001)
546-546: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
630-630: Do not catch blind exception: Exception
(BLE001)
631-631: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
665-665: Do not catch blind exception: Exception
(BLE001)
666-666: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
697-697: Do not catch blind exception: Exception
(BLE001)
698-698: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
720-720: Do not catch blind exception: Exception
(BLE001)
721-721: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
extras/asr-services/parakeet-offline.py
148-148: Avoid specifying long messages outside the exception class
(TRY003)
196-196: Do not catch blind exception: Exception
(BLE001)
289-289: Use raise without specifying exception name
Remove exception name
(TRY201)
557-557: Do not catch blind exception: Exception
(BLE001)
559-559: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
backends/advanced/src/advanced_omi_backend/database.py
168-168: f-string without any placeholders
Remove extraneous f prefix
(F541)
326-326: PEP 484 prohibits implicit Optional
Convert to Optional[T]
(RUF013)
365-365: f-string without any placeholders
Remove extraneous f prefix
(F541)
366-366: f-string without any placeholders
Remove extraneous f prefix
(F541)
373-373: f-string without any placeholders
Remove extraneous f prefix
(F541)
387-387: PEP 484 prohibits implicit Optional
Convert to Optional[T]
(RUF013)
387-387: PEP 484 prohibits implicit Optional
Convert to Optional[T]
(RUF013)
428-428: f-string without any placeholders
Remove extraneous f prefix
(F541)
429-429: f-string without any placeholders
Remove extraneous f prefix
(F541)
434-434: f-string without any placeholders
Remove extraneous f prefix
(F541)
439-439: f-string without any placeholders
Remove extraneous f prefix
(F541)
454-454: Redefinition of unused update_transcript from line 270
(F811)
502-502: Redefinition of unused store_raw_transcript_data from line 185
(F811)
538-538: PEP 484 prohibits implicit Optional
Convert to Optional[T]
(RUF013)
539-539: PEP 484 prohibits implicit Optional
Convert to Optional[T]
(RUF013)
540-540: PEP 484 prohibits implicit Optional
Convert to Optional[T]
(RUF013)
541-541: PEP 484 prohibits implicit Optional
Convert to Optional[T]
(RUF013)
570-570: PEP 484 prohibits implicit Optional
Convert to Optional[T]
(RUF013)
571-571: PEP 484 prohibits implicit Optional
Convert to Optional[T]
(RUF013)
692-692: PEP 484 prohibits implicit Optional
Convert to Optional[T]
(RUF013)
717-717: PEP 484 prohibits implicit Optional
Convert to Optional[T]
(RUF013)
718-718: PEP 484 prohibits implicit Optional
Convert to Optional[T]
(RUF013)
backends/advanced/src/advanced_omi_backend/transcript_coordinator.py
58-58: Create your own exception
(TRY002)
58-58: Avoid specifying long messages outside the exception class
(TRY003)
backends/advanced/src/advanced_omi_backend/transcription.py
106-106: Do not catch blind exception: Exception
(BLE001)
108-108: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
153-153: Use raise without specifying exception name
Remove exception name
(TRY201)
🪛 Gitleaks (8.27.2)
CLAUDE.md
[high] 665-666: Discovered a potential authorization token provided in a curl command header, which could compromise the curl accessed resource.
(curl-auth-header)
[high] 698-698: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
[high] 704-704: Discovered a potential authorization token provided in a curl command header, which could compromise the curl accessed resource.
(curl-auth-header)
[high] 708-709: Discovered a potential authorization token provided in a curl command header, which could compromise the curl accessed resource.
(curl-auth-header)
[high] 721-721: Discovered a potential authorization token provided in a curl command header, which could compromise the curl accessed resource.
(curl-auth-header)
[high] 725-726: Discovered a potential authorization token provided in a curl command header, which could compromise the curl accessed resource.
(curl-auth-header)
[high] 731-732: Discovered a potential authorization token provided in a curl command header, which could compromise the curl accessed resource.
(curl-auth-header)
[high] 737-737: Discovered a potential authorization token provided in a curl command header, which could compromise the curl accessed resource.
(curl-auth-header)
[high] 741-742: Discovered a potential authorization token provided in a curl command header, which could compromise the curl accessed resource.
(curl-auth-header)
🪛 dotenv-linter (3.3.0)
extras/asr-services/.env.template
[warning] 9-9: [UnorderedKey] The PARAKEET_CONTAINER_PORT key should go before the PARAKEET_HOST_PORT key
(UnorderedKey)
[warning] 27-27: [EndingBlankLine] No blank line at the end of the file
(EndingBlankLine)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: integration-tests
🔇 Additional comments (16)
extras/asr-services/.dockerignore (1)
4-4: Good call un-ignoring enhanced_chunking.py. This ensures the feature ships with the image, given the blanket “*” ignore.
If enhanced_chunking.py imports any local helpers/modules, make sure those files are also explicitly un-ignored, or they won’t reach the build context.
backends/advanced/webui/src/pages/Conversations.tsx (2)
438-440: Nice debug detail. The explicit [start, end, duration] readout is useful when validating cropping and diarization.
118-126: Confirmed: backend accepts 'active' as transcript_version_id
conversation_controller treats "active" as a sentinel and substitutes chunk["active_transcript_version"] (backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py:566–573).

backends/advanced/upload_files.py (2)
377-391: Good: CLI surfaced with base URL parameterization.
`parse_args()` is clear and covers the primary use-cases. No issues here.
289-293: Double-check job status route naming. Submission endpoint is POST /process-audio-files-async (backends/advanced/src/advanced_omi_backend/routers/modules/system_routes.py) but the controller returns "status_url": "/api/process-audio-files/jobs/{job_id}" (backends/advanced/src/advanced_omi_backend/controllers/system_controller.py:482). I couldn't find a route handler for /api/process-audio-files/jobs/{job_id}; confirm a polling GET exists or update the returned status_url to the actual polling endpoint.

backends/advanced/webui/src/services/api.ts (1)
78-87: Approve — reprocessing API routes & params verifiedRouter endpoints and controller signatures exist and align with the frontend; reprocess_memory accepts transcript_version_id. Files: backends/advanced/src/advanced_omi_backend/routers/modules/conversation_routes.py, backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py.
extras/asr-services/enhanced_chunking.py (1)
100-109: Align chunk offsets to featurizer frames; map encoder timesteps with feature_index = chunk_offset + t * subsampling_factorextras/asr-services/enhanced_chunking.py (lines 100–109) — chunk_offsets should be cumulative featurizer frames (feat_signal_len). When joining RNNT word timestamps (encoder timesteps t) compute:
feature_index = chunk_feature_offset + t * subsampling_factor
time_seconds = feature_index * featurizer.window_stride_seconds
Do not divide chunk offsets by subsampling_factor; instead multiply encoder timesteps and ensure chunk_feature_offset is the cumulative feature frames (or floor(chunk_start_samples / window_stride_samples)). Confirm downstream timestamp-joining code uses this mapping.backends/advanced/src/advanced_omi_backend/transcript_coordinator.py (1)
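Under those assumptions, the mapping can be sketched as follows (the function name is illustrative, not part of the codebase; the offset/stride values in the usage note are made-up examples):

```python
def encoder_timestep_to_seconds(
    chunk_feature_offset: int,
    timestep: int,
    subsampling_factor: int,
    window_stride_seconds: float,
) -> float:
    """Map an RNNT encoder timestep to absolute seconds.

    chunk_feature_offset must be the cumulative featurizer frames
    (feat_signal_len) of all preceding chunks; it is NOT divided by the
    subsampling factor. Instead the encoder timestep is multiplied up
    to feature-frame resolution before converting to seconds.
    """
    feature_index = chunk_feature_offset + timestep * subsampling_factor
    return feature_index * window_stride_seconds
```

For example, with a 10 ms window stride and 8x subsampling, timestep 25 in a chunk that starts 1600 feature frames in lands at (1600 + 25 * 8) * 0.01 = 18.0 s.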
27-39: Use a specific exception type for transcript failuresRaising bare Exception complicates upstream handling. Introduce TranscriptionFailedError and raise that.
```diff
+class TranscriptionFailedError(Exception):
+    """Raised when transcription fails for a given audio UUID."""
+    pass
@@
-        raise Exception(f"Transcription failed: {error_msg}")
+        raise TranscriptionFailedError(error_msg)
```

extras/asr-services/parakeet-offline.py (1)
58-64: Config consistency: unused chunking constants. CHUNK_DURATION_SECONDS, OVERLAP_DURATION_SECONDS, and CONFIDENCE_THRESHOLD are defined in extras/asr-services/parakeet-offline.py (lines 58–64) but not referenced — either pass them into the enhanced_chunking call (chunk duration, overlap, confidence) or remove them to avoid config drift.
To re-check usages run: rg -n --hidden -S 'CHUNK_DURATION_SECONDS|OVERLAP_DURATION_SECONDS|CONFIDENCE_THRESHOLD'
backends/advanced/src/advanced_omi_backend/transcription.py (1)
313-315: Cropping pipeline present — no change needed. processors.py defines AudioCroppingItem, cropping_queue, queue_cropping and starts a cropping processor; transcription enqueues via get_processor_manager().queue_cropping, so keeping AUDIO_CROPPING_ENABLED default "true" is acceptable.
backends/advanced/src/advanced_omi_backend/routers/modules/conversation_routes.py (2)
86-91: LGTM! Well-structured reprocessing endpoint. The endpoint follows the established pattern with proper authentication and delegation to the controller layer.
104-111: Well-implemented version management endpoints.All endpoints follow consistent patterns with proper authorization and clear documentation.
Also applies to: 114-121, 124-129, 132-137
backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py (1)
200-221: Good defensive programming for file path resolution. The implementation properly checks multiple possible locations for the audio file and provides helpful error messages with searched paths. This is robust error handling.
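As a rough sketch of that multi-location pattern (a hypothetical helper; the directory names and the `.wav` suffix are assumptions, not the backend's actual layout):

```python
from pathlib import Path
from typing import Iterable, List, Tuple


def resolve_audio_path(audio_uuid: str, search_dirs: Iterable[str]) -> Tuple[Path, List[str]]:
    """Return the first existing WAV for audio_uuid, plus every path searched.

    Raises FileNotFoundError listing all searched locations, mirroring the
    "helpful error messages with searched paths" behaviour described above.
    """
    searched: List[str] = []
    for d in search_dirs:
        candidate = Path(d) / f"{audio_uuid}.wav"
        searched.append(str(candidate))
        if candidate.exists():
            return candidate, searched
    raise FileNotFoundError(
        f"Audio for {audio_uuid} not found; searched: {searched}"
    )
```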
backends/advanced/src/advanced_omi_backend/database.py (3)
66-98: Well-designed versioning architecture. The implementation properly initializes version data when creating chunks and maintains backward compatibility with the existing `transcript` and `memories` fields. The approach of having active versions that sync to compatibility fields is elegant.

Also applies to: 108-122
594-628: Excellent version activation logic. The activation methods properly verify version existence, update active pointers, and sync compatibility fields. This ensures backward compatibility while providing versioning capabilities.
Also applies to: 630-662
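The activation-plus-sync pattern can be sketched in memory (field names are assumed from this review; the real repository performs this as a MongoDB update):

```python
def activate_transcript_version(chunk: dict, version_id: str) -> bool:
    """Point a conversation at a transcript version and sync the legacy field.

    Verifies the version exists, updates the active pointer, and mirrors the
    chosen version's segments into the backward-compatibility "transcript"
    field so legacy readers see the switch.
    """
    versions = {v["version_id"]: v for v in chunk.get("transcript_versions", [])}
    if version_id not in versions:
        return False  # refuse to activate a version that doesn't exist
    chunk["active_transcript_version"] = version_id
    chunk["transcript"] = versions[version_id].get("segments", [])  # legacy sync
    return True
```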
679-748: Well-structured ProcessingRunsRepository. The repository provides comprehensive tracking of processing runs with appropriate fields for status, errors, and results. The implementation follows good patterns.
backends/advanced/src/advanced_omi_backend/controllers/system_controller.py
```python
# Check if transcription failed - immediately fail the job
if transcription_status == "FAILED":
    audio_logger.error(
        f"❌ [Job {job_id}] Transcription failed, marking file as failed: (unknown)"
    )
    await job_tracker.update_file_status(
        job_id, filename, FileStatus.FAILED,
        error_message="Transcription failed"
    )
    break  # Exit monitoring loop for this file
```
🛠️ Refactor suggestion
Failure-path handling is correct; avoid post-failure memory triggers later
You break out on FAILED here, but later still call close_current_conversation (Lines 727–729), which can enqueue memory despite failure.
```diff
     if transcription_status == "FAILED":
         audio_logger.error(
             f"❌ [Job {job_id}] Transcription failed, marking file as failed: (unknown)"
         )
         await job_tracker.update_file_status(
             job_id, filename, FileStatus.FAILED,
             error_message="Transcription failed"
         )
-        break  # Exit monitoring loop for this file
+        memory_triggered = True  # prevent memory close later
+        break  # Exit monitoring loop for this file
```

And guard the unconditional close near the end:

```diff
-# Signal end of conversation - trigger memory processing
-await client_state.close_current_conversation()
+if not memory_triggered:
+    await client_state.close_current_conversation()
```

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In backends/advanced/src/advanced_omi_backend/controllers/system_controller.py
around lines 681-691 and 727-729, the code breaks out of the monitoring loop
when transcription_status == "FAILED" but later still unconditionally calls
close_current_conversation (lines ~727-729) which can enqueue memory for a
failed file; update the logic to skip or return before calling
close_current_conversation when the file/job is marked FAILED (e.g., check
transcription_status or query the file status and only call
close_current_conversation when not FAILED), ensuring no memory enqueuing
happens after a failure.
```python
# Temporarily disable timestamps to avoid char_offsets error
old_compute_timestamps = getattr(self.asr_model.decoding, 'compute_timestamps', False)
self.asr_model.decoding.compute_timestamps = False

hypotheses = self.asr_model.decoding.rnnt_decoder_predictions_tensor(
    encoder_output=encoded, encoded_lengths=encoded_len,
    return_hypotheses=True  # Get hypothesis objects even without timestamps
)

# Restore original setting
self.asr_model.decoding.compute_timestamps = old_compute_timestamps

# Store hypotheses with chunk offset tracking
```
Timestamps disabled during decode defeats timestamp preservation.
You set compute_timestamps=True in __init__ but immediately disable it around decode. As a result, hypotheses may lack words/timestamp data and downstream merging/extraction devolves to heuristics. Keep timestamps enabled for RNNT word timing.
```diff
-    # Temporarily disable timestamps to avoid char_offsets error
-    old_compute_timestamps = getattr(self.asr_model.decoding, 'compute_timestamps', False)
-    self.asr_model.decoding.compute_timestamps = False
-
     hypotheses = self.asr_model.decoding.rnnt_decoder_predictions_tensor(
         encoder_output=encoded, encoded_lengths=encoded_len,
         return_hypotheses=True  # Get hypothesis objects even without timestamps
     )
-
-    # Restore original setting
-    self.asr_model.decoding.compute_timestamps = old_compute_timestamps
+    # Note: compute_timestamps remains enabled (set in __init__) with rnnt_timestamp_type='word'
```

If you were working around char-offset issues, keeping rnnt_timestamp_type='word' avoids char offsets while preserving word timestamps.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
hypotheses = self.asr_model.decoding.rnnt_decoder_predictions_tensor(
    encoder_output=encoded, encoded_lengths=encoded_len,
    return_hypotheses=True  # Get hypothesis objects even without timestamps
)
# Note: compute_timestamps remains enabled (set in __init__) with rnnt_timestamp_type='word'

# Store hypotheses with chunk offset tracking
```
🤖 Prompt for AI Agents
In extras/asr-services/enhanced_chunking.py around lines 83 to 95, the code
temporarily sets self.asr_model.decoding.compute_timestamps = False before
calling rnnt_decoder_predictions_tensor which removes word timestamp data;
revert this change so compute_timestamps remains True during decoding, and if
char-offset issues were the concern set the RNN-T timestamp mode to word-level
(e.g. rnnt_timestamp_type='word' or the model's equivalent) so you preserve word
timing while avoiding char offsets; remove the temporary disable/restore and
ensure the decoder call is configured to produce word timestamps.
```python
# Save audio to temporary file for NeMo processing
with tempfile.NamedTemporaryFile(suffix='.wav', delete=False) as tmpfile:
    tmpfile_name = tmpfile.name

# Convert audio chunks to numpy array and save as WAV
audio_arrays = []
for chunk in speech:
    if chunk.width == 2:
        audio_array = np.frombuffer(chunk.audio, dtype=np.int16).astype(np.float32) / np.iinfo(np.int16).max
    elif chunk.width == 4:
        audio_array = np.frombuffer(chunk.audio, dtype=np.int32).astype(np.float32) / np.iinfo(np.int32).max
    else:
        raise ValueError(f"Unsupported width: {chunk.width}")
    audio_arrays.append(audio_array)

# Concatenate and save as WAV file
full_audio = np.concatenate(audio_arrays)

# Convert to int16 for WAV format
audio_int16 = (full_audio * np.iinfo(np.int16).max).astype(np.int16)

with wave.open(tmpfile_name, 'wb') as wav_file:
    wav_file.setnchannels(1)  # Mono
    wav_file.setsampwidth(2)  # 16-bit
    wav_file.setframerate(self._rate)
    wav_file.writeframes(audio_int16.tobytes())

try:
```
Critical: Chunked path writes 16 kHz WAV without enforcing input format → timestamp drift
_transcribe_chunked writes with self._rate=16k but doesn’t assert/resample incoming AudioChunks. If any chunk has rate≠16k, channels≠1, or width≠2, the saved WAV will be temporally wrong (speed/pitch), corrupting timestamps returned by the chunker.
Apply format validation and on‑the‑fly resampling before concatenation:
```diff
 async def _transcribe_chunked(self, speech: Sequence[AudioChunk]) -> dict:
     """Chunked transcription method for long audio."""
     logger.info("Using chunked transcription for long audio")
-    # Save audio to temporary file for NeMo processing
+    # Normalize to 16kHz/mono/16-bit if needed
+    needs_resampling = any(
+        (c.rate != self._rate) or (c.channels != 1) or (c.width != 2) for c in speech
+    )
+    if needs_resampling:
+        logger.info("Resampling chunks to 16kHz mono for chunked transcription")
+        resampler = ResamplingBlock(resample_rate=self._rate, resample_channels=1)
+        speech_16k: List[AudioChunk] = []
+        for c in speech:
+            async for rc in resampler.process_chunk(c):
+                speech_16k.append(rc)
+        speech = speech_16k
+
+    # Save audio to temporary file for NeMo processing
     with tempfile.NamedTemporaryFile(suffix='.wav', delete=False) as tmpfile:
         tmpfile_name = tmpfile.name
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
# Normalize to 16kHz/mono/16-bit if needed
needs_resampling = any(
    (c.rate != self._rate) or (c.channels != 1) or (c.width != 2) for c in speech
)
if needs_resampling:
    logger.info("Resampling chunks to 16kHz mono for chunked transcription")
    resampler = ResamplingBlock(resample_rate=self._rate, resample_channels=1)
    speech_16k: List[AudioChunk] = []
    for c in speech:
        async for rc in resampler.process_chunk(c):
            speech_16k.append(rc)
    speech = speech_16k

# Save audio to temporary file for NeMo processing
with tempfile.NamedTemporaryFile(suffix='.wav', delete=False) as tmpfile:
    tmpfile_name = tmpfile.name

# Convert audio chunks to numpy array and save as WAV
audio_arrays = []
for chunk in speech:
    if chunk.width == 2:
        audio_array = np.frombuffer(chunk.audio, dtype=np.int16).astype(np.float32) / np.iinfo(np.int16).max
    elif chunk.width == 4:
        audio_array = np.frombuffer(chunk.audio, dtype=np.int32).astype(np.float32) / np.iinfo(np.int32).max
    else:
        raise ValueError(f"Unsupported width: {chunk.width}")
    audio_arrays.append(audio_array)

# Concatenate and save as WAV file
full_audio = np.concatenate(audio_arrays)

# Convert to int16 for WAV format
audio_int16 = (full_audio * np.iinfo(np.int16).max).astype(np.int16)

with wave.open(tmpfile_name, 'wb') as wav_file:
    wav_file.setnchannels(1)  # Mono
    wav_file.setsampwidth(2)  # 16-bit
    wav_file.setframerate(self._rate)
    wav_file.writeframes(audio_int16.tobytes())

try:
```
🧰 Tools
🪛 Ruff (0.12.2)
148-148: Avoid specifying long messages outside the exception class
(TRY003)
@coderabbitai review

✅ Actions performed

Review triggered.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
CLAUDE.md (1)
840-851: Docs/API mismatch: activation endpoints differ from router. Docs show POST /api/conversations/{conversation_id}/activate-transcript and /activate-memory with JSON bodies. Router implements POST /{conversation_id}/activate-transcript/{version_id} and /activate-memory/{version_id} (path params, no body). Align one side.
Apply one of:
- Update docs to include {version_id} in path and remove body payloads.
- Or change routes to accept JSON payloads as documented.
backends/advanced/src/advanced_omi_backend/database.py (3)
20-23: Mongo DB selection API misuse. AsyncIOMotorClient.get_default_database() doesn't take a name; this call will crash. Use get_database(name) or subscript.
```diff
 MONGODB_URI = os.getenv("MONGODB_URI", "mongodb://mongo:27017")
 mongo_client = AsyncIOMotorClient(MONGODB_URI)
-db = mongo_client.get_default_database("friend-lite")
+# If the URI doesn't embed a default DB, select explicitly:
+db = mongo_client.get_database("friend-lite")
```
273-300: Legacy-only updates diverge from versioned source of truth. update_transcript/update_segment_timing/update_segment_speaker modify the legacy transcript only. Callers reading via the active version won't see edits.
```diff
-        await self.col.update_one(
-            {"audio_uuid": audio_uuid}, {"$set": {"transcript": full_transcript}}
-        )
+        # Update both: active version's segments and legacy field
+        chunk = await self.get_chunk(audio_uuid)
+        active = chunk.get("active_transcript_version")
+        await self.col.update_one(
+            {"audio_uuid": audio_uuid},
+            {
+                "$set": {
+                    "transcript": full_transcript,
+                    **(
+                        {"transcript_versions.$[v].segments": full_transcript}
+                        if active else {}
+                    ),
+                },
+            },
+            array_filters=([{"v.version_id": active}] if active else None),
+        )
```

Repeat the same pattern for the timing/speaker setters, targeting transcript_versions.$[v].segments.{index}.
188-201: Missing helper used by TranscriptionManager. TranscriptionManager calls chunk_repo.update_transcription_status(...), but this repo doesn't provide it anymore per your summary. Reintroduce a thin compatibility method.
```diff
 class AudioChunksRepository:
 @@
     async def store_raw_transcript_data(self, audio_uuid, raw_data, provider):
         """Store raw transcript data from transcription provider."""
         await self.col.update_one(
             {"audio_uuid": audio_uuid},
             {
                 "$set": {
                     "raw_transcript_data": {
                         "provider": provider,
                         "data": raw_data,
                         "stored_at": datetime.now(UTC).isoformat(),
                     }
                 }
             },
         )
+
+    async def update_transcription_status(
+        self,
+        audio_uuid: str,
+        status: str,
+        *,
+        provider: Optional[str] = None,
+        error_message: Optional[str] = None,
+    ) -> bool:
+        """Compatibility: update chunk-level transcription status fields."""
+        update = {
+            "transcription_status": status,
+            "transcription_updated_at": datetime.now(UTC).isoformat(),
+        }
+        if provider:
+            update["transcription_provider"] = provider
+        if error_message:
+            update["transcription_error"] = error_message
+        result = await self.col.update_one({"audio_uuid": audio_uuid}, {"$set": update})
+        return result.modified_count > 0
```

backends/advanced/src/advanced_omi_backend/transcription.py (1)
206-215: Calls missing repo method: update_transcription_status. This method no longer exists on the repository; add it back (see the proposed repo diff) or switch to ConversationsRepository.update_transcript_processing_status when a conversation_id exists.
backends/advanced/Docs/architecture.md (1)
963-974: Docs/Routes mismatch: activation endpoints. List shows POST /{conversation_id}/activate-transcript and /activate-memory (no version_id). Router uses /activate-transcript/{version_id} and /activate-memory/{version_id}. Align docs to code or vice versa.
```diff
-│   ├── /{conversation_id}/activate-transcript                # Switch transcript version
-│   └── /{conversation_id}/activate-memory                    # Switch memory version
+│   ├── /{conversation_id}/activate-transcript/{version_id}   # Switch transcript version
+│   └── /{conversation_id}/activate-memory/{version_id}       # Switch memory version
```
🧹 Nitpick comments (15)
CLAUDE.md (2)
690-740: Versioned payload examples: field names drift. Examples use "transcript" inside transcript_versions objects, while code uses "segments". Standardize field naming across docs and code to avoid client breakage.
877-955: Markdown lint: add languages on unlabeled code fences and blank lines around tables

A few fenced code blocks lack a language; the comparison table needs blank lines. This fixes MD040/MD058.
backends/advanced/src/advanced_omi_backend/database.py (4)
129-179: Active-version writes update legacy list but not version metadata

add_transcript_segment pushes to transcript_versions[active].segments and legacy transcript. Good. Consider also updating the version's status/provider when segments arrive (e.g., to IN_PROGRESS/COMPLETED) for observability.
301-326: Timestamp type inconsistency (datetime vs ISO strings)

You mix datetime objects (cropped_at) with ISO strings elsewhere (created_at, updated_at). Pick one (suggest ISO strings for JSON).
```diff
-    "cropped_at": datetime.now(UTC),
+    "cropped_at": datetime.now(UTC).isoformat(),
```

Also standardize updated_at fields across the repo.
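A minimal sketch of why normalizing to ISO-8601 strings matters: a document holding only strings round-trips through JSON unchanged, whereas a raw `datetime` would make `json.dumps` raise (field names here mirror the repository's documents):

```python
import json
from datetime import datetime, timezone

# One ISO-8601 string used for every timestamp field keeps the document
# JSON-serializable and type-consistent.
now = datetime.now(timezone.utc).isoformat()

doc = {
    "cropped_at": now,   # previously stored as a raw datetime object
    "created_at": now,
    "updated_at": now,
}

# Round-trips cleanly; a datetime value here would raise TypeError instead.
assert json.loads(json.dumps(doc)) == doc
```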
328-388: Status updates: type consistency and missing completed_at

updated_at uses a datetime object while other timestamps are ISO strings; completed_at is stored only when status == COMPLETED (good), but the updated_at type differs.
```diff
-    f"memory_versions.$[version].updated_at": datetime.now(UTC),
-    "memory_processing_status": status,
-    "memory_processing_updated_at": datetime.now(UTC).isoformat(),
+    f"memory_versions.$[version].updated_at": datetime.now(UTC).isoformat(),
+    "memory_processing_status": status,
+    "memory_processing_updated_at": datetime.now(UTC).isoformat(),
```
544-552: Type hints: implicit Optional[list]

Parameters like segments: list = None violate RUF013. Use Optional[List[T]].
```diff
-    segments: list = None,
+    segments: Optional[list] = None,
```

Apply similarly to raw_data, memories, processing_run_id, provider, etc.
backends/advanced/webui/src/pages/Conversations.tsx (2)
88-94: Dropdown closes on any click, including inside

You're closing on document click unconditionally; menu actions still work, but this can cause flicker. Guard for outside clicks via a ref.contains check.
```diff
-  useEffect(() => {
-    const handleClickOutside = () => setOpenDropdown(null)
-    document.addEventListener('click', handleClickOutside)
-    return () => document.removeEventListener('click', handleClickOutside)
-  }, [])
+  const dropdownRef = useRef<HTMLDivElement | null>(null)
+  useEffect(() => {
+    const onDocClick = (e: MouseEvent) => {
+      if (openDropdown && dropdownRef.current && !dropdownRef.current.contains(e.target as Node)) {
+        setOpenDropdown(null)
+      }
+    }
+    document.addEventListener('click', onDocClick)
+    return () => document.removeEventListener('click', onDocClick)
+  }, [openDropdown])
```

And set the ref on the menu container.
381-442: Add a11y labels to icon-only buttons

Provide aria-labels for the More menu and items to improve accessibility.

```diff
-  <button
+  <button
+    aria-label="Conversation options"
@@
-  <button
+  <button
+    aria-label="Reprocess transcript"
@@
-  <button
+  <button
+    aria-label="Reprocess memory"
@@
-  <button
+  <button
+    aria-label="Delete conversation"
```

backends/advanced/src/advanced_omi_backend/transcription.py (2)
408-410: Speaker segment text logged at INFO

These include transcript snippets. Downgrade to DEBUG.
```diff
-    logger.info(f"🔍 DEBUG: Segment {i}: text='{seg.get('text', 'MISSING')}', speaker={seg.get('speaker', 'UNKNOWN')}")
+    logger.debug("segment %d: speaker=%s", i, seg.get('speaker', 'UNKNOWN'))
```
451-453: absolute_timestamp origin likely incorrect

Using time.time() + start yields "now + offset", not the audio's real wall-clock time. Prefer session_start (from the audio chunk timestamp).
```diff
-    "absolute_timestamp": time.time() + segment.get("start", 0.0),
+    "absolute_timestamp": (
+        (await self.chunk_repo.get_chunk(self._current_audio_uuid)).get("timestamp", 0)
+        + segment.get("start", 0.0)
+    ),
```

Cache the chunk once outside the loop.
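The fix above boils down to anchoring relative offsets to the session's stored start time rather than to the processing clock. A tiny illustrative sketch (`absolute_timestamps` is a hypothetical helper, not code from this repo):

```python
def absolute_timestamps(session_start: float, segments: list) -> list:
    """Anchor each segment's relative offset to the audio session's
    wall-clock start (e.g. the stored chunk timestamp), not to
    time.time() at processing time."""
    return [session_start + seg.get("start", 0.0) for seg in segments]

# Segments 0s, 2.5s, and 7.25s into a session that began at a known epoch:
segments = [{"start": 0.0}, {"start": 2.5}, {"start": 7.25}]
print(absolute_timestamps(1_700_000_000.0, segments))
# → [1700000000.0, 1700000002.5, 1700000007.25]
```

Computed this way, reprocessing the same audio a week later yields identical absolute timestamps, which is what versioned transcripts need.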
backends/advanced/src/advanced_omi_backend/routers/modules/conversation_routes.py (1)
113-131: Route shape vs. docs

Routes use /{conversation_id}/activate-transcript/{version_id} and /activate-memory/{version_id}. Ensure docs and frontend align (the frontend currently calls without version in path in some places, per docs).
backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py (4)
75-76: Wrong user identifier property

Use user.user_id consistently; user.id may be None.
```diff
-    logger.info(f"Manually closed conversation for client {client_id} by user {user.id}")
+    logger.info(f"Manually closed conversation for client {client_id} by user {user.user_id}")
```
335-382: Segment updates ignore versioned source

update_transcript_segment only edits the legacy transcript. Also update the active version's segment using arrayFilters.
```diff
-    result = await chunks_col.update_one({"audio_uuid": audio_uuid}, {"$set": update_doc})
+    # Update legacy
+    result = await chunks_col.update_one({"audio_uuid": audio_uuid}, {"$set": update_doc})
+    # Update versioned path too
+    chunk = await chunks_col.find_one({"audio_uuid": audio_uuid}, {"active_transcript_version": 1})
+    active = chunk.get("active_transcript_version") if chunk else None
+    if active:
+        version_updates = {}
+        if speaker_id is not None:
+            version_updates[f"transcript_versions.$[v].segments.{segment_index}.speaker"] = speaker_id
+        if start_time is not None:
+            version_updates[f"transcript_versions.$[v].segments.{segment_index}.start"] = start_time
+        if end_time is not None:
+            version_updates[f"transcript_versions.$[v].segments.{segment_index}.end"] = end_time
+        if version_updates:
+            await chunks_col.update_one(
+                {"audio_uuid": audio_uuid},
+                {"$set": version_updates},
+                array_filters=[{"v.version_id": active}],
+            )
```
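The update document and array filter in that diff can be built as a pure function, which makes the field paths easy to verify in isolation (`build_versioned_segment_update` is a hypothetical helper; the dotted paths follow the diff above):

```python
from typing import Optional

def build_versioned_segment_update(
    active_version: str,
    segment_index: int,
    speaker_id: Optional[str] = None,
    start_time: Optional[float] = None,
    end_time: Optional[float] = None,
):
    """Build the $set document plus array_filters for editing one segment
    inside the active transcript version."""
    prefix = f"transcript_versions.$[v].segments.{segment_index}"
    updates = {}
    if speaker_id is not None:
        updates[f"{prefix}.speaker"] = speaker_id
    if start_time is not None:
        updates[f"{prefix}.start"] = start_time
    if end_time is not None:
        updates[f"{prefix}.end"] = end_time
    # The filter identifier "v" must match the $[v] placeholder in the paths.
    return {"$set": updates}, [{"v.version_id": active_version}]

update_doc, filters = build_versioned_segment_update("v2", 3, speaker_id="S1")
```

These two values would then be passed as the update and `array_filters` arguments of `update_one`.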
568-575: Deterministic config_hash

Hashing str(dict) can vary if dict assembly changes. Use stable JSON with sort_keys.
```diff
-    config_hash = hashlib.sha256(str(config_data).encode()).hexdigest()[:16]
+    import json
+    config_hash = hashlib.sha256(json.dumps(config_data, sort_keys=True).encode()).hexdigest()[:16]
```

Apply to memory reprocess too.
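To see why canonical JSON matters, two configs with the same keys in different insertion order hash identically under sorted-key JSON, where `str(dict)` could diverge as assembly code changes (a minimal sketch; `config_hash` is an illustrative helper):

```python
import hashlib
import json

def config_hash(config: dict) -> str:
    """16-char digest over canonical JSON: sorted keys and fixed separators
    make the digest independent of dict insertion order."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()[:16]

a = config_hash({"model": "nova-2", "language": "en"})
b = config_hash({"language": "en", "model": "nova-2"})
assert a == b  # key order no longer matters
```

The same helper can serve both the transcript and memory reprocess paths, so the two stay hash-compatible.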
700-703: Use logger.exception for tracebacks

Prefer logger.exception in broad excepts to retain stack traces.
```diff
-    logger.error(f"Error starting memory reprocessing: {e}")
+    logger.exception("Error starting memory reprocessing")
```

Apply similarly to other broad excepts.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- CLAUDE.md (7 hunks)
- backends/advanced/Docs/architecture.md (8 hunks)
- backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py (7 hunks)
- backends/advanced/src/advanced_omi_backend/database.py (8 hunks)
- backends/advanced/src/advanced_omi_backend/routers/modules/conversation_routes.py (2 hunks)
- backends/advanced/src/advanced_omi_backend/transcription.py (5 hunks)
- backends/advanced/webui/src/pages/Conversations.tsx (7 hunks)
- backends/advanced/webui/src/services/api.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- backends/advanced/webui/src/services/api.ts
🧰 Additional context used
🪛 Ruff (0.12.2)
backends/advanced/src/advanced_omi_backend/routers/modules/conversation_routes.py
97-97: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
106-106: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
117-117: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
127-127: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
135-135: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
143-143: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py
462-462: Do not catch blind exception: Exception
(BLE001)
611-611: Do not catch blind exception: Exception
(BLE001)
612-612: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
700-700: Do not catch blind exception: Exception
(BLE001)
701-701: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
735-735: Do not catch blind exception: Exception
(BLE001)
736-736: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
767-767: Do not catch blind exception: Exception
(BLE001)
768-768: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
790-790: Do not catch blind exception: Exception
(BLE001)
791-791: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
backends/advanced/src/advanced_omi_backend/database.py
171-171: f-string without any placeholders
Remove extraneous f prefix
(F541)
368-368: f-string without any placeholders
Remove extraneous f prefix
(F541)
369-369: f-string without any placeholders
Remove extraneous f prefix
(F541)
376-376: f-string without any placeholders
Remove extraneous f prefix
(F541)
547-547: PEP 484 prohibits implicit Optional
Convert to Optional[T]
(RUF013)
548-548: PEP 484 prohibits implicit Optional
Convert to Optional[T]
(RUF013)
549-549: PEP 484 prohibits implicit Optional
Convert to Optional[T]
(RUF013)
550-550: PEP 484 prohibits implicit Optional
Convert to Optional[T]
(RUF013)
579-579: PEP 484 prohibits implicit Optional
Convert to Optional[T]
(RUF013)
580-580: PEP 484 prohibits implicit Optional
Convert to Optional[T]
(RUF013)
691-691: PEP 484 prohibits implicit Optional
Convert to Optional[T]
(RUF013)
692-692: PEP 484 prohibits implicit Optional
Convert to Optional[T]
(RUF013)
726-726: PEP 484 prohibits implicit Optional
Convert to Optional[T]
(RUF013)
752-752: PEP 484 prohibits implicit Optional
Convert to Optional[T]
(RUF013)
753-753: PEP 484 prohibits implicit Optional
Convert to Optional[T]
(RUF013)
🪛 markdownlint-cli2 (0.17.2)
CLAUDE.md
212-212: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
751-751: Tables should be surrounded by blank lines
(MD058, blanks-around-tables)
945-945: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (11)
backends/advanced/src/advanced_omi_backend/database.py (3)
250-271: LGTM: version-aware reads with legacy fallback

Reading from the active version then falling back to transcript is correct and safe.
603-637: Activation also needs legacy fields update (OK); consider a denormalization guard

This sets legacy transcript and speakers from the version; good. Consider wrapping both fields under a single $set to avoid partial updates on failure (you already do).
687-710: Processing status on conversation: good, but mirror to audio_chunks if UI still reads it

If any UI still displays transcription_status from audio_chunks, consider syncing here or remove that UI dependency.
backends/advanced/webui/src/pages/Conversations.tsx (1)
565-567: Time display: avoid double conversion for short segments

formatDuration already pads seconds; showing start/end with toFixed(1) is fine. LGTM.
backends/advanced/src/advanced_omi_backend/transcription.py (1)
523-527: Status update path should be version-aware or chunk-level

You set chunk transcription_status; if moving fully to versioned conversations, also call conversations_repo.update_transcript_processing_status(conversation_id, status, provider).
backends/advanced/src/advanced_omi_backend/routers/modules/conversation_routes.py (1)
103-111: Query default "active" is a good DX affordance

Accepting transcript_version_id=active by default keeps the UI simple. LGTM.
backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py (4)
400-418: Helpful debug, but avoid substring/regex queries in normal path

Debug-only alt queries are fine behind a DEBUG guard; you already gate most. LGTM.
465-488: Delete audio from both possible roots

Reprocess uses two roots (/app/data/audio_chunks and /app/audio_chunks). Delete checks only one. Try both.
```diff
     if audio_path:
         try:
-            # Construct full path to audio file
-            full_audio_path = Path("/app/audio_chunks") / audio_path
-            if full_audio_path.exists():
-                full_audio_path.unlink()
+            for root in (Path("/app/audio_chunks"), Path("/app/data/audio_chunks")):
+                full_audio_path = root / audio_path
+                if full_audio_path.exists():
+                    full_audio_path.unlink()
+                    deleted_files.append(str(full_audio_path))
-            deleted_files.append(str(full_audio_path))
             logger.info(f"Deleted audio file: {full_audio_path}")
         except Exception as e:
             logger.warning(f"Failed to delete audio file {audio_path}: {e}")
```

Repeat for the cropped path.
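The multi-root loop in that diff is easy to factor into a standalone helper and exercise against temporary directories standing in for the two container roots (`delete_from_roots` is a hypothetical name, not code from this repo):

```python
import tempfile
from pathlib import Path

def delete_from_roots(rel_path: str, roots) -> list:
    """Try every known storage root, unlink the file wherever it exists,
    and return the paths that were actually removed."""
    deleted = []
    for root in roots:
        full = Path(root) / rel_path
        if full.exists():
            full.unlink()
            deleted.append(str(full))
    return deleted

# Demo: the file lives under only one of the two candidate roots.
r1, r2 = Path(tempfile.mkdtemp()), Path(tempfile.mkdtemp())
(r1 / "chunk.wav").write_bytes(b"\x00")
removed = delete_from_roots("chunk.wav", (r1, r2))
print(len(removed))  # → 1
```

Passing both the raw and cropped relative paths through the same helper covers the "repeat for cropped path" case without duplicating the loop.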
597-609: WIP: reprocess queue not wired

Endpoints create versions and runs but don't enqueue processing. Gate the routes behind a feature flag or return "ACCEPTED" with a clear "not yet processed" note. Ensure the UI handles PENDING.
518-526: Ownership checks and error codes look correct

Lean and consistent. LGTM.
backends/advanced/Docs/architecture.md (1)
646-747: Mermaid blocks are great; keep them labeled (they are).

No action; diagram matches narrative.
```python
provider_name = self.provider.name if self.provider else "unknown"
logger.info(f"🔍 DEBUG: transcript_result type={type(transcript_result)}, content preview: {str(transcript_result)[:200]}")
if self.chunk_repo:
    logger.info(f"🔍 DEBUG: About to store raw transcript data for {self._current_audio_uuid}")
    await self.chunk_repo.store_raw_transcript_data(
        self._current_audio_uuid, transcript_result, provider_name
    )
    logger.info(f"🔍 DEBUG: Successfully stored raw transcript data for {self._current_audio_uuid}")
```
Transcripts in logs at INFO may leak PII
Logging raw transcript previews at INFO is a privacy risk. Use DEBUG and redact/shorten.
```diff
- logger.info(f"🔍 DEBUG: transcript_result type={type(transcript_result)}, content preview: {str(transcript_result)[:200]}")
+ logger.debug("transcript_result type=%s, preview=%s", type(transcript_result), str(transcript_result)[:120])
@@
- logger.info(f"🔍 DEBUG: About to store raw transcript data for {self._current_audio_uuid}")
+ logger.debug("About to store raw transcript data for %s", self._current_audio_uuid)
@@
- logger.info(f"🔍 DEBUG: Successfully stored raw transcript data for {self._current_audio_uuid}")
+ logger.debug("Stored raw transcript data for %s", self._current_audio_uuid)
```

Also consider a config flag to disable raw storage in non-dev.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
provider_name = self.provider.name if self.provider else "unknown"
logger.debug("transcript_result type=%s, preview=%s", type(transcript_result), str(transcript_result)[:120])
if self.chunk_repo:
    logger.debug("About to store raw transcript data for %s", self._current_audio_uuid)
    await self.chunk_repo.store_raw_transcript_data(
        self._current_audio_uuid, transcript_result, provider_name
    )
    logger.debug("Stored raw transcript data for %s", self._current_audio_uuid)
```
🤖 Prompt for AI Agents
In backends/advanced/src/advanced_omi_backend/transcription.py around lines 283
to 291, change the INFO-level logs that print raw transcript previews to DEBUG
and redact/shorten the transcript content (e.g., show only a hashed id, length,
or first N characters with ellipsis) to avoid leaking PII; additionally, gate
the call to chunk_repo.store_raw_transcript_data behind a configuration flag
(e.g., STORE_RAW_TRANSCRIPTS or ENV check) so raw transcript storage is disabled
by default in non-development environments and only enabled when explicitly
configured.
Summary by CodeRabbit
New Features
Changes
Documentation