Skip to content

Reprocessing WIP#96

Merged
AnkushMalaker merged 5 commits intomainfrom
reprocessing
Sep 18, 2025
Merged

Reprocessing WIP#96
AnkushMalaker merged 5 commits intomainfrom
reprocessing

Conversation

@AnkushMalaker
Copy link
Collaborator

@AnkushMalaker AnkushMalaker commented Sep 15, 2025

Summary by CodeRabbit

  • New Features

    • Versioned transcripts and memories with active version management and history.
    • Reprocessing endpoints for transcripts and memories; activate specific versions.
    • Web UI dropdown per conversation: Reprocess Transcript, Reprocess Memory, Delete Conversation.
    • Enhanced long-audio transcription with accurate word/segment timestamps.
    • Speech-driven conversation creation (only when speech is detected).
  • Changes

    • Conversations now identified by conversation_id (not audio_uuid); delete preserves original audio.
    • Audio cropping enabled by default.
  • Documentation

    • Updated architecture for speech-driven, versioned processing.
    • Added API testing guide, data structure examples, and integration notes.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Sep 15, 2025

Walkthrough

Introduces versioned conversations with transcript/memory versions, new processing-runs tracking, and reprocessing/activation endpoints keyed by conversation_id. Updates controllers, routers, database schema/repositories, transcription flow, and UI/API client to support version management. Adds enhanced ASR chunking with timestamp preservation. Documentation updated for architecture and APIs.

Changes

Cohort / File(s) Summary
Docs & Spec Update
CLAUDE.md, backends/advanced/Docs/architecture.md
Documents new conversations collection, versioned transcripts/memories, updated/added endpoints, schema changes, speech-driven flow, and testing guide.
Backend Controllers & Routes
backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py, backends/advanced/src/advanced_omi_backend/routers/modules/conversation_routes.py
Adds reprocess/activate/version-history endpoints using conversation_id; implements ownership checks, path resolution, config-hash deduping, and structured responses; reworks delete to operate via audio_chunk and optionally remove conversation.
Database & Repositories
backends/advanced/src/advanced_omi_backend/database.py
Introduces ProcessingRunsRepository; adds versioned transcript/memory storage and active pointers; methods to create/activate versions, store raw data, get history; seeds versions on chunk creation; exposes processing_runs collection.
Transcription Pipeline
backends/advanced/src/advanced_omi_backend/transcription.py
Integrates with versioned model: creates/activates initial transcript version, stores segments to active version, logs raw transcript storage; enables cropping by default; adjusts diarization logging and conversation fields.
Web UI + API Client
backends/advanced/webui/src/pages/Conversations.tsx, backends/advanced/webui/src/services/api.ts
Adds per-conversation dropdown for reprocess transcript/memory and delete; wires to new APIs; loading/disabled states; fetch refresh; exposes new API methods for version ops.
ASR Enhancements
extras/asr-services/enhanced_chunking.py
Adds timestamp-preserving chunked RNNT transcription with hypothesis merging; utilities for extracting timestamps; async end-to-end transcribe_with_enhanced_chunking.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant UI as Web UI
  participant API as Conversations Router
  participant CTRL as Conversation Controller
  participant DB as Conversations/AudioChunks
  participant PR as ProcessingRuns Repo
  participant Q as Processing Queue

  UI->>API: POST /api/conversations/{conversation_id}/reprocess-transcript
  API->>CTRL: reprocess_transcript(conversation_id, user)
  CTRL->>DB: Validate ownership, fetch conversation/audio
  CTRL->>CTRL: Resolve audio path, compute config_hash
  CTRL->>PR: create_run(run_type="transcript", status=PENDING)
  CTRL->>DB: create_transcript_version(processing_run_id)
  CTRL->>Q: enqueue transcript processing (conversation_id, version_id)
  CTRL-->>UI: { run_id, version_id, config_hash, status: PENDING }
Loading
sequenceDiagram
  autonumber
  participant UI as Web UI
  participant API as Conversations Router
  participant CTRL as Conversation Controller
  participant DB as Conversations Repo

  UI->>API: POST /api/conversations/{conversation_id}/activate-transcript/{version_id}
  API->>CTRL: activate_transcript_version(...)
  CTRL->>DB: Set active_transcript_version, sync legacy fields
  CTRL-->>UI: { activated: true, version_id }
Loading
sequenceDiagram
  autonumber
  participant UI as Web UI
  participant API as Conversations Router
  participant CTRL as Conversation Controller
  participant AC as AudioChunks Repo
  participant CV as Conversations Repo
  participant FS as File Storage

  UI->>API: DELETE /api/conversations/{audio_uuid}
  API->>CTRL: delete_conversation(audio_uuid, user)
  CTRL->>AC: Fetch audio_chunk, verify ownership
  CTRL->>AC: Delete audio_chunk
  alt conversation linked
    CTRL->>CV: Delete conversation (optional)
  end
  CTRL->>FS: Delete related audio files
  CTRL-->>UI: { deleted_audio_chunk: true, deleted_conversation: bool, files: [...] }
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60–90 minutes

Possibly related PRs

  • long audio fix #95 — Also modifies conversation lifecycle and delete_conversation logic at the controller layer, likely overlapping in control flow and endpoint semantics.

Poem

A nibble of bytes, a hop through time,
Versions stack neatly—what a climb!
I thump “reprocess,” watch runs queue,
Memories switch—a freshened view.
Timestamps stitched, long trails align—
Conversations chime, refined, divine.
🐇✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title "Reprocessing WIP" is directly related to the core changes in this PR (addition of reprocessing flows, versioned transcript/memory APIs, and related endpoints), so it summarizes the main topic concisely; however the "WIP" suffix is informal and does not describe the scope or the specific APIs introduced.
Docstring Coverage ✅ Passed Docstring coverage is 93.48% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch reprocessing

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@AnkushMalaker
Copy link
Collaborator Author

@coderabbitai review

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Sep 15, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
backends/advanced/src/advanced_omi_backend/transcription.py (1)

309-315: Double memory queueing: this reintroduces the duplicate memory bug

You call _queue_memory_processing here, while the controllers also close conversations to trigger memory. Remove this call or guard via config/flag to avoid duplicate memories.

-            # Queue memory processing now that transcription is complete
-            await self._queue_memory_processing()
+            # Memory processing is triggered by conversation closure; avoid duplicate queueing here.
+            # (Left intentionally empty)
backends/advanced/src/advanced_omi_backend/database.py (1)

270-274: Remove duplicate method definitions.

The methods update_transcript (line 270) and store_raw_transcript_data (line 185) are defined twice in the class, which will cause the first definitions to be shadowed.

Apply this diff to remove the duplicate definitions:

-    async def update_transcript(self, audio_uuid, full_transcript):
-        """Update the entire transcript list (for compatibility)."""
-        await self.col.update_one(
-            {"audio_uuid": audio_uuid}, {"$set": {"transcript": full_transcript}}
-        )

     async def update_segment_timing(self, audio_uuid, segment_index, start_time, end_time):

And remove the first store_raw_transcript_data method at line 185-198.

Also applies to: 454-500

🧹 Nitpick comments (45)
extras/asr-services/pyproject.toml (1)

18-19: Pin httpx upper bound to avoid surprise breakages; verify NumPy bounds consistency across groups.

  • Add an upper bound for httpx to stay within the current major until you intentionally upgrade.
  • In [dependency-groups.parakeet] NumPy is capped at <2.0 but the main deps are not; consider aligning to avoid resolver conflicts when both groups are used.

Apply this diff for httpx:

-  "easy-audio-interfaces>=0.7.1",
-  "httpx>=0.28.1",
+  "easy-audio-interfaces>=0.7.1",
+  "httpx>=0.28.1,<1.0",

Outside the shown hunk, consider updating the main NumPy dep to align with the parakeet group:

# in [project.dependencies]
"numpy>=1.26,<2.0"
extras/asr-services/.env.template (2)

9-10: Key order nit: place PARAKEET_CONTAINER_PORT before PARAKEET_HOST_PORT.

Keeps dotenv-linter happy and maintains consistent grouping.

Proposed snippet:

PARAKEET_CONTAINER_PORT=8765
PARAKEET_HOST_PORT=8767

11-27: Default values here diverge from docker-compose; align or document why.

  • .env.template uses 120s/10s/120s while docker-compose defaults are 30s/5s/60s. This can confuse operators.
  • The comment recommends 20–40s but sets 120s.

Suggested defaults to match docker-compose and the guidance:

CHUNKING_ENABLED=true
CHUNK_DURATION_SECONDS=30.0
OVERLAP_DURATION_SECONDS=5.0
MIN_AUDIO_FOR_CHUNKING=60.0
CONFIDENCE_THRESHOLD=0.8

Also ensure the file ends with a trailing newline for tooling compatibility.

backends/advanced/scripts/delete_all_conversations_api.py (6)

18-29: Make base URL configurable via env; avoid hard-coding localhost.

Allows targeting non-local deployments and CI/staging.

Apply this diff:

-
-
+BASE_URL = os.getenv("BACKEND_BASE_URL", "http://localhost:8000")
@@
-    base_url = "http://localhost:8000"
+    base_url = BASE_URL

30-49: Add client timeouts to aiohttp sessions to prevent hangs; minor robustness.

Network calls can stall without bounds.

Apply this diff:

-    async with aiohttp.ClientSession() as session:
+    timeout = aiohttp.ClientTimeout(total=float(os.getenv("HTTP_TIMEOUT_SECONDS", "30")))
+    async with aiohttp.ClientSession(timeout=timeout) as session:
@@
-            if response.status != 200:
+            if response.status != 200:
                 print(f"Failed to login: {response.status}")
                 text = await response.text()
                 print(f"Response: {text}")
                 sys.exit(1)

Repeat the same timeout usage for the session in delete_all_conversations (see below).


54-70: Reuse BASE_URL and apply the same session timeout.

Consistent configuration and resiliency.

-    base_url = "http://localhost:8000"
+    base_url = BASE_URL
@@
-    async with aiohttp.ClientSession() as session:
+    timeout = aiohttp.ClientTimeout(total=float(os.getenv("HTTP_TIMEOUT_SECONDS", "30")))
+    async with aiohttp.ClientSession(timeout=timeout) as session:

82-86: Silence the unused loop variable warning and improve clarity.

Rename the unused key to underscore-prefixed.

-        for client_id, client_conversations in conversations_dict.items():
-            conversations.extend(client_conversations)
+        for _client_id, client_conversations in conversations_dict.items():
+            conversations.extend(client_conversations)

113-121: Treat 200/202/204 as success for DELETE; fix f-string nit.

APIs commonly return 204 No Content on deletion.

-                if response.status == 200:
+                if response.status in (200, 202, 204):
                     deleted_count += 1
                     print(f"Deleted conversation {audio_uuid} ({deleted_count}/{len(conversations)})")
                 else:
                     failed_count += 1
                     text = await response.text()
                     print(f"Failed to delete {audio_uuid}: {response.status} - {text}")
@@
-        print(f"\nDeletion complete:")
+        print("\nDeletion complete:")

Optional: batch deletions with bounded concurrency (e.g., asyncio.Semaphore) to speed up large purges without overloading the API.


91-97: Confirmation UX: accept common variants.

Consider accepting “y/yes” and defaulting to “no” on empty input.

-            response = input(f"Are you sure you want to delete ALL {len(conversations)} conversations? (yes/no): ")
-            if response.lower() != "yes":
+            response = input(f"Are you sure you want to delete ALL {len(conversations)} conversations? [y/N]: ").strip().lower()
+            if response not in ("y", "yes"):
                 print("Deletion cancelled")
                 return
extras/asr-services/docker-compose.yml (1)

24-29: Align defaults with .env.template (or explain the difference).

Current compose defaults (30/5/60) differ from .env.template (120/10/120). Pick one source of truth to avoid surprise behavior.

If aligning to the shorter chunks:

-      - CHUNKING_ENABLED=${CHUNKING_ENABLED:-true}
-      - CHUNK_DURATION_SECONDS=${CHUNK_DURATION_SECONDS:-30.0}
-      - OVERLAP_DURATION_SECONDS=${OVERLAP_DURATION_SECONDS:-5.0}
-      - MIN_AUDIO_FOR_CHUNKING=${MIN_AUDIO_FOR_CHUNKING:-60.0}
-      - CONFIDENCE_THRESHOLD=${CONFIDENCE_THRESHOLD:-0.8}
+      - CHUNKING_ENABLED=${CHUNKING_ENABLED:-true}
+      - CHUNK_DURATION_SECONDS=${CHUNK_DURATION_SECONDS:-30.0}
+      - OVERLAP_DURATION_SECONDS=${OVERLAP_DURATION_SECONDS:-5.0}
+      - MIN_AUDIO_FOR_CHUNKING=${MIN_AUDIO_FOR_CHUNKING:-60.0}
+      - CONFIDENCE_THRESHOLD=${CONFIDENCE_THRESHOLD:-0.8}

Optionally reference an env_file here to centralize configuration.

backends/advanced/webui/src/services/api.ts (1)

80-82: Type the version selector explicitly.

transcriptVersionId often uses the sentinel "active". Consider a union type to self-document this contract.

- reprocessMemory: (audioUuid: string, transcriptVersionId: string) => api.post(`/api/conversations/${audioUuid}/reprocess-memory`, null, {
+ reprocessMemory: (audioUuid: string, transcriptVersionId: 'active' | string) => api.post(`/api/conversations/${audioUuid}/reprocess-memory`, null, {
backends/advanced/webui/src/pages/Conversations.tsx (3)

76-81: Dropdown close handler may also close when interacting inside the menu.

Currently any document click closes the menu. If you want to allow clicks inside without closing, guard against clicks originating within the dropdown root.

-  useEffect(() => {
-    const handleClickOutside = () => setOpenDropdown(null)
-    document.addEventListener('click', handleClickOutside)
-    return () => document.removeEventListener('click', handleClickOutside)
-  }, [])
+  useEffect(() => {
+    const handleClickOutside = (e: MouseEvent) => {
+      const target = e.target as HTMLElement
+      if (!target.closest('[data-dropdown-root="conversation-actions"]')) {
+        setOpenDropdown(null)
+      }
+    }
+    document.addEventListener('click', handleClickOutside)
+    return () => document.removeEventListener('click', handleClickOutside)
+  }, [])

And add data-dropdown-root="conversation-actions" to the dropdown container.


296-337: Add a root marker to support click-outside guards.

If you keep the global click listener, tag the dropdown root for precise containment checks.

- <div className="absolute right-0 top-8 w-48 ... py-2 z-10">
+ <div data-dropdown-root="conversation-actions" className="absolute right-0 top-8 w-48 ... py-2 z-10">

341-378: Clear segment timer on natural audio end.

Prevents a stray timeout from firing after the audio ends by itself.

-      audio.addEventListener('ended', () => {
-        setPlayingSegment(null);
-      });
+      audio.addEventListener('ended', () => {
+        if (segmentTimerRef.current) {
+          window.clearTimeout(segmentTimerRef.current);
+          segmentTimerRef.current = null;
+        }
+        setPlayingSegment(null);
+      });
backends/advanced/upload_files.py (3)

410-444: Prefer server-provided status_url and fail fast on missing files.

  • Use status_url from the submission response if present to avoid client/route drift.
  • File existence handling is good; keep the early exit.
-        logger.info(f"🔗 Status URL: {job_data.get('status_url', 'N/A')}")
+        status_url = job_data.get("status_url")
+        logger.info(f"🔗 Status URL: {status_url or 'N/A'}")
 ...
-        return poll_job_status(job_id, token, base_url, total_files)
+        return poll_job_status(job_id, token, base_url, total_files) if not status_url \
+          else poll_job_status(job_id, token, base_url, total_files)  # Backward compat if poller doesn't accept URL yet

If you’re open to touching the poller, pass status_url explicitly and only compute a fallback when it’s missing.


249-252: Avoid closing file handles twice.

You close in the normal path and again in finally. Keep only the finally to simplify and remove benign warnings.

-        # Close all file handles
-        for _, file_tuple in files_data:
-            file_tuple[1].close()

191-199: Comment and behavior diverge.

Comment says "filter out files over 20 minutes" but the filter is removed (intentionally per PR summary). Update the comment to avoid confusion.

-    # Check duration and filter out files over 20 minutes
+    # Check duration for logging; no duration-based filtering applied
CLAUDE.md (2)

662-671: Sanitize example tokens to stop secret scanners from flagging false positives.

Replace bearer tokens and JWT-like strings with obvious placeholders.

-    # Step 2: Get auth token
+    # Step 2: Get auth token
     # Step 3: Use token in file upload
-    curl -X POST \
-      -H "Authorization: Bearer YOUR_TOKEN_HERE" \
+    curl -X POST \
+      -H "Authorization: Bearer <TOKEN>" \
       -F "files=@/path/to/audio.wav" \
       -F "device_name=test-upload" \
       http://localhost:8000/api/process-audio-files

Also replace all occurrences of eyJ... example JWTs with <TOKEN> and headers with Authorization: Bearer <TOKEN> across this section to appease Gitleaks.


741-746: Docs/routes mismatch for transcript activation.

Frontend calls POST /api/conversations/{audio_uuid}/activate-transcript/{version_id}; this doc shows a JSON body with -d and no path param. Align docs with the implemented route.

-  -d '{"transcript_version_id": "VERSION_ID"}' \
-  http://localhost:8000/api/conversations/YOUR_AUDIO_UUID/activate-transcript
+  http://localhost:8000/api/conversations/YOUR_AUDIO_UUID/activate-transcript/VERSION_ID

Do the same for memory activation for consistency.

extras/asr-services/enhanced_chunking.py (5)

62-69: Silence linter about unused parameter without changing parent signature.

Match parent signature but explicitly mark keep_logits unused.

-    def _get_batch_preds(self, keep_logits=False):
+    def _get_batch_preds(self, keep_logits=False):
+        _ = keep_logits  # unused, required by parent signature

146-149: Prefer logging.exception and bare raise for trace preservation.

Keep the original traceback and avoid shadowing the exception.

-        except Exception as e:
-            logger.error(f"Hypothesis joining FAILED: {e}")
-            raise e  # Don't silently fall back
+        except Exception:
+            logger.exception("Hypothesis joining FAILED")
+            raise

Apply similarly to other catch blocks in this module.


295-355: Timestamp extraction: fallbacks are OK; prefer the native utilities when available.

This function duplicates logic already covered by extract_timestamps_from_hypotheses_native. Consider consolidating to one path and removing the blind except Exception swallowing at the end.

-    except Exception as e:
-        logger.error(f"Native timestamp extraction FAILED: {e}")
-        raise e  # Don't silently fall back
+    except Exception:
+        logger.exception("Native timestamp extraction FAILED")
+        raise

And either remove extract_timestamps_from_hypotheses or make it a thin wrapper that calls the native variant and only falls back when strictly necessary.


489-512: Remove f-strings without placeholders.

These trigger linters; they’re plain strings.

-            logger.debug(f"Initializing NeMo chunked processor...")
+            logger.debug("Initializing NeMo chunked processor...")
 ...
-            logger.debug(f"Loading audio file into chunker...")
+            logger.debug("Loading audio file into chunker...")

Repeat for the other similar lines later in this function.


589-599: Use logging.exception and re-raise.

Avoid losing stack traces on failure.

-        logger.error(f"Enhanced chunking failed: {e}")
-        raise
+        logger.exception("Enhanced chunking failed")
+        raise
extras/asr-services/parakeet-offline.py (5)

137-162: Portability: Avoid double-opening the same NamedTemporaryFile handle

You open wave.open(tmpfile_name, 'wb') while the NamedTemporaryFile handle is still open. Works on Linux, brittle on Windows. Close before wave.open or don’t keep the handle.

-with tempfile.NamedTemporaryFile(suffix='.wav', delete=False) as tmpfile:
-    tmpfile_name = tmpfile.name
+with tempfile.NamedTemporaryFile(suffix='.wav', delete=False) as tmpfile:
+    tmpfile_name = tmpfile.name
+    pass  # only to reserve a unique path
+# Reopen separately to write WAV
 with wave.open(tmpfile_name, 'wb') as wav_file:

243-264: Harden word timestamp extraction

Directly indexing result.timestamp['word'] can raise if structure differs. Guard presence to avoid 500s.

-# Extract word-level timestamps - NeMo Parakeet format
-words = []
-for word_data in result.timestamp['word']:
+words = []
+ts = getattr(result, "timestamp", None)
+if isinstance(ts, dict) and "word" in ts and isinstance(ts["word"], list):
+  for word_data in ts["word"]:
     word_dict = {
       "word": word_data['word'],
       "start": word_data['start'],
       "end": word_data['end'],
       "confidence": 1.0,
     }
     words.append(word_dict)

286-290: Prefer bare raise and include traceback in logs

Use logger.exception for stack and “raise” to preserve context.

-except Exception as e:
-    logger.error(f"Error during transcription: {e}")
-    # Re-raise the exception so HTTP endpoint can return proper error code
-    raise e
+except Exception:
+    logger.exception("Error during transcription")
+    # Re-raise the exception so HTTP endpoint can return proper error code
+    raise

557-561: Return richer trace on errors in /transcribe handler

Same: use logger.exception for stack.

-except Exception as e:
-    error_time = time.time() - request_start
-    logger.error(f"🕐 TIMING: Error occurred after {error_time:.3f}s - {e}")
-    raise HTTPException(status_code=500, detail=str(e))
+except Exception:
+    error_time = time.time() - request_start
+    logger.exception(f"🕐 TIMING: Error occurred after {error_time:.3f}s")
+    raise HTTPException(status_code=500, detail="Transcription failed")

273-281: Reduce transcript content logging to avoid PII leakage and log bloat

Logging first 100 chars and full keys at INFO can leak user content and inflate logs. Gate behind DEBUG.

-logger.info(f"🔍 PARAKEET RESPONSE DEBUG:")
-logger.info(f"  - Text length: {len(response_data['text'])}")
-logger.info(f"  - First 100 chars: {repr(response_data['text'][:100])}")
-logger.info(f"  - Words count: {len(response_data['words'])}")
-logger.info(f"  - First 3 words: {response_data['words'][:3] if response_data['words'] else 'None'}")
-logger.info(f"  - Full response keys: {list(response_data.keys())}")
+logger.debug("🔍 PARAKEET RESPONSE DEBUG:")
+logger.debug(f"  - Text length: {len(response_data['text'])}")
+logger.debug(f"  - First 100 chars: {repr(response_data['text'][:100])}")
+logger.debug(f"  - Words count: {len(response_data['words'])}")
+logger.debug(f"  - First 3 words: {response_data['words'][:3] if response_data['words'] else 'None'}")
+logger.debug(f"  - Full response keys: {list(response_data.keys())}")
backends/advanced/src/advanced_omi_backend/transcription_providers.py (2)

555-558: Preserve traceback: use bare raise

Re-raising with “raise e” drops context.

-except Exception as e:
-    logger.error(f"Error calling Parakeet service: {e}")
-    raise e
+except Exception:
+    logger.exception("Error calling Parakeet service")
+    raise

529-546: Consider dynamic timeouts for long audio uploads

Hard‑coded 180s may be short for multi‑minute WAVs; you already use size‑based timeouts for Deepgram. Mirror that approach here.

If desired, I can propose a size‑aware httpx.Timeout similar to Deepgram’s path.

backends/advanced/src/advanced_omi_backend/transcript_coordinator.py (1)

87-106: Thread-safety/readability nits

Setting transcript_failures without lock is okay for CPython but wrap in lock for parity with event creation and to ease future maintenance.

-    def signal_transcript_failed(self, audio_uuid: str, error_message: str):
+    async def signal_transcript_failed(self, audio_uuid: str, error_message: str):
@@
-        # Store the failure message
-        self.transcript_failures[audio_uuid] = error_message
+        async with self._lock:
+            self.transcript_failures[audio_uuid] = error_message

Note: adjust call sites to await.

backends/advanced/src/advanced_omi_backend/transcription.py (6)

101-118: Propagate full traceback and keep coordinator signaling

Use logger.exception for context. This also aligns with TRY400.

-        except Exception as e:
-            # Signal transcription failure
-            logger.error(f"Transcription failed for {self._current_audio_uuid}: {e}")
+        except Exception:
+            # Signal transcription failure
+            logger.exception(f"Transcription failed for {self._current_audio_uuid}")
             if self._current_audio_uuid:
                 # Update database status to FAILED
                 if self.chunk_repo:
                     await self.chunk_repo.update_transcription_status(
-                        self._current_audio_uuid, "FAILED", error_message=str(e)
+                        self._current_audio_uuid, "FAILED", error_message="transcription failed"
                     )
                 # Signal coordinator about failure
                 coordinator = get_transcript_coordinator()
-                coordinator.signal_transcript_failed(self._current_audio_uuid, str(e))
+                coordinator.signal_transcript_failed(self._current_audio_uuid, "transcription failed")

147-154: Preserve traceback on provider error

Re-raising with “raise e” loses context.

-        except Exception as e:
-            logger.error(f"Error getting transcript from {self.provider.name}: {e}")
+        except Exception:
+            logger.exception(f"Error getting transcript from {self.provider.name}")
             # Clean up buffer before re-raising
             self._audio_buffer.clear()
             self._audio_start_time = None
             self._collecting = False
-            raise e
+            raise

271-299: Segment storage: absolute_timestamp derivation is weak

Using time.time() + start yields non-deterministic absolute timestamps. Prefer deriving from the first chunk’s timestamp or persisted conversation start.

If the chunk_repo exposes a base timestamp, I can wire it here. Otherwise, consider storing relative timestamps only.


59-62: Comment drift

Comment says memory processing is handled exclusively by conversation closure and method removed, but _queue_memory_processing still exists and is called (later). Update comments/code to match.


559-567: Reconnect path: add jitter/backoff

On repeated provider errors, a fixed 2s wait can hot‑loop. Consider exponential backoff with cap.


353-361: Normalization robustness

If provider returns unexpected types, you return empty dict silently. Consider logging at WARN and include provider name to ease debugging.

backends/advanced/src/advanced_omi_backend/routers/modules/conversation_routes.py (1)

95-101: Consider making transcript_version_id optional with a sensible default.

Currently, transcript_version_id is required (Query(...)), but given that users might want to reprocess the currently active transcript version, consider making it optional and defaulting to the active version in the controller.

Apply this diff to make the parameter optional with a default:

 async def reprocess_memory(
     audio_uuid: str,
     current_user: User = Depends(current_active_user),
-    transcript_version_id: str = Query(...)
+    transcript_version_id: str = Query(None, description="Transcript version ID (defaults to active version)")
 ):
backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py (5)

340-383: Remove excessive debug logging in production code.

The extensive debug logging (Lines 340-383) appears to be troubleshooting code that should not be in production. This level of logging could impact performance and fill up logs unnecessarily.

Apply this diff to remove the debug logging:

 async def delete_conversation(audio_uuid: str, user: User):
     """Delete a conversation and its associated audio file. Users can only delete their own conversations."""
     try:
-        # DEBUG: Log comprehensive details about the search
-        logger.info(f"DELETE DEBUG: Starting delete for audio_uuid='{audio_uuid}'")
-        logger.info(f"DELETE DEBUG: audio_uuid type={type(audio_uuid)}, length={len(audio_uuid)}")
-        logger.info(f"DELETE DEBUG: audio_uuid repr={repr(audio_uuid)}")
-        logger.info(f"DELETE DEBUG: audio_uuid bytes={audio_uuid.encode('utf-8').hex()}")
-        logger.info(f"DELETE DEBUG: Database collection={chunks_col.name}, database={chunks_col.database.name}")
-        
-        # DEBUG: Test if ANY conversations exist in the collection
-        total_count = await chunks_col.count_documents({})
-        logger.info(f"DELETE DEBUG: Total conversations in collection: {total_count}")
-        
-        # DEBUG: Try to find a few conversations to compare
-        sample_conversations = []
-        async for doc in chunks_col.find({}).limit(3):
-            sample_conversations.append({
-                "audio_uuid": doc.get("audio_uuid"),
-                "audio_uuid_type": type(doc.get("audio_uuid")),
-                "audio_uuid_repr": repr(doc.get("audio_uuid"))
-            })
-        logger.info(f"DELETE DEBUG: Sample conversations: {sample_conversations}")
-        
-        # DEBUG: Execute the exact query we're about to use
-        query = {"audio_uuid": audio_uuid}
-        logger.info(f"DELETE DEBUG: Query being executed: {query}")
-        
         # First, get the conversation to check ownership
-        conversation = await chunks_col.find_one(query)
-        
-        # DEBUG: Log the result
-        logger.info(f"DELETE DEBUG: Query result: {conversation is not None}")
-        if conversation:
-            logger.info(f"DELETE DEBUG: Found conversation with client_id={conversation.get('client_id')}")
-        else:
-            # DEBUG: Try alternative queries to see what might work
-            logger.info("DELETE DEBUG: Trying alternative queries...")
-            
-            # Try with regex (case insensitive)
-            regex_result = await chunks_col.find_one({"audio_uuid": {"$regex": f"^{audio_uuid}$", "$options": "i"}})
-            logger.info(f"DELETE DEBUG: Case-insensitive regex query result: {regex_result is not None}")
-            
-            # Try to find any conversation containing this uuid as substring
-            contains_result = await chunks_col.find_one({"audio_uuid": {"$regex": audio_uuid}})
-            logger.info(f"DELETE DEBUG: Contains substring query result: {contains_result is not None}")
+        conversation = await chunks_col.find_one({"audio_uuid": audio_uuid})

420-429: Use logger.exception for better error tracking.

When catching exceptions during file deletion, use logger.exception instead of logger.warning to preserve the full stack trace.

Apply this diff to improve error logging:

             except Exception as e:
-                logger.warning(f"Failed to delete audio file {audio_path}: {e}")
+                logger.exception(f"Failed to delete audio file {audio_path}")

         if cropped_audio_path:
             try:
                 # Construct full path to cropped audio file
                 full_cropped_path = Path("/app/audio_chunks") / cropped_audio_path
                 if full_cropped_path.exists():
                     full_cropped_path.unlink()
                     deleted_files.append(str(full_cropped_path))
                     logger.info(f"Deleted cropped audio file: {full_cropped_path}")
             except Exception as e:
-                logger.warning(f"Failed to delete cropped audio file {cropped_audio_path}: {e}")
+                logger.exception(f"Failed to delete cropped audio file {cropped_audio_path}")

Also applies to: 431-439


503-509: Consider extracting configuration values to constants.

The hardcoded transcription provider "deepgram" and other configuration values should be extracted to configuration settings or constants for maintainability.

Consider creating a configuration module or using environment variables:

# At the top of the file
DEFAULT_TRANSCRIPTION_PROVIDER = os.getenv("TRANSCRIPTION_PROVIDER", "deepgram")

# Then in the function:
config_data = {
    "audio_path": str(full_audio_path),
    "transcription_provider": DEFAULT_TRANSCRIPTION_PROVIDER,
    "trigger": "manual_reprocess"
}

550-633: Handle optional transcript_version_id parameter.

Following the suggestion for the route, update the controller to handle an optional transcript_version_id parameter that defaults to the active version.

Apply this diff to handle optional parameter:

-async def reprocess_memory(audio_uuid: str, transcript_version_id: str, user: User):
+async def reprocess_memory(audio_uuid: str, transcript_version_id: Optional[str], user: User):
     """Reprocess memory extraction for a specific transcript version. Users can only reprocess their own conversations."""
     try:
         # Find the conversation
         chunk = await chunks_col.find_one({"audio_uuid": audio_uuid})
         if not chunk:
             return JSONResponse(status_code=404, content={"error": "Conversation not found"})

         # Check ownership for non-admin users
         if not user.is_superuser:
             if not client_belongs_to_user(chunk["client_id"], user.user_id):
                 return JSONResponse(status_code=404, content={"error": "Conversation not found"})

+        # Default to active version if not specified
+        if not transcript_version_id:
+            transcript_version_id = chunk.get("active_transcript_version")
+            if not transcript_version_id:
+                return JSONResponse(
+                    status_code=404, content={"error": "No active transcript version found"}
+                )
+
         # Resolve transcript version ID
         transcript_versions = chunk.get("transcript_versions", [])

         # Handle special "active" version ID
         if transcript_version_id == "active":
             active_version_id = chunk.get("active_transcript_version")
             if not active_version_id:
                 return JSONResponse(
                     status_code=404, content={"error": "No active transcript version found"}
                 )
             transcript_version_id = active_version_id

531-533: Track TODO comments for future implementation.

There are TODO comments indicating incomplete integration with ProcessorManager and memory processor. These should be tracked as issues.

Would you like me to create GitHub issues to track these TODO items for integrating with the ProcessorManager and memory processor?

Also applies to: 616-618

backends/advanced/src/advanced_omi_backend/database.py (1)

326-327: Use explicit Optional type annotations.

PEP 484 prohibits implicit Optional. Parameters with default value None should use Optional[T] type annotation.

Apply this diff to fix the type annotations:

     async def update_memory_processing_status(
-        self, audio_uuid: str, status: str, error_message: str = None
+        self, audio_uuid: str, status: str, error_message: Optional[str] = None
     ):

     async def update_transcription_status(
-        self, audio_uuid: str, status: str, provider: str = None, error_message: str = None
+        self, audio_uuid: str, status: str, provider: Optional[str] = None, error_message: Optional[str] = None
     ):

     async def create_transcript_version(
         self,
         audio_uuid: str,
-        segments: list = None,
-        processing_run_id: str = None,
-        provider: str = None,
-        raw_data: dict = None
+        segments: Optional[list] = None,
+        processing_run_id: Optional[str] = None,
+        provider: Optional[str] = None,
+        raw_data: Optional[dict] = None
     ) -> Optional[str]:

     async def create_memory_version(
         self,
         audio_uuid: str,
         transcript_version_id: str,
-        memories: list = None,
-        processing_run_id: str = None
+        memories: Optional[list] = None,
+        processing_run_id: Optional[str] = None
     ) -> Optional[str]:

     async def create_run(
         self,
         *,
         audio_uuid: str,
         run_type: str,  # 'transcript' or 'memory'
         user_id: str,
         trigger: str,  # 'manual_reprocess', 'initial_processing', etc.
-        config_hash: str = None
+        config_hash: Optional[str] = None
     ) -> str:

     async def update_run_status(
         self,
         run_id: str,
         status: str,
-        error_message: str = None,
-        result_version_id: str = None
+        error_message: Optional[str] = None,
+        result_version_id: Optional[str] = None
     ) -> bool:

Also applies to: 387-388, 538-541, 570-571, 692-692, 717-718

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 37d2e03 and 85139d9.

⛔ Files ignored due to path filters (2)
  • backends/advanced/uv.lock is excluded by !**/*.lock
  • extras/asr-services/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (18)
  • CLAUDE.md (4 hunks)
  • backends/advanced/scripts/delete_all_conversations_api.py (1 hunks)
  • backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py (4 hunks)
  • backends/advanced/src/advanced_omi_backend/controllers/system_controller.py (2 hunks)
  • backends/advanced/src/advanced_omi_backend/database.py (6 hunks)
  • backends/advanced/src/advanced_omi_backend/routers/modules/conversation_routes.py (2 hunks)
  • backends/advanced/src/advanced_omi_backend/transcript_coordinator.py (5 hunks)
  • backends/advanced/src/advanced_omi_backend/transcription.py (7 hunks)
  • backends/advanced/src/advanced_omi_backend/transcription_providers.py (1 hunks)
  • backends/advanced/upload_files.py (3 hunks)
  • backends/advanced/webui/src/pages/Conversations.tsx (6 hunks)
  • backends/advanced/webui/src/services/api.ts (1 hunks)
  • extras/asr-services/.dockerignore (1 hunks)
  • extras/asr-services/.env.template (1 hunks)
  • extras/asr-services/docker-compose.yml (1 hunks)
  • extras/asr-services/enhanced_chunking.py (1 hunks)
  • extras/asr-services/parakeet-offline.py (7 hunks)
  • extras/asr-services/pyproject.toml (1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-09-09T02:03:55.976Z
Learnt from: CR
PR: AnkushMalaker/friend-lite#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-09T02:03:55.976Z
Learning: Applies to **/.dockerignore : Docker projects use .dockerignore with an exclude pattern; explicitly include required files so they are sent to the Docker build context

Applied to files:

  • extras/asr-services/.dockerignore
📚 Learning: 2025-09-09T02:03:55.976Z
Learnt from: CR
PR: AnkushMalaker/friend-lite#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-09T02:03:55.976Z
Learning: Applies to backends/advanced/src/main.py : For /ws_pcm, implement full Wyoming protocol sessioning: only process audio chunks between audio-start and audio-stop events to define conversation segments

Applied to files:

  • backends/advanced/src/advanced_omi_backend/controllers/system_controller.py
📚 Learning: 2025-09-09T02:03:55.976Z
Learnt from: CR
PR: AnkushMalaker/friend-lite#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-09T02:03:55.976Z
Learning: Applies to backends/advanced/src/**/*.py : Store conversations in the MongoDB "audio_chunks" collection; transcripts must be arrays of segments with fields: text, speaker, start, end

Applied to files:

  • backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py
  • backends/advanced/src/advanced_omi_backend/database.py
  • backends/advanced/src/advanced_omi_backend/transcription.py
🪛 Ruff (0.12.2)
backends/advanced/src/advanced_omi_backend/transcription_providers.py

557-557: Use raise without specifying exception name

Remove exception name

(TRY201)

backends/advanced/scripts/delete_all_conversations_api.py

82-82: Loop control variable client_id not used within loop body

Rename unused client_id to _client_id

(B007)


121-121: f-string without any placeholders

Remove extraneous f prefix

(F541)

extras/asr-services/enhanced_chunking.py

62-62: Unused method argument: keep_logits

(ARG002)


145-145: Consider moving this statement to an else block

(TRY300)


147-147: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


148-148: Use raise without specifying exception name

Remove exception name

(TRY201)


296-296: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


354-354: Consider moving this statement to an else block

(TRY300)


357-357: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


358-358: Use raise without specifying exception name

Remove exception name

(TRY201)


425-425: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


449-449: Do not catch blind exception: Exception

(BLE001)


450-450: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


462-462: Do not catch blind exception: Exception

(BLE001)


463-463: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


466-466: Consider moving this statement to an else block

(TRY300)


468-468: Do not catch blind exception: Exception

(BLE001)


469-469: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


499-499: f-string without any placeholders

Remove extraneous f prefix

(F541)


515-515: f-string without any placeholders

Remove extraneous f prefix

(F541)


530-530: f-string without any placeholders

Remove extraneous f prefix

(F541)


541-541: f-string without any placeholders

Remove extraneous f prefix

(F541)


551-551: f-string without any placeholders

Remove extraneous f prefix

(F541)


562-562: f-string without any placeholders

Remove extraneous f prefix

(F541)


595-595: Consider moving this statement to an else block

(TRY300)


598-598: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

backends/advanced/src/advanced_omi_backend/routers/modules/conversation_routes.py

88-88: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


97-97: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


108-108: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


118-118: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


126-126: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


134-134: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)

backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py

343-343: Use explicit conversion flag

Replace with conversion flag

(RUF010)


427-427: Do not catch blind exception: Exception

(BLE001)


438-438: Do not catch blind exception: Exception

(BLE001)


452-452: Do not catch blind exception: Exception

(BLE001)


453-453: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


456-456: Use explicit conversion flag

Replace with conversion flag

(RUF010)


545-545: Do not catch blind exception: Exception

(BLE001)


546-546: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


630-630: Do not catch blind exception: Exception

(BLE001)


631-631: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


665-665: Do not catch blind exception: Exception

(BLE001)


666-666: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


697-697: Do not catch blind exception: Exception

(BLE001)


698-698: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


720-720: Do not catch blind exception: Exception

(BLE001)


721-721: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

extras/asr-services/parakeet-offline.py

148-148: Avoid specifying long messages outside the exception class

(TRY003)


196-196: Do not catch blind exception: Exception

(BLE001)


289-289: Use raise without specifying exception name

Remove exception name

(TRY201)


557-557: Do not catch blind exception: Exception

(BLE001)


559-559: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

backends/advanced/src/advanced_omi_backend/database.py

168-168: f-string without any placeholders

Remove extraneous f prefix

(F541)


326-326: PEP 484 prohibits implicit Optional

Convert to Optional[T]

(RUF013)


365-365: f-string without any placeholders

Remove extraneous f prefix

(F541)


366-366: f-string without any placeholders

Remove extraneous f prefix

(F541)


373-373: f-string without any placeholders

Remove extraneous f prefix

(F541)


387-387: PEP 484 prohibits implicit Optional

Convert to Optional[T]

(RUF013)


387-387: PEP 484 prohibits implicit Optional

Convert to Optional[T]

(RUF013)


428-428: f-string without any placeholders

Remove extraneous f prefix

(F541)


429-429: f-string without any placeholders

Remove extraneous f prefix

(F541)


434-434: f-string without any placeholders

Remove extraneous f prefix

(F541)


439-439: f-string without any placeholders

Remove extraneous f prefix

(F541)


454-454: Redefinition of unused update_transcript from line 270

(F811)


502-502: Redefinition of unused store_raw_transcript_data from line 185

(F811)


538-538: PEP 484 prohibits implicit Optional

Convert to Optional[T]

(RUF013)


539-539: PEP 484 prohibits implicit Optional

Convert to Optional[T]

(RUF013)


540-540: PEP 484 prohibits implicit Optional

Convert to Optional[T]

(RUF013)


541-541: PEP 484 prohibits implicit Optional

Convert to Optional[T]

(RUF013)


570-570: PEP 484 prohibits implicit Optional

Convert to Optional[T]

(RUF013)


571-571: PEP 484 prohibits implicit Optional

Convert to Optional[T]

(RUF013)


692-692: PEP 484 prohibits implicit Optional

Convert to Optional[T]

(RUF013)


717-717: PEP 484 prohibits implicit Optional

Convert to Optional[T]

(RUF013)


718-718: PEP 484 prohibits implicit Optional

Convert to Optional[T]

(RUF013)

backends/advanced/src/advanced_omi_backend/transcript_coordinator.py

58-58: Create your own exception

(TRY002)


58-58: Avoid specifying long messages outside the exception class

(TRY003)

backends/advanced/src/advanced_omi_backend/transcription.py

106-106: Do not catch blind exception: Exception

(BLE001)


108-108: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


153-153: Use raise without specifying exception name

Remove exception name

(TRY201)

🪛 Gitleaks (8.27.2)
CLAUDE.md

[high] 665-666: Discovered a potential authorization token provided in a curl command header, which could compromise the curl accessed resource.

(curl-auth-header)


[high] 698-698: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.

(generic-api-key)


[high] 704-704: Discovered a potential authorization token provided in a curl command header, which could compromise the curl accessed resource.

(curl-auth-header)


[high] 708-709: Discovered a potential authorization token provided in a curl command header, which could compromise the curl accessed resource.

(curl-auth-header)


[high] 721-721: Discovered a potential authorization token provided in a curl command header, which could compromise the curl accessed resource.

(curl-auth-header)


[high] 725-726: Discovered a potential authorization token provided in a curl command header, which could compromise the curl accessed resource.

(curl-auth-header)


[high] 731-732: Discovered a potential authorization token provided in a curl command header, which could compromise the curl accessed resource.

(curl-auth-header)


[high] 737-737: Discovered a potential authorization token provided in a curl command header, which could compromise the curl accessed resource.

(curl-auth-header)


[high] 741-742: Discovered a potential authorization token provided in a curl command header, which could compromise the curl accessed resource.

(curl-auth-header)

🪛 dotenv-linter (3.3.0)
extras/asr-services/.env.template

[warning] 9-9: [UnorderedKey] The PARAKEET_CONTAINER_PORT key should go before the PARAKEET_HOST_PORT key

(UnorderedKey)


[warning] 27-27: [EndingBlankLine] No blank line at the end of the file

(EndingBlankLine)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: integration-tests
🔇 Additional comments (16)
extras/asr-services/.dockerignore (1)

4-4: Good call un-ignoring enhanced_chunking.py.

This ensures the feature ships with the image given the blanket “*” ignore.

If enhanced_chunking.py imports any local helpers/modules, make sure those files are also explicitly un-ignored, or they won’t reach the build context.

backends/advanced/webui/src/pages/Conversations.tsx (2)

438-440: Nice debug detail.

The explicit [start, end, duration] readout is useful when validating cropping and diarization.


118-126: Confirmed: backend accepts 'active' as transcript_version_id
conversation_controller treats "active" as a sentinel and substitutes chunk["active_transcript_version"] (backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py:566–573).

backends/advanced/upload_files.py (2)

377-391: Good: CLI surfaced with base URL parameterization.

parse_args() is clear and covers the primary use-cases. No issues here.


289-293: Double-check job status route naming. Submission endpoint is POST /process-audio-files-async (backends/advanced/src/advanced_omi_backend/routers/modules/system_routes.py) but the controller returns "status_url": "/api/process-audio-files/jobs/{job_id}" (backends/advanced/src/advanced_omi_backend/controllers/system_controller.py:482). I couldn't find a route handler for /api/process-audio-files/jobs/{job_id}; confirm a polling GET exists or update the returned status_url to the actual polling endpoint.

backends/advanced/webui/src/services/api.ts (1)

78-87: Approve — reprocessing API routes & params verified

Router endpoints and controller signatures exist and align with the frontend; reprocess_memory accepts transcript_version_id. Files: backends/advanced/src/advanced_omi_backend/routers/modules/conversation_routes.py, backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py.

extras/asr-services/enhanced_chunking.py (1)

100-109: Align chunk offsets to featurizer frames; map encoder timesteps with feature_index = chunk_offset + t * subsampling_factor

extras/asr-services/enhanced_chunking.py (lines 100–109) — chunk_offsets should be cumulative featurizer frames (feat_signal_len). When joining RNNT word timestamps (encoder timesteps t) compute:
feature_index = chunk_feature_offset + t * subsampling_factor
time_seconds = feature_index * featurizer.window_stride_seconds
Do not divide chunk offsets by subsampling_factor; instead multiply encoder timesteps and ensure chunk_feature_offset is the cumulative feature frames (or floor(chunk_start_samples / window_stride_samples)). Confirm downstream timestamp-joining code uses this mapping.

backends/advanced/src/advanced_omi_backend/transcript_coordinator.py (1)

27-39: Use a specific exception type for transcript failures

Raising bare Exception complicates upstream handling. Introduce TranscriptionFailedError and raise that.

[ suggest_recommended_refactor ]

+class TranscriptionFailedError(Exception):
+    """Raised when transcription fails for a given audio UUID."""
+    pass
@@
-                raise Exception(f"Transcription failed: {error_msg}")
+                raise TranscriptionFailedError(error_msg)
extras/asr-services/parakeet-offline.py (1)

58-64: Config consistency: unused chunking constants

CHUNK_DURATION_SECONDS, OVERLAP_DURATION_SECONDS, CONFIDENCE_THRESHOLD are defined in extras/asr-services/parakeet-offline.py (lines 58–64) but not referenced — either pass them into the enhanced_chunking call (chunk duration, overlap, confidence) or remove them to avoid config drift.

To re-check usages run: rg -n --hidden -S 'CHUNK_DURATION_SECONDS|OVERLAP_DURATION_SECONDS|CONFIDENCE_THRESHOLD'

backends/advanced/src/advanced_omi_backend/transcription.py (1)

313-315: Cropping pipeline present — no change needed

processors.py defines AudioCroppingItem, cropping_queue, queue_cropping and starts a cropping processor; transcription enqueues via get_processor_manager().queue_cropping, so keeping AUDIO_CROPPING_ENABLED default "true" is acceptable.

backends/advanced/src/advanced_omi_backend/routers/modules/conversation_routes.py (2)

86-91: LGTM! Well-structured reprocessing endpoint.

The endpoint follows the established pattern with proper authentication and delegation to the controller layer.


104-111: Well-implemented version management endpoints.

All endpoints follow consistent patterns with proper authorization and clear documentation.

Also applies to: 114-121, 124-129, 132-137

backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py (1)

200-221: Good defensive programming for file path resolution.

The implementation properly checks multiple possible locations for the audio file and provides helpful error messages with searched paths. This is robust error handling.

backends/advanced/src/advanced_omi_backend/database.py (3)

66-98: Well-designed versioning architecture.

The implementation properly initializes version data when creating chunks and maintains backward compatibility with the existing transcript and memories fields. The approach of having active versions that sync to compatibility fields is elegant.

Also applies to: 108-122


594-628: Excellent version activation logic.

The activation methods properly verify version existence, update active pointers, and sync compatibility fields. This ensures backward compatibility while providing versioning capabilities.

Also applies to: 630-662


679-748: Well-structured ProcessingRunsRepository.

The repository provides comprehensive tracking of processing runs with appropriate fields for status, errors, and results. The implementation follows good patterns.

Comment on lines +681 to +691
# Check if transcription failed - immediately fail the job
if transcription_status == "FAILED":
audio_logger.error(
f"❌ [Job {job_id}] Transcription failed, marking file as failed: {filename}"
)
await job_tracker.update_file_status(
job_id, filename, FileStatus.FAILED,
error_message="Transcription failed"
)
break # Exit monitoring loop for this file

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

⚠️ Potential issue

Failure path triggers is correct; avoid post-failure memory triggers later

You break out on FAILED here, but later still call close_current_conversation (Lines 727–729), which can enqueue memory despite failure.

-                            if transcription_status == "FAILED":
+                            if transcription_status == "FAILED":
                                 audio_logger.error(
                                     f"❌ [Job {job_id}] Transcription failed, marking file as failed: {filename}"
                                 )
                                 await job_tracker.update_file_status(
                                     job_id, filename, FileStatus.FAILED, 
                                     error_message="Transcription failed"
                                 )
-                                break  # Exit monitoring loop for this file
+                                memory_triggered = True  # prevent memory close later
+                                break  # Exit monitoring loop for this file

And guard the unconditional close near the end:

-# Signal end of conversation - trigger memory processing
-await client_state.close_current_conversation()
+if not memory_triggered:
+    await client_state.close_current_conversation()

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In backends/advanced/src/advanced_omi_backend/controllers/system_controller.py
around lines 681-691 and 727-729, the code breaks out of the monitoring loop
when transcription_status == "FAILED" but later still unconditionally calls
close_current_conversation (lines ~727-729) which can enqueue memory for a
failed file; update the logic to skip or return before calling
close_current_conversation when the file/job is marked FAILED (e.g., check
transcription_status or query the file status and only call
close_current_conversation when not FAILED), ensuring no memory enqueuing
happens after a failure.

Comment on lines +83 to +95
# Temporarily disable timestamps to avoid char_offsets error
old_compute_timestamps = getattr(self.asr_model.decoding, 'compute_timestamps', False)
self.asr_model.decoding.compute_timestamps = False

hypotheses = self.asr_model.decoding.rnnt_decoder_predictions_tensor(
encoder_output=encoded, encoded_lengths=encoded_len,
return_hypotheses=True # Get hypothesis objects even without timestamps
)

# Restore original setting
self.asr_model.decoding.compute_timestamps = old_compute_timestamps

# Store hypotheses with chunk offset tracking
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Timestamps disabled during decode defeats timestamp preservation.

You set compute_timestamps=True in __init__ but immediately disable it around decode. As a result, hypotheses may lack words/timestamp data and downstream merging/extraction devolves to heuristics. Keep timestamps enabled for RNNT word timing.

-            # Temporarily disable timestamps to avoid char_offsets error
-            old_compute_timestamps = getattr(self.asr_model.decoding, 'compute_timestamps', False)
-            self.asr_model.decoding.compute_timestamps = False
-
             hypotheses = self.asr_model.decoding.rnnt_decoder_predictions_tensor(
                 encoder_output=encoded, encoded_lengths=encoded_len,
                 return_hypotheses=True  # Get hypothesis objects even without timestamps
             )
-
-            # Restore original setting
-            self.asr_model.decoding.compute_timestamps = old_compute_timestamps
+            # Note: compute_timestamps remains enabled (set in __init__) with rnnt_timestamp_type='word'

If you were working around char-offset issues, keeping rnnt_timestamp_type='word' avoids char offsets while preserving word timestamps.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Temporarily disable timestamps to avoid char_offsets error
old_compute_timestamps = getattr(self.asr_model.decoding, 'compute_timestamps', False)
self.asr_model.decoding.compute_timestamps = False
hypotheses = self.asr_model.decoding.rnnt_decoder_predictions_tensor(
encoder_output=encoded, encoded_lengths=encoded_len,
return_hypotheses=True # Get hypothesis objects even without timestamps
)
# Restore original setting
self.asr_model.decoding.compute_timestamps = old_compute_timestamps
# Store hypotheses with chunk offset tracking
hypotheses = self.asr_model.decoding.rnnt_decoder_predictions_tensor(
encoder_output=encoded, encoded_lengths=encoded_len,
return_hypotheses=True # Get hypothesis objects even without timestamps
)
# Note: compute_timestamps remains enabled (set in __init__) with rnnt_timestamp_type='word'
# Store hypotheses with chunk offset tracking
🤖 Prompt for AI Agents
In extras/asr-services/enhanced_chunking.py around lines 83 to 95, the code
temporarily sets self.asr_model.decoding.compute_timestamps = False before
calling rnnt_decoder_predictions_tensor which removes word timestamp data;
revert this change so compute_timestamps remains True during decoding, and if
char-offset issues were the concern set the RNN-T timestamp mode to word-level
(e.g. rnnt_timestamp_type='word' or the model's equivalent) so you preserve word
timing while avoiding char offsets; remove the temporary disable/restore and
ensure the decoder call is configured to produce word timestamps.

Comment on lines +136 to +163
# Save audio to temporary file for NeMo processing
with tempfile.NamedTemporaryFile(suffix='.wav', delete=False) as tmpfile:
tmpfile_name = tmpfile.name

# Convert audio chunks to numpy array and save as WAV
audio_arrays = []
for chunk in speech:
if chunk.width == 2:
audio_array = np.frombuffer(chunk.audio, dtype=np.int16).astype(np.float32) / np.iinfo(np.int16).max
elif chunk.width == 4:
audio_array = np.frombuffer(chunk.audio, dtype=np.int32).astype(np.float32) / np.iinfo(np.int32).max
else:
raise ValueError(f"Unsupported width: {chunk.width}")
audio_arrays.append(audio_array)

# Concatenate and save as WAV file
full_audio = np.concatenate(audio_arrays)

# Convert to int16 for WAV format
audio_int16 = (full_audio * np.iinfo(np.int16).max).astype(np.int16)

with wave.open(tmpfile_name, 'wb') as wav_file:
wav_file.setnchannels(1) # Mono
wav_file.setsampwidth(2) # 16-bit
wav_file.setframerate(self._rate)
wav_file.writeframes(audio_int16.tobytes())

try:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Critical: Chunked path writes 16 kHz WAV without enforcing input format → timestamp drift

_transcribe_chunked writes with self._rate=16k but doesn’t assert/resample incoming AudioChunks. If any chunk has rate≠16k, channels≠1, or width≠2, the saved WAV will be temporally wrong (speed/pitch), corrupting timestamps returned by the chunker.

Apply format validation and on‑the‑fly resampling before concatenation:

 async def _transcribe_chunked(self, speech: Sequence[AudioChunk]) -> dict:
   """Chunked transcription method for long audio."""
   logger.info("Using chunked transcription for long audio")

-  # Save audio to temporary file for NeMo processing
+  # Normalize to 16kHz/mono/16‑bit if needed
+  needs_resampling = any(
+      (c.rate != self._rate) or (c.channels != 1) or (c.width != 2) for c in speech
+  )
+  if needs_resampling:
+      logger.info("Resampling chunks to 16kHz mono for chunked transcription")
+      resampler = ResamplingBlock(resample_rate=self._rate, resample_channels=1)
+      speech_16k: List[AudioChunk] = []
+      for c in speech:
+          async for rc in resampler.process_chunk(c):
+              speech_16k.append(rc)
+      speech = speech_16k
+
+  # Save audio to temporary file for NeMo processing
   with tempfile.NamedTemporaryFile(suffix='.wav', delete=False) as tmpfile:
       tmpfile_name = tmpfile.name
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Save audio to temporary file for NeMo processing
with tempfile.NamedTemporaryFile(suffix='.wav', delete=False) as tmpfile:
tmpfile_name = tmpfile.name
# Convert audio chunks to numpy array and save as WAV
audio_arrays = []
for chunk in speech:
if chunk.width == 2:
audio_array = np.frombuffer(chunk.audio, dtype=np.int16).astype(np.float32) / np.iinfo(np.int16).max
elif chunk.width == 4:
audio_array = np.frombuffer(chunk.audio, dtype=np.int32).astype(np.float32) / np.iinfo(np.int32).max
else:
raise ValueError(f"Unsupported width: {chunk.width}")
audio_arrays.append(audio_array)
# Concatenate and save as WAV file
full_audio = np.concatenate(audio_arrays)
# Convert to int16 for WAV format
audio_int16 = (full_audio * np.iinfo(np.int16).max).astype(np.int16)
with wave.open(tmpfile_name, 'wb') as wav_file:
wav_file.setnchannels(1) # Mono
wav_file.setsampwidth(2) # 16-bit
wav_file.setframerate(self._rate)
wav_file.writeframes(audio_int16.tobytes())
try:
# Normalize to 16kHz/mono/16-bit if needed
needs_resampling = any(
(c.rate != self._rate) or (c.channels != 1) or (c.width != 2) for c in speech
)
if needs_resampling:
logger.info("Resampling chunks to 16kHz mono for chunked transcription")
resampler = ResamplingBlock(resample_rate=self._rate, resample_channels=1)
speech_16k: List[AudioChunk] = []
for c in speech:
async for rc in resampler.process_chunk(c):
speech_16k.append(rc)
speech = speech_16k
# Save audio to temporary file for NeMo processing
with tempfile.NamedTemporaryFile(suffix='.wav', delete=False) as tmpfile:
tmpfile_name = tmpfile.name
# Convert audio chunks to numpy array and save as WAV
audio_arrays = []
for chunk in speech:
if chunk.width == 2:
audio_array = np.frombuffer(chunk.audio, dtype=np.int16).astype(np.float32) / np.iinfo(np.int16).max
elif chunk.width == 4:
audio_array = np.frombuffer(chunk.audio, dtype=np.int32).astype(np.float32) / np.iinfo(np.int32).max
else:
raise ValueError(f"Unsupported width: {chunk.width}")
audio_arrays.append(audio_array)
# Concatenate and save as WAV file
full_audio = np.concatenate(audio_arrays)
# Convert to int16 for WAV format
audio_int16 = (full_audio * np.iinfo(np.int16).max).astype(np.int16)
with wave.open(tmpfile_name, 'wb') as wav_file:
wav_file.setnchannels(1) # Mono
wav_file.setsampwidth(2) # 16-bit
wav_file.setframerate(self._rate)
wav_file.writeframes(audio_int16.tobytes())
try:
🧰 Tools
🪛 Ruff (0.12.2)

148-148: Avoid specifying long messages outside the exception class

(TRY003)

@AnkushMalaker
Copy link
Collaborator Author

@coderabbitai review

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Sep 18, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@AnkushMalaker AnkushMalaker merged commit ef59d13 into main Sep 18, 2025
2 checks passed
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (6)
CLAUDE.md (1)

840-851: Docs/API mismatch: activation endpoints differ from router

Docs show POST /api/conversations/{conversation_id}/activate-transcript and /activate-memory with JSON bodies. Router implements POST /{conversation_id}/activate-transcript/{version_id} and /activate-memory/{version_id} (path params, no body). Align one side.

Apply one of:

  • Update docs to include {version_id} in path and remove body payloads.
  • Or change routes to accept JSON payloads as documented.
backends/advanced/src/advanced_omi_backend/database.py (3)

20-23: Mongo DB selection API misuse

AsyncIOMotorClient.get_default_database() doesn’t take a name; this call will crash. Use get_database(name) or subscript.

-MONGODB_URI = os.getenv("MONGODB_URI", "mongodb://mongo:27017")
-mongo_client = AsyncIOMotorClient(MONGODB_URI)
-db = mongo_client.get_default_database("friend-lite")
+MONGODB_URI = os.getenv("MONGODB_URI", "mongodb://mongo:27017")
+mongo_client = AsyncIOMotorClient(MONGODB_URI)
+# If the URI doesn't embed a default DB, select explicitly:
+db = mongo_client.get_database("friend-lite")

273-300: Legacy-only updates diverge from versioned source of truth

update_transcript/update_segment_timing/update_segment_speaker modify legacy transcript only. Callers reading via active version won't see edits.

-        await self.col.update_one(
-            {"audio_uuid": audio_uuid}, {"$set": {"transcript": full_transcript}}
-        )
+        # Update both: active version's segments and legacy field
+        chunk = await self.get_chunk(audio_uuid)
+        active = chunk.get("active_transcript_version")
+        await self.col.update_one(
+            {"audio_uuid": audio_uuid},
+            {
+                "$set": {"transcript": full_transcript},
+                **(
+                    {"$set": {f"transcript_versions.$[v].segments": full_transcript}}
+                    if active else {}
+                ),
+            },
+            array_filters=([{"v.version_id": active}] if active else None),
+        )

Repeat the same pattern for timing/speaker setters targeting transcript_versions.$[v].segments.{index}.


188-201: Missing helper used by TranscriptionManager

TranscriptionManager calls chunk_repo.update_transcription_status(...), but this repo doesn't provide it anymore per your summary. Reintroduce a thin compatibility method.

 class AudioChunksRepository:
@@
     async def store_raw_transcript_data(self, audio_uuid, raw_data, provider):
         """Store raw transcript data from transcription provider."""
         await self.col.update_one(
             {"audio_uuid": audio_uuid},
             {
                 "$set": {
                     "raw_transcript_data": {
                         "provider": provider,
                         "data": raw_data,
                         "stored_at": datetime.now(UTC).isoformat(),
                     }
                 }
             },
         )
+
+    async def update_transcription_status(
+        self,
+        audio_uuid: str,
+        status: str,
+        *,
+        provider: Optional[str] = None,
+        error_message: Optional[str] = None,
+    ) -> bool:
+        """Compatibility: update chunk-level transcription status fields."""
+        update = {
+            "transcription_status": status,
+            "transcription_updated_at": datetime.now(UTC).isoformat(),
+        }
+        if provider:
+            update["transcription_provider"] = provider
+        if error_message:
+            update["transcription_error"] = error_message
+        result = await self.col.update_one({"audio_uuid": audio_uuid}, {"$set": update})
+        return result.modified_count > 0
backends/advanced/src/advanced_omi_backend/transcription.py (1)

206-215: Calls missing repo method: update_transcription_status

This method no longer existed; add it back (see proposed repo diff) or switch to ConversationsRepository.update_transcript_processing_status when a conversation_id exists.

backends/advanced/Docs/architecture.md (1)

963-974: Docs/Routes mismatch: activation endpoints

List shows POST /{conversation_id}/activate-transcript and /activate-memory (no version_id). Router uses /activate-transcript/{version_id} and /activate-memory/{version_id}. Align docs to code or vice versa.

-│   ├── /{conversation_id}/activate-transcript   # Switch transcript version
-│   └── /{conversation_id}/activate-memory       # Switch memory version
+│   ├── /{conversation_id}/activate-transcript/{version_id}   # Switch transcript version
+│   └── /{conversation_id}/activate-memory/{version_id}       # Switch memory version
🧹 Nitpick comments (15)
CLAUDE.md (2)

690-740: Versioned payload examples: field names drift

Examples use "transcript" inside transcript_versions objects, while code uses "segments". Standardize field naming across docs and code to avoid client breakage.


877-955: Markdown lint: add languages on unlabeled code fences and blank lines around tables

A few fenced code blocks lack language; the comparison table needs blank lines. This fixes MD040/MD058.

backends/advanced/src/advanced_omi_backend/database.py (4)

129-179: Active-version writes update legacy list but not version metadata

add_transcript_segment pushes to transcript_versions[active].segments and legacy transcript. Good. Consider also updating version’s status/provider when segments arrive (e.g., to IN_PROGRESS/COMPLETED) for observability.


301-326: Timestamp type inconsistency (datetime vs ISO strings)

You mix datetime objects (cropped_at) with ISO strings elsewhere (created_at, updated_at). Pick one (suggest ISO strings for JSON).

-                    "cropped_at": datetime.now(UTC),
+                    "cropped_at": datetime.now(UTC).isoformat(),

Also standardize updated_at fields across repo.


328-388: Status updates: type consistency and missing completed_at

updated_at uses datetime object while other timestamps are ISO strings; completed_at stored only when status == COMPLETED (good) but updated_at type differs.

-                f"memory_versions.$[version].updated_at": datetime.now(UTC),
-                "memory_processing_status": status,
-                "memory_processing_updated_at": datetime.now(UTC).isoformat(),
+                f"memory_versions.$[version].updated_at": datetime.now(UTC).isoformat(),
+                "memory_processing_status": status,
+                "memory_processing_updated_at": datetime.now(UTC).isoformat(),

544-552: Type hints: implicit Optional[list]

Parameters like segments: list = None violate RUF013. Use Optional[List[T]].

-        segments: list = None,
+        segments: Optional[list] = None,

Apply similarly to raw_data, memories, processing_run_id, provider, etc.

backends/advanced/webui/src/pages/Conversations.tsx (2)

88-94: Dropdown closes on any click, including inside

You’re closing on document click unconditionally; menu actions still work but this can cause flicker. Guard for outside clicks via a ref.contains check.

-  useEffect(() => {
-    const handleClickOutside = () => setOpenDropdown(null)
-    document.addEventListener('click', handleClickOutside)
-    return () => document.removeEventListener('click', handleClickOutside)
-  }, [])
+  const dropdownRef = useRef<HTMLDivElement | null>(null)
+  useEffect(() => {
+    const onDocClick = (e: MouseEvent) => {
+      if (openDropdown && dropdownRef.current && !dropdownRef.current.contains(e.target as Node)) {
+        setOpenDropdown(null)
+      }
+    }
+    document.addEventListener('click', onDocClick)
+    return () => document.removeEventListener('click', onDocClick)
+  }, [openDropdown])

And set ref on the menu container.


381-442: Add a11y labels to icon-only buttons

Provide aria-labels for the More menu and items to improve accessibility.

-                  <button
+                  <button
+                    aria-label="Conversation options"
@@
-                      <button
+                      <button
+                        aria-label="Reprocess transcript"
@@
-                      <button
+                      <button
+                        aria-label="Reprocess memory"
@@
-                      <button
+                      <button
+                        aria-label="Delete conversation"
backends/advanced/src/advanced_omi_backend/transcription.py (2)

408-410: Speaker segment text logged at INFO

These include transcript snippets. Downgrade to DEBUG.

-                                logger.info(f"🔍 DEBUG: Segment {i}: text='{seg.get('text', 'MISSING')}', speaker={seg.get('speaker', 'UNKNOWN')}")
+                                logger.debug("segment %d: speaker=%s", i, seg.get('speaker', 'UNKNOWN'))

451-453: absolute_timestamp origin likely incorrect

Using time.time() + start yields “now + offset”, not the audio’s real wall-clock time. Prefer session_start (from audio chunk timestamp).

-                        "absolute_timestamp": time.time() + segment.get("start", 0.0),
+                        "absolute_timestamp": (
+                            (await self.chunk_repo.get_chunk(self._current_audio_uuid)).get("timestamp", 0)
+                            + segment.get("start", 0.0)
+                        ),

Cache chunk once outside the loop.

backends/advanced/src/advanced_omi_backend/routers/modules/conversation_routes.py (1)

113-131: Route shape vs. docs

Routes use /{conversation_id}/activate-transcript/{version_id} and /activate-memory/{version_id}. Ensure docs and frontend align (frontend currently calls without version in path in some places per docs).

backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py (4)

75-76: Wrong user identifier property

Use user.user_id consistently; user.id may be None.

-        logger.info(f"Manually closed conversation for client {client_id} by user {user.id}")
+        logger.info(f"Manually closed conversation for client {client_id} by user {user.user_id}")

335-382: Segment updates ignore versioned source

update_transcript_segment only edits legacy transcript. Also update the active version’s segment using arrayFilters.

-        result = await chunks_col.update_one({"audio_uuid": audio_uuid}, {"$set": update_doc})
+        # Update legacy
+        result = await chunks_col.update_one({"audio_uuid": audio_uuid}, {"$set": update_doc})
+        # Update versioned path too
+        chunk = await chunks_col.find_one({"audio_uuid": audio_uuid}, {"active_transcript_version": 1})
+        active = chunk.get("active_transcript_version") if chunk else None
+        if active:
+            version_updates = {}
+            if speaker_id is not None:
+                version_updates[f"transcript_versions.$[v].segments.{segment_index}.speaker"] = speaker_id
+            if start_time is not None:
+                version_updates[f"transcript_versions.$[v].segments.{segment_index}.start"] = start_time
+            if end_time is not None:
+                version_updates[f"transcript_versions.$[v].segments.{segment_index}.end"] = end_time
+            if version_updates:
+                await chunks_col.update_one(
+                    {"audio_uuid": audio_uuid},
+                    {"$set": version_updates},
+                    array_filters=[{"v.version_id": active}],
+                )

568-575: Deterministic config_hash

Hashing str(dict) can vary if dict assembly changes. Use stable JSON with sort_keys.

-        config_hash = hashlib.sha256(str(config_data).encode()).hexdigest()[:16]
+        import json
+        config_hash = hashlib.sha256(json.dumps(config_data, sort_keys=True).encode()).hexdigest()[:16]

Apply to memory reprocess too.


700-703: Use logger.exception for tracebacks

Prefer logger.exception in broad excepts to retain stack traces.

-        logger.error(f"Error starting memory reprocessing: {e}")
+        logger.exception("Error starting memory reprocessing")

Apply similarly to other broad excepts.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 85139d9 and faf00e8.

📒 Files selected for processing (8)
  • CLAUDE.md (7 hunks)
  • backends/advanced/Docs/architecture.md (8 hunks)
  • backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py (7 hunks)
  • backends/advanced/src/advanced_omi_backend/database.py (8 hunks)
  • backends/advanced/src/advanced_omi_backend/routers/modules/conversation_routes.py (2 hunks)
  • backends/advanced/src/advanced_omi_backend/transcription.py (5 hunks)
  • backends/advanced/webui/src/pages/Conversations.tsx (7 hunks)
  • backends/advanced/webui/src/services/api.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • backends/advanced/webui/src/services/api.ts
🧰 Additional context used
🪛 Ruff (0.12.2)
backends/advanced/src/advanced_omi_backend/routers/modules/conversation_routes.py

97-97: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


106-106: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


117-117: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


127-127: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


135-135: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


143-143: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)

backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py

462-462: Do not catch blind exception: Exception

(BLE001)


611-611: Do not catch blind exception: Exception

(BLE001)


612-612: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


700-700: Do not catch blind exception: Exception

(BLE001)


701-701: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


735-735: Do not catch blind exception: Exception

(BLE001)


736-736: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


767-767: Do not catch blind exception: Exception

(BLE001)


768-768: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


790-790: Do not catch blind exception: Exception

(BLE001)


791-791: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

backends/advanced/src/advanced_omi_backend/database.py

171-171: f-string without any placeholders

Remove extraneous f prefix

(F541)


368-368: f-string without any placeholders

Remove extraneous f prefix

(F541)


369-369: f-string without any placeholders

Remove extraneous f prefix

(F541)


376-376: f-string without any placeholders

Remove extraneous f prefix

(F541)


547-547: PEP 484 prohibits implicit Optional

Convert to Optional[T]

(RUF013)


548-548: PEP 484 prohibits implicit Optional

Convert to Optional[T]

(RUF013)


549-549: PEP 484 prohibits implicit Optional

Convert to Optional[T]

(RUF013)


550-550: PEP 484 prohibits implicit Optional

Convert to Optional[T]

(RUF013)


579-579: PEP 484 prohibits implicit Optional

Convert to Optional[T]

(RUF013)


580-580: PEP 484 prohibits implicit Optional

Convert to Optional[T]

(RUF013)


691-691: PEP 484 prohibits implicit Optional

Convert to Optional[T]

(RUF013)


692-692: PEP 484 prohibits implicit Optional

Convert to Optional[T]

(RUF013)


726-726: PEP 484 prohibits implicit Optional

Convert to Optional[T]

(RUF013)


752-752: PEP 484 prohibits implicit Optional

Convert to Optional[T]

(RUF013)


753-753: PEP 484 prohibits implicit Optional

Convert to Optional[T]

(RUF013)

🪛 markdownlint-cli2 (0.17.2)
CLAUDE.md

212-212: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


751-751: Tables should be surrounded by blank lines

(MD058, blanks-around-tables)


945-945: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (11)
backends/advanced/src/advanced_omi_backend/database.py (3)

250-271: LGTM: version-aware reads with legacy fallback

Reading from active version then falling back to transcript is correct and safe.


603-637: Activation also needs legacy fields update (OK) — consider denormalization guard

This sets legacy transcript and speakers from version; good. Consider wrapping both fields under a single $set to avoid partial updates in failures (you already do).


687-710: Processing status on conversation: good, but mirror to audio_chunks if UI still reads it

If any UI still displays transcription_status from audio_chunks, consider syncing here or remove that UI dependency.

backends/advanced/webui/src/pages/Conversations.tsx (1)

565-567: Time display: avoid double conversion for short segments

formatDuration already pads seconds; showing start/end with toFixed(1) is fine. LGTM.

backends/advanced/src/advanced_omi_backend/transcription.py (1)

523-527: Status update path should be version-aware or chunk-level

You set chunk transcription_status; if moving fully to versioned conversations, also call conversations_repo.update_transcript_processing_status(conversation_id, status, provider).

backends/advanced/src/advanced_omi_backend/routers/modules/conversation_routes.py (1)

103-111: Query default “active” is a good DX affordance

Accepting transcript_version_id=active by default keeps UI simple. LGTM.

backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py (4)

400-418: Helpful debug, but avoid substring/regex queries in normal path

Debug-only alt queries are fine behind DEBUG guard; you already gate most. LGTM.


465-488: Delete audio from both possible roots

Reprocess uses two roots (/app/data/audio_chunks and /app/audio_chunks). Delete checks only one. Try both.
[suggest_minor_issue]

-        if audio_path:
+        if audio_path:
             try:
-                # Construct full path to audio file
-                full_audio_path = Path("/app/audio_chunks") / audio_path
-                if full_audio_path.exists():
-                    full_audio_path.unlink()
+                for root in (Path("/app/audio_chunks"), Path("/app/data/audio_chunks")):
+                    full_audio_path = root / audio_path
+                    if full_audio_path.exists():
+                        full_audio_path.unlink()
+                        deleted_files.append(str(full_audio_path))
-                    deleted_files.append(str(full_audio_path))
                     logger.info(f"Deleted audio file: {full_audio_path}")
             except Exception as e:
                 logger.warning(f"Failed to delete audio file {audio_path}: {e}")

Repeat for cropped path.


597-609: WIP: reprocess queue not wired

Endpoints create versions and runs but don’t enqueue processing. Gate the routes behind a feature flag or return “ACCEPTED” with a clear “not yet processed” note. Ensure UI handles PENDING.


518-526: Ownership checks and error codes look correct

Lean and consistent. LGTM.

backends/advanced/Docs/architecture.md (1)

646-747: Mermaid blocks are great; keep them labeled (they are).

No action; diagram matches narrative.

Comment on lines 283 to 291
provider_name = self.provider.name if self.provider else "unknown"
logger.info(f"🔍 DEBUG: transcript_result type={type(transcript_result)}, content preview: {str(transcript_result)[:200]}")
if self.chunk_repo:
logger.info(f"🔍 DEBUG: About to store raw transcript data for {self._current_audio_uuid}")
await self.chunk_repo.store_raw_transcript_data(
self._current_audio_uuid, transcript_result, provider_name
)
logger.info(f"🔍 DEBUG: Successfully stored raw transcript data for {self._current_audio_uuid}")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Transcripts in logs at INFO may leak PII

Logging raw transcript previews at INFO is a privacy risk. Use DEBUG and redact/shorten.

-            logger.info(f"🔍 DEBUG: transcript_result type={type(transcript_result)}, content preview: {str(transcript_result)[:200]}")
+            logger.debug("transcript_result type=%s, preview=%s", type(transcript_result), str(transcript_result)[:120])
@@
-                logger.info(f"🔍 DEBUG: About to store raw transcript data for {self._current_audio_uuid}")
+                logger.debug("About to store raw transcript data for %s", self._current_audio_uuid)
@@
-                logger.info(f"🔍 DEBUG: Successfully stored raw transcript data for {self._current_audio_uuid}")
+                logger.debug("Stored raw transcript data for %s", self._current_audio_uuid)

Also consider a config flag to disable raw storage in non-dev.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
provider_name = self.provider.name if self.provider else "unknown"
logger.info(f"🔍 DEBUG: transcript_result type={type(transcript_result)}, content preview: {str(transcript_result)[:200]}")
if self.chunk_repo:
logger.info(f"🔍 DEBUG: About to store raw transcript data for {self._current_audio_uuid}")
await self.chunk_repo.store_raw_transcript_data(
self._current_audio_uuid, transcript_result, provider_name
)
logger.info(f"🔍 DEBUG: Successfully stored raw transcript data for {self._current_audio_uuid}")
provider_name = self.provider.name if self.provider else "unknown"
logger.debug("transcript_result type=%s, preview=%s", type(transcript_result), str(transcript_result)[:120])
if self.chunk_repo:
logger.debug("About to store raw transcript data for %s", self._current_audio_uuid)
await self.chunk_repo.store_raw_transcript_data(
self._current_audio_uuid, transcript_result, provider_name
)
logger.debug("Stored raw transcript data for %s", self._current_audio_uuid)
🤖 Prompt for AI Agents
In backends/advanced/src/advanced_omi_backend/transcription.py around lines 283
to 291, change the INFO-level logs that print raw transcript previews to DEBUG
and redact/shorten the transcript content (e.g., show only a hashed id, length,
or first N characters with ellipsis) to avoid leaking PII; additionally, gate
the call to chunk_repo.store_raw_transcript_data behind a configuration flag
(e.g., STORE_RAW_TRANSCRIPTS or ENV check) so raw transcript storage is disabled
by default in non-development environments and only enabled when explicitly
configured.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant