Skip to content

Cleanup convos#158

Merged
AnkushMalaker merged 14 commits intoSimpleOpenSoftware:mainfrom
Ushadow-io:cleanup_convo
Nov 21, 2025
Merged

Cleanup convos#158
AnkushMalaker merged 14 commits intoSimpleOpenSoftware:mainfrom
Ushadow-io:cleanup_convo

Conversation

@thestumonkey
Copy link
Contributor

@thestumonkey thestumonkey commented Nov 17, 2025

This is a cleanup of conversations

  • Removed deprecated AudioChunksRepository, fully supporting beanie model
  • Fixed some issues around the transcript segment times being incorrect and so not playing properly
  • Fixed issue with delete conversation as was looking at audio_uuid and not conversation ID
  • Added some cleanup code to mark failed conversations as deleted
  • Added a detailed conversation summary
  • Changed timeouts to 24h as was timing out after 1h. This will need a more substantial upgrade in the future

Summary by CodeRabbit

Release Notes

  • New Features

    • Conversation versioning for transcripts and memories
    • Detailed conversation summaries (in addition to standard summaries)
    • Conversation deletion tracking with reasons and timestamps
    • Authenticated audio file serving endpoint
  • Improvements

    • Extended job processing timeouts from 1 to 24 hours
    • Enhanced speaker recognition with better segment handling
    • Improved conversation processing pipeline orchestration
    • Enhanced conversation UI with version management and deletion information

commit 610ed27
Author: Stu Alexandere <thestumonkey@gmail.com>
Date:   Sat Nov 15 09:37:32 2025 +0000

    Removed audiochunks and audio_col

commit dfa2a04
Author: Stu Alexandere <thestumonkey@gmail.com>
Date:   Fri Nov 14 14:29:22 2025 +0000

    use conversation_id on convos instead of audio_uuid
# Conflicts:
#	backends/advanced/webui/src/pages/Conversations.tsx
Added detailed summary, fixed delete conversation, added checking to mark erronous conversations as deleted
we were returning 2 sets of segments and transcripts from the conversation model. changed to draw from the active transcript
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 17, 2025

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

This PR removes the legacy AudioChunksRepository and migrates audio processing to a Conversation-centric model. It extends job timeouts from 1 hour to 24 hours, refactors post-conversation job orchestration to a new dependency chain (transcription → crop → speaker), adds versioned transcripts/memories with deletion tracking to the Conversation model, introduces speaker-aware summary generation utilities, and updates the web UI to display versioned transcripts and deletion metadata.

Changes

Cohort / File(s) Summary
Database & Model Layer
backends/advanced/src/advanced_omi_backend/database.py, models/conversation.py
Removes AudioChunksRepository class and all related audio-chunk methods; updates get_collections() to exclude chunks_col. Expands Conversation model with versioned transcript/memory fields, deletion tracking (deleted, deletion_reason, deleted_at), and adds properties for backward-compatible access to active transcript/segments; introduces detailed_summary field.
Core Controllers
controllers/audio_controller.py, controllers/conversation_controller.py, controllers/user_controller.py
Migrates audio/conversation operations from AudioChunksRepository/chunks_col to Conversation model. Updates delete_conversation signature from audio_uuid to conversation_id. Replaces chunk-based deletion with Conversation queries. Ownership checks now directly compare against conversation.user_id.
Queue & System Controllers
controllers/queue_controller.py, controllers/system_controller.py
Extends TTLs and timeouts from 1 hour/5 minutes to 24 hours (JOB_RESULT_TTL: 3600→86400, transcription/audio queue timeouts: 300/3600→86400). Refactors start_post_conversation_jobs to enforce transcription → cropping → speaker dependency chain. Deprecates processor manager methods to return empty results with warnings.
Application Config & Client
__init__.py, app_config.py, client.py, client_manager.py, controllers/websocket_controller.py
Removes AudioChunksRepository export from public API. Eliminates audio_chunks MongoDB collection from AppConfig. Strips ac_db_collection_helper parameter from ClientState.__init__. Removes MongoDB repository initialization from WebSocket client creation.
Job Workers
workers/audio_jobs.py, workers/conversation_jobs.py, workers/memory_jobs.py, workers/speaker_jobs.py, workers/transcription_jobs.py
Updates all job signatures to enforce redis_client as keyword-only parameter. Extends max_runtime from 59 minutes to 24 hours for speech detection and streaming jobs. Refactors process_cropping_job to update Conversation model directly instead of chunk repository. Adds conversation deletion path for no-meaningful-speech and audio-file-not-ready scenarios. Introduces generate_title_summary_job with parallel generation of title, short_summary, and detailed_summary.
Utilities
utils/audio_utils.py, utils/conversation_utils.py, memory/compat_service.py
Removes AudioChunksRepository type import and in-function database update logic from _process_audio_cropping_with_relative_timestamps (now handled by caller). Adds segment-aware variants: generate_title, generate_short_summary, generate_detailed_summary with speaker attribution support. Migrates memory enrichment from chunk queries to Conversation model queries.
Routes & API
routers/modules/audio_routes.py, routers/modules/conversation_routes.py
Adds serve_audio_file endpoint with authentication and Conversation-based ownership validation. Removes ClientManager dependency from close_current_conversation. Updates delete_conversation route from /{audio_uuid} to /{conversation_id}.
Frontend UI
webui/src/pages/Conversations.tsx, webui/src/pages/Queue.tsx
Introduces TranscriptVersion interface; replaces flat transcript/segments with transcript_versions array on Conversation. Removes direct transcript, segments, speakers_identified fields; adds detailed_summary, deleted, deletion_reason, deleted_at. Updates UI rendering to handle versioned transcripts per conversation, deleted-conversation warnings, and per-version segment playback. Adds 401 handling to disable auto-refresh in Queue.

Sequence Diagram

sequenceDiagram
    autonumber
    participant Client
    participant Job Queue
    participant Transcription
    participant Audio Cropping
    participant Speaker Recognition
    participant Conversation DB

    Client->>Job Queue: enqueue post-conversation jobs
    
    rect rgb(200, 220, 255)
    Note over Job Queue,Conversation DB: New Orchestration Flow<br/>(transcription → crop → speaker)
    end

    Job Queue->>Transcription: start transcribe_full_audio_job<br/>(conversation_id, audio_uuid)
    activate Transcription
    Transcription->>Conversation DB: update transcript_versions
    Transcription-->>Job Queue: return transcription result
    deactivate Transcription

    Job Queue->>Audio Cropping: start crop job<br/>(depends_on: transcription result)
    activate Audio Cropping
    Audio Cropping->>Audio Cropping: extract_speech_segments
    Audio Cropping->>Conversation DB: update cropped_audio_path<br/>& active_transcript.cropped_audio_path
    Audio Cropping-->>Job Queue: return crop result
    deactivate Audio Cropping

    Job Queue->>Speaker Recognition: start speaker_job<br/>(depends_on: crop result)
    activate Speaker Recognition
    Speaker Recognition->>Conversation DB: update segments with<br/>speaker identification
    Speaker Recognition-->>Job Queue: return speaker result
    deactivate Speaker Recognition

    Job Queue->>Job Queue: enqueue memory & title/summary jobs<br/>(parallel, depend on crop)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

  • Key areas requiring attention:
    • Job orchestration refactor (queue_controller.py): New transcription → cropping → speaker dependency chain with unconditional transcription path; verify dependency graph is correct and no races/deadlocks occur
    • Conversation model versioning (models/conversation.py): Validator logic for cleaning legacy transcript_versions, deletion-tracking state transitions, and backward-compatible property accessors
    • Timeout & TTL changes (queue_controller.py, workers/*): Rationale for 24-hour extension from 1 hour; confirm implications for job lifecycle and resource cleanup
    • Frontend Conversation interface (Conversations.tsx): Significant model restructuring with versioned transcripts; verify all state management, playback keying, and deletion UI flows work correctly
    • Title/summary parallel generation (conversation_jobs.py): Ensure asyncio.gather correctly propagates errors and that all three summaries are persisted atomically

Possibly related PRs

  • Better audio processing #111: Modifies _process_audio_cropping_with_relative_timestamps signature and AudioChunksRepository parameter handling—directly related to the same audio utility refactoring.
  • Queue combo #121: Comprehensive migration from AudioChunksRepository to Beanie-based Conversation/AudioSession models with RQ-based queuing—overlaps significantly across models, database, controllers, and workers.
  • feat: Adds "closing conversation", beefs up and fixes Queue management page  #144: Same core refactor removing AudioChunksRepository and migrating audio cropping, storage, and job flows to Conversation-centric model across identical files.

Poem

🐰 Chunks of data hop away,
Conversations now hold sway,
Versioned transcripts, neat and clean,
Twenty-four hours for jobs serene,
Speaker segments sing so true—
A brand new model, fresh and new!

Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)
Check name Status Explanation Resolution
Title check ❓ Inconclusive The title 'Cleanup convos' is vague and does not clearly convey the scope and significance of the changes, which involve major architectural refactoring from AudioChunksRepository to Beanie model. Consider a more descriptive title such as 'Migrate from AudioChunksRepository to Beanie model and fix conversation deletion' or 'Refactor audio data model and fix conversation management' to better reflect the primary changes.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 97.96% which is sufficient. The required threshold is 80.00%.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@thestumonkey
Copy link
Contributor Author

@coderabbitai

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 17, 2025

Tip

For best results, initiate chat on the files or code changes.

@thestumonkey: Hello! How can I assist you with this PR? I'm here to help with code reviews, answer questions, generate tests, or provide any other assistance you need.

@thestumonkey
Copy link
Contributor Author

@coderabbitai review

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 17, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
backends/advanced/src/advanced_omi_backend/client_manager.py (1)

107-121: Fix broken create_client signature (param order mismatch with callsites).

create_client still takes ac_repository but it's no longer used; websocket_controller passes 4 args, which currently bind CHUNK_DIR→ac_repository, user_id→chunk_dir, user_email→user_id. This will create invalid ClientState state at runtime.

Proposed fix: drop ac_repository from the signature and docs; keep callsites unchanged or use keyword args for safety.

-    def create_client(self, client_id: str, ac_repository, chunk_dir, user_id: str, user_email: Optional[str] = None) -> "ClientState":
+    def create_client(
+        self,
+        client_id: str,
+        chunk_dir,
+        user_id: str,
+        user_email: Optional[str] = None,
+    ) -> "ClientState":
@@
-            ac_repository: Audio chunks repository
             chunk_dir: Directory for audio chunks
             user_id: User ID who owns this client
             user_email: Optional user email
@@
-        client_state = ClientState(client_id, chunk_dir, user_id, user_email)
+        client_state = ClientState(client_id, chunk_dir, user_id, user_email)

Optionally, annotate chunk_dir as Path | str.

Also applies to: 134-141

backends/advanced/src/advanced_omi_backend/models/conversation.py (1)

295-311: Factory still sets removed fields (transcript/segments/memories); also uses local time.

The model removed legacy fields, but create_conversation still populates transcript, segments, memories, memory_count. Depending on Pydantic/Beanie config, these extras may be ignored or raise, and you lose initial transcript. Also created_at uses local time.

Fix: drop removed keys, use UTC, and when transcript/segments are provided, create an initial TranscriptVersion and set it active.

@@ def create_conversation(
-    # Build the conversation data
-    conv_data = {
-        "audio_uuid": audio_uuid,
-        "user_id": user_id,
-        "client_id": client_id,
-        "created_at": datetime.now(),
-        "title": title,
-        "summary": summary,
-        "transcript": transcript or "",
-        "segments": segments or [],
-        "transcript_versions": [],
-        "active_transcript_version": None,
-        "memory_versions": [],
-        "active_memory_version": None,
-        "memories": [],
-        "memory_count": 0
-    }
+    # Build the conversation data (no legacy fields; UTC timestamps)
+    conv_data = {
+        "audio_uuid": audio_uuid,
+        "user_id": user_id,
+        "client_id": client_id,
+        "created_at": datetime.utcnow(),
+        "title": title,
+        "summary": summary,
+        "transcript_versions": [],
+        "active_transcript_version": None,
+        "memory_versions": [],
+        "active_memory_version": None,
+    }
@@
-    return Conversation(**conv_data)
+    conv = Conversation(**conv_data)
+    # Seed an initial transcript version if provided
+    if transcript or segments:
+        conv.add_transcript_version(
+            version_id=str(uuid.uuid4()),
+            transcript=transcript or "",
+            segments=segments or [],
+            provider=Conversation.TranscriptProvider.UNKNOWN,
+            model=None,
+            processing_time_seconds=None,
+            metadata={"seeded": True},
+            set_as_active=True,
+        )
+    return conv
backends/advanced/src/advanced_omi_backend/workers/memory_jobs.py (1)

129-143: Writes to removed Conversation.memories field; update to versioned storage.

Conversation no longer has a memories array. Persist results as a MemoryVersion and carry memory_ids in metadata.

-            if conversation_model:
-                memory_refs = [
-                    {"memory_id": mid, "created_at": datetime.now(UTC).isoformat(), "status": "created"}
-                    for mid in created_memory_ids
-                ]
-                conversation_model.memories.extend(memory_refs)
-                await conversation_model.save()
+            if conversation_model:
+                version_id = f"mem-{conversation_id[:8]}-{int(time.time())}"
+                conversation_model.add_memory_version(
+                    version_id=version_id,
+                    memory_count=len(created_memory_ids),
+                    transcript_version_id=conversation_model.active_transcript_version or "",
+                    provider=Conversation.MemoryProvider.FRIEND_LITE,
+                    model=None,
+                    processing_time_seconds=processing_time,
+                    metadata={"memory_ids": created_memory_ids},
+                    set_as_active=True,
+                )
+                await conversation_model.save()
backends/advanced/src/advanced_omi_backend/workers/transcription_jobs.py (1)

116-125: Update callers for the new keyword‑only redis_client and adjusted signature

Making redis_client keyword‑only for both jobs and bumping max_runtime to ~24h is reasonable and lines up with the longer streaming timeouts.

However, with the new transcribe_full_audio_job signature:

async def transcribe_full_audio_job(
    conversation_id: str,
    audio_uuid: str,
    audio_path: str,
    version_id: str,
    trigger: str = "reprocess",
    *,
    redis_client=None,
)

the reprocess_transcript caller in conversation_controller.py still passes an extra positional argument (str(user.user_id)) before "reprocess", which will now (a) misassign trigger and (b) raise TypeError: ... takes 5 positional arguments but 6 were given when the job runs.

Suggested fix in reprocess_transcript (file: conversation_controller.py), adjusting only the enqueue call:

-        transcript_job = transcription_queue.enqueue(
-            transcribe_full_audio_job,
-            conversation_id,
-            audio_uuid,
-            str(full_audio_path),
-            version_id,
-            str(user.user_id),
-            "reprocess",
+        transcript_job = transcription_queue.enqueue(
+            transcribe_full_audio_job,
+            conversation_id,
+            audio_uuid,
+            str(full_audio_path),
+            version_id,
+            "reprocess",
             job_timeout=600,
             result_ttl=JOB_RESULT_TTL,
             job_id=f"reprocess_{conversation_id[:8]}",
             description=f"Transcribe audio for {conversation_id[:8]}",
             meta={'audio_uuid': audio_uuid, 'conversation_id': conversation_id}
         )

This preserves the new signature and avoids runtime failures when reprocessing transcripts.

Also applies to: 335-341, 374-375

🧹 Nitpick comments (17)
backends/advanced/webui/src/pages/Queue.tsx (1)

292-300: Good 401 auto‑refresh guard; tighten error typing.

Keep the 401 stop. Prefer unknown + axios guard to avoid any, and optionally centralize auth handling.

-    } catch (error: any) {
+    } catch (error: unknown) {
       console.error('❌ Error fetching dashboard data:', error);
-      // If it's a 401 error, stop auto-refresh to prevent repeated failed requests
-      if (error?.response?.status === 401) {
+      // If it's a 401 error, stop auto-refresh to prevent repeated failed requests
+      if ((error as any)?.response?.status === 401) {
         console.warn('🔐 Authentication error detected - disabling auto-refresh');
         setAutoRefreshEnabled(false);
       }
backends/advanced/src/advanced_omi_backend/models/conversation.py (2)

194-204: Use UTC for version timestamps.

Use datetime.utcnow() (or timezone-aware) for created_at to keep consistency with created_at field.

-            created_at=datetime.now(),
+            created_at=datetime.utcnow(),
@@
-            created_at=datetime.now(),
+            created_at=datetime.utcnow(),

Also applies to: 224-234


71-80: Consider indexing audio_uuid and persisting status.

  • audio_uuid is used for lookups; ensure there’s an explicit index (it’s Indexed already; adding to Settings.indexes can help compound patterns).
  • ConversationStatus enum exists but isn’t persisted; adding a status: ConversationStatus field improves introspection.

Also applies to: 258-266

backends/advanced/src/advanced_omi_backend/__init__.py (1)

5-5: Public API change: all now empty.

If external code imported AudioChunksRepository from this package, this is a breaking change. Consider a deprecation shim or release notes.

backends/advanced/src/advanced_omi_backend/workers/memory_jobs.py (1)

98-116: Avoid duplicate user fetch.

You fetch user twice; reuse the first result.

backends/advanced/src/advanced_omi_backend/memory/compat_service.py (1)

285-291: Use logging.exception to capture traceback on import failure.

Improve observability on model import errors.

-        except ImportError:
-            memory_logger.error("Cannot import Conversation model")
+        except ImportError:
+            memory_logger.exception("Cannot import Conversation model")

Based on static analysis hints.

backends/advanced/src/advanced_omi_backend/routers/modules/audio_routes.py (1)

41-93: Tighten filename handling and move inline import to module scope

The ownership check and 404/403 behavior look good, but a couple of adjustments would make this endpoint more robust and consistent:

  • Import Conversation at the top of the module rather than inside the function, unless you’re explicitly working around a circular import. If a lazy import is required, consider adding a brief comment explaining why.
  • Derive audio_uuid using Path and explicit prefix handling instead of chained replace, and be defensive against unexpected extensions. This avoids accidentally stripping "cropped_" or ".wav" from the middle of a filename.

For example:

-from advanced_omi_backend.models.conversation import Conversation
-
-# Remove "cropped_" prefix if present
-audio_uuid = filename.replace("cropped_", "").replace(".wav", "")
+from advanced_omi_backend.models.conversation import Conversation
+from pathlib import Path
+
+# Remove "cropped_" prefix if present, and ignore extension
+name = Path(filename).stem
+if name.startswith("cropped_"):
+    name = name[len("cropped_"):]
+audio_uuid = name

If there’s any chance filename could include subdirectories in the future, also assert that file_path.resolve() stays under audio_dir.resolve() before serving.
As per coding guidelines

backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py (1)

171-263: Deletion flow looks correct; consider shared audio path helper and richer error logging

The new conversation‑centric deletion flow (ownership check, Beanie Conversation delete, legacy AudioFile cleanup, and disk file removal) looks correct and fixes the prior audio_uuid‑based behavior.

Two improvements to consider:

  • Reuse a central audio directory helper: you’re hard‑coding Path("/app/audio_chunks") here while the new audio route uses get_audio_chunk_dir(). Using the same helper in both places would avoid config drift if the base directory changes.

  • Improve exception logging: in the broad except Exception block, prefer logger.exception(...) so stack traces are captured, and consider narrowing the exception type if you can distinguish expected I/O issues from truly unexpected failures.

Example tweak:

-        # Delete associated audio files from disk
+        # Delete associated audio files from disk
         deleted_files = []
         if audio_path:
             try:
-                # Construct full path to audio file
-                full_audio_path = Path("/app/audio_chunks") / audio_path
+                from advanced_omi_backend.app_config import get_audio_chunk_dir
+                full_audio_path = get_audio_chunk_dir() / audio_path
@@
-    except Exception as e:
-        logger.error(f"Error deleting conversation {conversation_id}: {e}")
+    except Exception as e:
+        logger.exception(f"Error deleting conversation {conversation_id}: {e}")

Same pattern can be applied to cropped_audio_path.
As per coding guidelines

backends/advanced/src/advanced_omi_backend/controllers/queue_controller.py (2)

42-48: 24‑hour TTLs and timeouts are consistent but double‑check Redis/worker capacity

Bumping JOB_RESULT_TTL and the transcription/audio queue default_timeout to 24 hours, along with job_timeout=86400 in start_streaming_jobs, lines up nicely with the max_runtime change in stream_speech_detection_job and the goal of all‑day sessions.

This does mean long‑lived jobs and results will occupy Redis memory and worker slots for much longer than before. Make sure:

  • Worker pool sizing assumes a few 24‑hour jobs per user/session.
  • Redis memory/eviction policy is configured with these larger TTLs in mind.

Functionally the changes look correct; this is mainly operational risk to validate in staging/production.

Also applies to: 268-303


324-452: Clarify post_transcription semantics and dependency chain in start_post_conversation_jobs

The new post‑conversation chain (transcription → crop → speaker → memory/title) looks coherent and should help fix the timestamp issues you mentioned. A couple of cleanups will reduce confusion:

  • post_transcription is no longer used to gate whether the batch transcription job runs; the code always enqueues it, but the docstring still describes step 1 as “[Optional]”. Either remove the parameter and update the docstring to say transcription is always run, or reintroduce the conditional and wire crop_depends_on to depends_on_job when skipping transcription.
  • Given transcription is now always enqueued, the return value can simply use transcription_job.id rather than transcription_job.id if transcription_job else None.

These are small but will make the API behavior clearer to future callers.

backends/advanced/src/advanced_omi_backend/controllers/audio_controller.py (1)

196-205: Conversation-based cropping info is good; consider dropping hasattr guards

Switching get_cropped_audio_info to look up by Conversation.audio_uuid and check ownership with conversation.user_id is a clear improvement and correctly uses 404 for unauthorized access.

The hasattr checks on speech_segments, cropped_duration, and cropped_at suggest uncertainty about the Conversation schema. If these fields are part of the Beanie model with sensible defaults, you can return them directly (possibly None/[]) and avoid attribute introspection, which keeps the code aligned with your data model instead of defensive checks.

Also applies to: 207-215, 219-224

backends/advanced/src/advanced_omi_backend/workers/speaker_jobs.py (1)

103-112: Empty speaker segments filtering is good; consider aligning metadata with filtered segments

The new logic that strips seg["text"], skips empty segments, and logs empty_segment_count before building Conversation.SpeakerSegment instances is a solid improvement and should remove junk/blank segments from the transcript version.

Right now, identified_speakers and total_segments in transcript_version.metadata["speaker_recognition"] are still computed from the original speaker_segments. For clearer downstream semantics, consider deriving both the unique speakers and total_segments from updated_segments so metadata describes exactly what was persisted to the transcript version.

Also applies to: 224-248, 250-270, 276-284

backends/advanced/src/advanced_omi_backend/utils/conversation_utils.py (1)

137-188: Segment-aware title/summary utilities look solid; minor reuse/type-shape considerations

The new generate_title, generate_short_summary, and generate_detailed_summary functions nicely reuse segment text (and speakers where present) to build more informative prompts, while the aliases (generate_summary, generate_title_with_speakers, generate_summary_with_speakers) preserve the old API surface. Fallback behavior for both short and detailed summaries is reasonable (short text → “No content”; errors → truncated/cleaned transcript).

Two small follow-ups you might consider:

  • The “format segments into conversation_text + include_speakers flag” logic is duplicated between the short and detailed summary paths; extracting a helper would reduce repetition and keep behavior in sync.
  • All segment-based functions assume segments are dict-like (segment.get(...)). If you ever plan to pass Conversation.SpeakerSegment model instances directly, it may be worth either accepting both dicts and models (attribute + dict access) or normalizing them before calling these helpers.

Also applies to: 190-260, 262-343, 345-369

backends/advanced/src/advanced_omi_backend/workers/conversation_jobs.py (1)

466-471: Title/summary job changes look solid; address unused redis_client and redundant import

The new generate_title_summary_job flow looks good overall:

  • Uses transcript + segments from the conversation’s active transcript version.
  • Guards against missing transcript/segments with a clear early return.
  • Converts segment objects into plain dicts for the utility functions.
  • Runs generate_title, generate_short_summary, and generate_detailed_summary in parallel with asyncio.gather for efficiency.
  • Persists title, summary, and detailed_summary back to the conversation and exposes detailed_summary_length in job metadata.

A few small cleanups:

  1. Unused redis_client argument (Ruff ARG001)

    • redis_client is accepted but never referenced in this function. If it’s here purely for a consistent async_job signature, consider:
      • Renaming to _redis_client to signal intentional non‑use, or
      • Using it for something trivial (e.g., logging Redis connection info) to silence the linter.
    • Otherwise Ruff’s ARG001 will persist.
  2. Redundant inner asyncio import

    • import asyncio at line 7 already exists; the additional import asyncio inside the function (line 535) is redundant. You can safely remove the inner import and keep the top‑level one:
  •    # Generate all three summaries in parallel for efficiency
    
  •    import asyncio
    
  •    # Generate all three summaries in parallel for efficiency
       title, short_summary, detailed_summary = await asyncio.gather(
    
    
    
    
  1. Minor robustness suggestion
    • You already guard detailed_summary_length with a conditional; if there’s any chance the utility functions could return non‑string values, you might want to coerce to str before using len and persisting, but if they are guaranteed to return strings by contract, this is fine as‑is.

Also applies to: 505-515, 517-540, 543-548, 574-575, 587-588

backends/advanced/webui/src/pages/Conversations.tsx (3)

6-29: Conversation/TranscriptVersion types align; consider tightening memory_versions typing

The extended Conversation and new TranscriptVersion interfaces match the backend description (versioned transcripts, metadata, and deletion fields) and should give the UI enough structure for the new flows.

Only small nit: memory_versions?: any[] could be narrowed (e.g., a specific MemoryVersion interface) once the backend shape stabilizes. This would improve type safety around the version header and debug info.

Also applies to: 31-53


275-333: Segment playback logic is correct; update state comment for clarity

The per‑segment playback logic looks good:

  • One HTMLAudioElement per conversation, stored in audioRefs.current[conversationId].
  • playingSegment encodes {conversationKey}-{segmentIndex}, matching both how it’s set in handleSegmentPlayPause and how isPlaying is computed in the transcript render.
  • You correctly stop any currently playing segment (pause audio + clear timer) before starting a new one, and you stop playback at segment.end using a timeout based on (end - start) * 1000.

Minor nit: the comment near playingSegment still says the format is "audioUuid-segmentIndex", but you’re now using a conversation‑scoped key (conversation_id || audio_uuid). Updating that comment will avoid confusion later.

Also applies to: 676-687, 695-713


503-504: Guard against missing created_at to avoid “Invalid Date” display

formatDate(conversation.created_at || '') will pass an empty string when created_at is missing. That ends up as new Date('Z').toLocaleString() in formatDate, which can render as “Invalid Date” in some browsers.

Consider falling back earlier, for example:

- <span>{formatDate(conversation.created_at || '')}</span>
+ <span>
+   {conversation.created_at ? formatDate(conversation.created_at) : 'Unknown date'}
+ </span>

This keeps the UI clean when older or partial conversations don’t have a timestamp.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 14f5242 and 59a613b.

📒 Files selected for processing (24)
  • backends/advanced/src/advanced_omi_backend/__init__.py (1 hunks)
  • backends/advanced/src/advanced_omi_backend/app_config.py (0 hunks)
  • backends/advanced/src/advanced_omi_backend/client.py (0 hunks)
  • backends/advanced/src/advanced_omi_backend/client_manager.py (1 hunks)
  • backends/advanced/src/advanced_omi_backend/controllers/audio_controller.py (7 hunks)
  • backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py (5 hunks)
  • backends/advanced/src/advanced_omi_backend/controllers/queue_controller.py (7 hunks)
  • backends/advanced/src/advanced_omi_backend/controllers/system_controller.py (1 hunks)
  • backends/advanced/src/advanced_omi_backend/controllers/user_controller.py (2 hunks)
  • backends/advanced/src/advanced_omi_backend/controllers/websocket_controller.py (1 hunks)
  • backends/advanced/src/advanced_omi_backend/database.py (1 hunks)
  • backends/advanced/src/advanced_omi_backend/memory/compat_service.py (2 hunks)
  • backends/advanced/src/advanced_omi_backend/models/conversation.py (5 hunks)
  • backends/advanced/src/advanced_omi_backend/routers/modules/audio_routes.py (2 hunks)
  • backends/advanced/src/advanced_omi_backend/routers/modules/conversation_routes.py (2 hunks)
  • backends/advanced/src/advanced_omi_backend/utils/audio_utils.py (1 hunks)
  • backends/advanced/src/advanced_omi_backend/utils/conversation_utils.py (4 hunks)
  • backends/advanced/src/advanced_omi_backend/workers/audio_jobs.py (8 hunks)
  • backends/advanced/src/advanced_omi_backend/workers/conversation_jobs.py (11 hunks)
  • backends/advanced/src/advanced_omi_backend/workers/memory_jobs.py (2 hunks)
  • backends/advanced/src/advanced_omi_backend/workers/speaker_jobs.py (3 hunks)
  • backends/advanced/src/advanced_omi_backend/workers/transcription_jobs.py (3 hunks)
  • backends/advanced/webui/src/pages/Conversations.tsx (16 hunks)
  • backends/advanced/webui/src/pages/Queue.tsx (1 hunks)
💤 Files with no reviewable changes (2)
  • backends/advanced/src/advanced_omi_backend/app_config.py
  • backends/advanced/src/advanced_omi_backend/client.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Format Python code with Black using a 100-character line length
Use isort to organize Python imports
Place all imports at the top of the Python file after the docstring; never import modules in the middle of functions or files
Use lazy imports only when absolutely necessary to resolve circular import issues
Group Python imports by: standard library, third-party, then local imports
Always raise errors rather than silently ignoring failures; use explicit exceptions
Understand data structures instead of adding defensive hasattr checks; prefer correct models/parsing over ad-hoc guards

Files:

  • backends/advanced/src/advanced_omi_backend/controllers/user_controller.py
  • backends/advanced/src/advanced_omi_backend/workers/memory_jobs.py
  • backends/advanced/src/advanced_omi_backend/routers/modules/audio_routes.py
  • backends/advanced/src/advanced_omi_backend/controllers/queue_controller.py
  • backends/advanced/src/advanced_omi_backend/database.py
  • backends/advanced/src/advanced_omi_backend/controllers/audio_controller.py
  • backends/advanced/src/advanced_omi_backend/controllers/system_controller.py
  • backends/advanced/src/advanced_omi_backend/__init__.py
  • backends/advanced/src/advanced_omi_backend/memory/compat_service.py
  • backends/advanced/src/advanced_omi_backend/workers/speaker_jobs.py
  • backends/advanced/src/advanced_omi_backend/routers/modules/conversation_routes.py
  • backends/advanced/src/advanced_omi_backend/client_manager.py
  • backends/advanced/src/advanced_omi_backend/utils/audio_utils.py
  • backends/advanced/src/advanced_omi_backend/utils/conversation_utils.py
  • backends/advanced/src/advanced_omi_backend/models/conversation.py
  • backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py
  • backends/advanced/src/advanced_omi_backend/workers/conversation_jobs.py
  • backends/advanced/src/advanced_omi_backend/controllers/websocket_controller.py
  • backends/advanced/src/advanced_omi_backend/workers/transcription_jobs.py
  • backends/advanced/src/advanced_omi_backend/workers/audio_jobs.py
🪛 Ruff (0.14.5)
backends/advanced/src/advanced_omi_backend/routers/modules/audio_routes.py

44-44: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)

backends/advanced/src/advanced_omi_backend/memory/compat_service.py

288-288: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

backends/advanced/src/advanced_omi_backend/routers/modules/conversation_routes.py

111-111: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)

backends/advanced/src/advanced_omi_backend/utils/conversation_utils.py

337-337: Do not catch blind exception: Exception

(BLE001)

backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py

108-108: Unnecessary key check before dictionary access

Replace with dict.get

(RUF019)


258-258: Do not catch blind exception: Exception

(BLE001)


259-259: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

backends/advanced/src/advanced_omi_backend/workers/conversation_jobs.py

282-282: f-string without any placeholders

Remove extraneous f prefix

(F541)


293-293: f-string without any placeholders

Remove extraneous f prefix

(F541)


470-470: Unused function argument: redis_client

(ARG001)

🔇 Additional comments (21)
backends/advanced/src/advanced_omi_backend/models/conversation.py (2)

84-88: Deletion metadata addition looks good.

Fields deleted, deletion_reason, deleted_at align with cleanup objectives. No issues.


91-93: Detailed summary field is a useful addition.

Matches UI needs for a richer summary.

backends/advanced/src/advanced_omi_backend/workers/memory_jobs.py (1)

21-26: Signature/enqueue alignment looks good.

Kw‑only redis_client enforces correct enqueues; enqueue passes only conversation_id.

Also applies to: 210-217

backends/advanced/src/advanced_omi_backend/memory/compat_service.py (1)

335-343: Segment text access: guard against empty/None.

You already check transcript_segments; also guard falsy segment.text when joining (safe).

backends/advanced/src/advanced_omi_backend/controllers/user_controller.py (1)

178-181: Current implementation is correct; delete_many() is optional.

The code correctly uses await Conversation.find(...).delete(), which returns a pymongo DeleteResult exposing .deleted_count. Both .delete() and .delete_many() are equivalent when awaited. Using delete_many() would be slightly more explicit about bulk deletion but is not necessary.

backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py (2)

99-119: Conversation payload shaping for detail view looks solid

Using model_dump(mode="json", exclude={"id"}) and trimming heavy metadata.words plus redundant speaker_recognition counters is a clean way to keep the response lean while preserving useful metadata. Adding has_memory as a computed field is also helpful for the UI.


140-160: List view conversation projection is appropriate

Excluding transcript_versions/memory_versions for the list endpoint and computing segment_count, has_memory, and version counts off the Beanie model keeps the list response lightweight without losing key summary information.

backends/advanced/src/advanced_omi_backend/database.py (1)

30-44: Database module simplification looks good

Documenting that conversations are now Beanie‑managed and limiting get_collections() to users_col keeps this module focused and aligns with the removal of the legacy AudioChunksRepository. No issues here.

backends/advanced/src/advanced_omi_backend/workers/audio_jobs.py (1)

155-157: 24h runtime/TTL adjustments look consistent

audio_streaming_persistence_job now uses a kw‑only redis_client, a max runtime of 86340 seconds, and a Redis key TTL of 86400 seconds for audio file paths. This keeps the job’s loop within the 24h window with a small grace period and aligns the Redis expiry with the intended retention.

Also applies to: 199-199, 318-318

backends/advanced/src/advanced_omi_backend/controllers/audio_controller.py (1)

7-7: Using Conversation and audio_path during upload is consistent with the new model

Creating the Conversation up front and setting conversation.audio_path = wav_filename before insert() ties uploads, playback, and later cropping/reprocessing together cleanly around the Conversation model. The updated module docstring and Conversation import match that direction.

Also applies to: 26-27, 110-121

backends/advanced/src/advanced_omi_backend/routers/modules/conversation_routes.py (2)

21-28: Close-conversation route now matches controller API

The updated docstring and call to conversation_controller.close_current_conversation(client_id, current_user) reflect the simplified controller signature and keep the route behavior clear.


109-114: Delete by conversation_id instead of audio_uuid is the right fix

Switching the DELETE route to /{conversation_id} and forwarding conversation_id to conversation_controller.delete_conversation aligns the API with Conversation-centric deletion and fixes the earlier mismatch described in the PR.

backends/advanced/src/advanced_omi_backend/utils/audio_utils.py (1)

266-275: Decoupling cropping from the old repository is clean and backward-compatible

Renaming the last parameter to _deprecated_chunk_repo and updating the docstring to say persistence is handled by the caller removes the tight coupling to AudioChunksRepository while preserving the call signature for existing callers. The internal behavior remains focused on pure cropping, which fits the new Conversation-centric flow.

backends/advanced/src/advanced_omi_backend/workers/speaker_jobs.py (1)

19-25: Kw-only redis_client matches other workers

Adding *, redis_client=None to check_enrolled_speakers_job keeps the signature consistent with other @async_job(redis=True, beanie=True) tasks and makes the Redis dependency explicit without changing behavior.

backends/advanced/src/advanced_omi_backend/workers/conversation_jobs.py (4)

24-26: Kw‑only redis_client in job signature looks good

Making redis_client keyword‑only and documenting it explicitly keeps the async_job API clearer and prevents positional call mistakes. As long as the @async_job decorator injects it by keyword (which seems to be the convention in this PR), this is a safe change.

Also applies to: 37-38


68-75: Early metadata update on current_job is correct and helpful

Storing conversation_id on current_job.meta right after creation (with a defensive meta init) is a good improvement for queue grouping and debugging. This block looks correct and side‑effect free.


115-121: TTL and runtime extensions are consistent with 24‑hour sessions

Updating the Redis TTLs to 86400s for conversation:session:*, conversation:current:*, and conversation:last_speech:* plus extending max_runtime to ~3 hours matches the PR’s 24‑hour timeout goal and still leaves headroom vs. the 1‑hour audio:session TTL later in the function. No functional issues here.

Also applies to: 128-129, 191-192


244-252: Progress logging and loop cadence look reasonable

The new progress log (results/char/segment counts) and 1‑second sleep give good observability without being too noisy or slow. This section is fine as written.

backends/advanced/webui/src/pages/Conversations.tsx (3)

200-225: Delete flow and deleted‑conversation UI look correct

  • handleDeleteConversation tracks in‑flight deletions via a Set, disables the delete action while pending, and refreshes conversations on success. That prevents duplicate requests and keeps state in sync.
  • The dropdown button correctly gates delete on conversation.conversation_id and shows an “ID missing” hint for legacy conversations.
  • The deleted‑conversation warning block (reason mapping and deleted_at display) gives users a clear explanation for failure states (no_meaningful_speech, audio_file_not_ready, or fallback).

This all looks sound and matches the backend’s new soft‑delete semantics.

Also applies to: 566-578, 412-441


245-273: Transcript expansion and lazy loading of versions are wired correctly

  • toggleTranscriptExpansion:
    • Uses conversation.conversation_id as the expansion key and exits early if missing, which is appropriate for old-format conversations.
    • Checks whether the active transcript already has segments before round‑tripping to getById, avoiding unnecessary API calls.
    • On fetch, it merges transcript_versions into just that conversation in state, which is good for scroll stability.
  • The transcript header uses the active version’s segments.length with a fallback to segment_count from the list endpoint and toggles chevrons based on expandedTranscripts.

Behaviorally this looks correct and should be efficient with many conversations on the page.

Also applies to: 626-645, 648-652


489-497: Detailed summary and speaker metadata rendering are well‑integrated

  • The detailed summary block is clearly separated from the short summary, uses whitespace-pre-wrap, and visually highlights that it’s a richer summary.
  • The “Identified Speakers” section correctly pulls from activeTranscript.metadata.speaker_recognition.identified_speakers and renders pills; this lines up with the backend’s new speaker metadata.
  • Debug info now surfaces active version IDs and counts for transcripts/memories, which is useful for diagnosing version issues.

These UI additions are consistent with the new data model and look good.

Also applies to: 745-767, 778-781

Copy link
Collaborator

@AnkushMalaker AnkushMalaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I like the detailed summary, although it doesn't seem to have a limit, in which case when the AI hallucinates a 2 page summary, it will fill up the page. Could you add an issue just in case, or keep this in mind. This PR is good though, so I'm merging this.

@AnkushMalaker AnkushMalaker merged commit c75e9eb into SimpleOpenSoftware:main Nov 21, 2025
1 of 2 checks passed
thestumonkey pushed a commit to Ushadow-io/chronicle that referenced this pull request Nov 23, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants