Conversation
Also beefed up the queue management page
- Introduced a comprehensive 'Getting Started' guide in `Docs/getting-started.md` to assist new users with setup and usage of the Friend-Lite backend.
- Updated initialization script references in `Docs/init-system.md` to reflect the new `wizard.py` instead of `init.py`.
- Added new local OMI Bluetooth connection scripts in `extras/local-omi-bt/` for improved audio streaming capabilities.
- Removed the deprecated `connect-laptop.py` script and replaced it with `connect-omi.py` for better functionality.
- Updated dependencies in `requirements.txt` to include new libraries for enhanced audio processing and backend communication.
- Added a new `start.sh` script for easier execution of the local OMI connection process.
@coderabbitai review
## Walkthrough

This PR executes a major architectural refactoring: replacing nginx with a Caddyfile for reverse proxying, removing the ConversationManager class, simplifying job signatures to derive user context from the database, introducing per-conversation audio streaming with persistence, adding inactivity timeout logic, and expanding infrastructure support (BLE OMI device integration, speaker recognition setup). Setup tooling migrates to python-dotenv and wizard.py-centric configuration.
## Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant WebSocket as WebSocket<br/>Controller
    participant Speech as Speech Detection<br/>Job
    participant Open as Open Conversation<br/>Job
    participant Transcribe as Transcribe<br/>Job
    participant Recognize as Recognize<br/>Speakers Job
    participant Crop as Audio Cropping<br/>Job
    participant Memory as Memory<br/>Job
    participant Persist as Audio Streaming<br/>Persistence Job
    Client->>WebSocket: Connect (OMI/PCM)
    WebSocket->>Speech: Enqueue stream_speech_detection_job<br/>(session_id, user_id, client_id)
    Note over Speech: Listen for meaningful speech
    Speech->>Speech: Detect speech via aggregator
    alt Speech Detected
        Speech->>Open: Enqueue open_conversation_job<br/>(session_id, speech_job_id)
        Open->>Transcribe: Create conversation, enqueue transcribe
        Open->>Persist: Start per-conversation audio<br/>file rotation
        Transcribe->>Recognize: Enqueue speaker recognition<br/>(conversation_id, audio_path)
        Recognize->>Crop: Enqueue audio cropping<br/>(conversation_id, audio_path)
        Crop->>Memory: Enqueue memory extraction<br/>(conversation_id) — derives user_id, user_email<br/>from Conversation model
        Memory->>Memory: Extract & store memories
    else No Speech / Timeout
        Speech->>Speech: Log no_speech_detected or<br/>inactivity_timeout
    end
    Persist->>Persist: Rotate file per conversation<br/>Mark session complete on disconnect
    WebSocket->>WebSocket: cleanup_client_state:<br/>cancel speech job, mark sessions<br/>complete in Redis
```
```mermaid
sequenceDiagram
    participant Admin as Admin/User
    participant WebUI
    participant API as Queue Routes<br/>API
    participant Dashboard as Dashboard<br/>Service
    WebUI->>API: GET /queue/dashboard?expanded_sessions=...
    par Dashboard Aggregation
        API->>Dashboard: Fetch queued jobs (all queues)
        API->>Dashboard: Fetch processing jobs
        API->>Dashboard: Fetch completed jobs
        API->>Dashboard: Fetch failed jobs
        API->>Dashboard: Fetch session details<br/>(with conversation_count)
        API->>Dashboard: Fetch jobs per expanded session
    end
    Dashboard-->>API: Aggregated job stats, queue stats,<br/>streaming status, session details
    API-->>WebUI: Consolidated dashboard data
    WebUI-->>Admin: Render job queue status,<br/>session overview, stats
    alt User Cancels Job
        Admin->>WebUI: Click cancel on job
        WebUI->>API: DELETE /queue/jobs/{job_id}
        API->>API: Cancel queued/started job<br/>or delete completed/failed
        API-->>WebUI: Action result
    end
```
## Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Reasoning: This PR exhibits substantial heterogeneity across multiple domains. Key contributors to complexity:
- Large file count (~60 files), mitigated by repetitive patterns in import updates and logging level changes, which can be reviewed systematically.
## Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
✅ Actions performed: Review triggered.
Actionable comments posted: 13
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (8)
backends/advanced/src/advanced_omi_backend/controllers/websocket_controller.py (1)
101-106: Motor DB selection bug (will raise at runtime).
`get_default_database()` takes no args and requires a DB in the URI. Use `get_database("friend-lite")` or `client["friend-lite"]`.

```diff
-    MONGODB_URI = os.getenv("MONGODB_URI", "mongodb://mongo:27017")
-    mongo_client = AsyncIOMotorClient(MONGODB_URI)
-    db = mongo_client.get_default_database("friend-lite")
+    MONGODB_URI = os.getenv("MONGODB_URI", "mongodb://mongo:27017")
+    mongo_client = AsyncIOMotorClient(MONGODB_URI)
+    db = mongo_client.get_database("friend-lite")
```

backends/advanced/webui/src/services/api.ts (1)
208-216: Avoid sending “Bearer null” when no token.

Send Authorization only if a token exists.
```diff
-    return fetch(`${BACKEND_URL}/api/chat/send`, {
+    const authToken = localStorage.getItem('token')
+    return fetch(`${BACKEND_URL}/api/chat/send`, {
       method: 'POST',
       headers: {
         'Content-Type': 'application/json',
-        'Authorization': `Bearer ${localStorage.getItem('token')}`
+        ...(authToken ? { 'Authorization': `Bearer ${authToken}` } : {})
       },
       body: JSON.stringify(requestBody)
     })
```

backends/advanced/src/advanced_omi_backend/workers/memory_jobs.py (2)
190-221: enqueue_memory_processing still enqueues with the old signature (will crash).

process_memory_job now accepts only conversation_id (+ injected deps). Update the enqueue wrapper accordingly.
```diff
-def enqueue_memory_processing(
-    client_id: str,
-    user_id: str,
-    user_email: str,
-    conversation_id: str,
-    priority: JobPriority = JobPriority.NORMAL
-):
+def enqueue_memory_processing(
+    conversation_id: str,
+    priority: JobPriority = JobPriority.NORMAL,
+):
@@
-    job = memory_queue.enqueue(
-        process_memory_job,
-        client_id,
-        user_id,
-        user_email,
-        conversation_id,
-        job_timeout=timeout_mapping.get(priority, 1800),
-        result_ttl=JOB_RESULT_TTL,
-        job_id=f"memory_{conversation_id[:8]}",
-        description=f"Process memory for conversation {conversation_id[:8]}"
-    )
+    job = memory_queue.enqueue(
+        process_memory_job,
+        conversation_id,
+        job_timeout=timeout_mapping.get(priority, 1800),
+        result_ttl=JOB_RESULT_TTL,
+        job_id=f"memory_{conversation_id[:8]}",
+        description=f"Process memory for conversation {conversation_id[:8]}",
+    )
```
97-115: Guard against segments being None in speaker filter.

Iterating conversation_model.segments when it’s None raises TypeError.
```diff
-        if user and user.primary_speakers:
-            transcript_speakers = set()
-            for segment in conversation_model.segments:
+        if user and user.primary_speakers:
+            transcript_speakers: set[str] = set()
+            for segment in (conversation_model.segments or []):
```

backends/advanced/src/advanced_omi_backend/workers/audio_jobs.py (3)
153-166: Blocker: enqueue argument order/signatures don’t match updated job functions.
- transcribe_full_audio_job no longer takes user_id; user_id is currently passed as the trigger arg; "upload" lands in redis_client.
- recognise_speakers_job: user_id is being passed into transcript_text; [] lands in redis_client.
- process_cropping_job: extra user_id arg is passed into redis_client slot.
These will misroute parameters and likely crash at runtime.
Apply:
```diff
     transcript_job = transcription_queue.enqueue(
         transcribe_full_audio_job,
         conversation_id,
         result["audio_uuid"],
         result["file_path"],
         result["version_id"],
-        user_id,
-        "upload",
+        "upload",
         job_timeout=600,
         result_ttl=JOB_RESULT_TTL,
         job_id=f"upload_{conversation_id[:8]}",
         description=f"Transcribe audio for {conversation_id[:8]}",
         meta={'audio_uuid': result["audio_uuid"], 'conversation_id': conversation_id}
     )
@@
     speaker_job = transcription_queue.enqueue(
         recognise_speakers_job,
         conversation_id,
         result["version_id"],
         result["file_path"],
-        user_id,
-        "",  # transcript_text - will be read from DB
-        [],  # words - will be read from DB
+        "",  # transcript_text - will be read from DB
+        [],  # words - will be read from DB
         depends_on=transcript_job,
         job_timeout=600,
         result_ttl=JOB_RESULT_TTL,
         job_id=f"speaker_{conversation_id[:8]}",
         description=f"Recognize speakers for {conversation_id[:8]}",
         meta={'audio_uuid': result["audio_uuid"], 'conversation_id': conversation_id}
     )
@@
-    cropping_job = default_queue.enqueue(
+    cropping_job = default_queue.enqueue(
         process_cropping_job,
         conversation_id,
-        result["file_path"],
-        user_id,
+        result["file_path"],
         depends_on=speaker_job,
         job_timeout=300,
         result_ttl=JOB_RESULT_TTL,
         job_id=f"crop_{conversation_id[:8]}",
         description=f"Crop audio for {conversation_id[:8]}",
         meta={'audio_uuid': result["audio_uuid"], 'conversation_id': conversation_id}
     )
```

Also applies to: 170-184, 187-201
677-712: Align enqueue_cropping wrapper with job signature.

process_cropping_job takes (conversation_id, audio_path, redis_client=None). Drop user_id from the wrapper signature, docs, and enqueue call.
```diff
-def enqueue_cropping(
-    conversation_id: str,
-    audio_path: str,
-    user_id: str,
-    priority: JobPriority = JobPriority.NORMAL
-):
+def enqueue_cropping(
+    conversation_id: str,
+    audio_path: str,
+    priority: JobPriority = JobPriority.NORMAL
+):
@@
-    job = default_queue.enqueue(
+    job = default_queue.enqueue(
         process_cropping_job,
-        conversation_id,
-        audio_path,
-        user_id,
+        conversation_id,
+        audio_path,
         job_timeout=timeout_mapping.get(priority, 180),
         result_ttl=JOB_RESULT_TTL,
         job_id=f"crop_{conversation_id[:12]}",
         description=f"Crop audio for conversation {conversation_id[:12]}",
         meta={'conversation_id': conversation_id}
     )
```
145-166: Stale call sites found with mismatched arguments; multiple files require fixes before merge.

The repository contains several call sites with arguments that don't match the updated function signatures:
- audio_jobs.py lines 153, 170, 188, 203: Extra user_id/client_id and other arguments passed to transcribe_full_audio_job, recognise_speakers_job, process_cropping_job, and process_memory_job
- conversation_controller.py lines 656, 674: process_memory_job and recognise_speakers_job called with incorrect argument order
- queue_controller.py lines 449, 468: recognise_speakers_job and process_memory_job called with extra/misaligned arguments
These must be updated to match the correct signatures:
- `transcribe_full_audio_job(conversation_id, audio_uuid, audio_path, version_id, trigger="reprocess", redis_client=None)`
- `recognise_speakers_job(conversation_id, version_id, audio_path, transcript_text, words, redis_client=None)`
- `process_cropping_job(conversation_id, audio_path, redis_client=None)`
- `process_memory_job(conversation_id, redis_client=None)`

backends/advanced/src/advanced_omi_backend/workers/transcription_jobs.py (1)
217-241: Verification against the codebase and Python Enum semantics confirms:
- Calling an Enum with an invalid value raises ValueError
- A common pattern is to wrap enum instantiation in try/except to catch ValueError and provide fallback or error handling
- Enums support a `_missing_` classmethod that must return a valid enum member to provide custom handling

The review comment is valid: `provider_normalized = provider_name.lower()` will fail at runtime if the lowercased provider name doesn't match a valid `TranscriptProvider` enum value. The suggestion to add safe fallback handling is appropriate.

Add a safe fallback when mapping provider_name to the TranscriptProvider enum.

Instantiating an Enum with an invalid value raises ValueError. The code passes `provider_normalized` (derived from `provider_name.lower()`) directly to `Conversation.TranscriptProvider()` without validation. If `provider_name` doesn't normalize to a known enum value, this will crash at runtime.

Wrap the enum instantiation in try/except and use a safe fallback (e.g., `provider_normalized or "unknown"`) to prevent unhandled exceptions. Alternatively, add validation before the instantiation or implement `_missing_` on the enum to provide custom fallback behavior.
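A minimal sketch of the `_missing_` fallback approach (the enum members here are illustrative, not the actual `Conversation.TranscriptProvider` values):

```python
from enum import Enum

class TranscriptProvider(str, Enum):
    # Hypothetical members for illustration; the real enum lives on the Conversation model.
    DEEPGRAM = "deepgram"
    OPENAI = "openai"
    UNKNOWN = "unknown"

    @classmethod
    def _missing_(cls, value):
        # Called when TranscriptProvider(value) finds no matching member.
        # Returning a member here suppresses the ValueError.
        return cls.UNKNOWN

# Unknown provider names now degrade gracefully instead of raising.
print(TranscriptProvider("deepgram").name)   # → DEEPGRAM
print(TranscriptProvider("whisperx").name)   # → UNKNOWN
```

Because `_missing_` always returns a member, callers never need a try/except around the constructor.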
🧹 Nitpick comments (46)
extras/speaker-recognition/docker-compose.yml (1)
6-6: Consider impact of the `platform: linux/amd64` constraint.

The addition of `platform: linux/amd64` restricts deployment to x86_64 Linux only, which may block ARM64-based systems (e.g., M1/M2 Macs, ARM instances, Graviton). Verify this constraint is intentional and necessary (e.g., for NVIDIA GPU support).

If ARM64 support is desired, consider removing the constraint or making it conditional per profile.
Also applies to: 60-60
setup-requirements.txt (1)
3-3: Good fix! Consider pinning the version.

Replacing `dotenv` with `python-dotenv` is correct; they are different packages. However, consider adding a version constraint for reproducibility.

Based on learnings, python-dotenv 1.1.1 is the latest stable version. Apply this diff:
```diff
-python-dotenv
+python-dotenv>=1.0.0
```

extras/local-omi-bt/requirements.txt (1)
1-10: Consider pinning versions for critical packages.

Several packages lack version constraints, which can lead to compatibility issues and non-reproducible builds. Consider adding version constraints, especially for network and audio processing libraries.
Apply this diff:
```diff
 bleak==0.22.3
 numpy>=1.26.4
 scipy>=1.12.0
 opuslib>=3.0.1
 friend-lite-sdk
 easy_audio_interfaces
-python-dotenv
-asyncstdlib
-httpx
-websockets
+python-dotenv>=1.0.0
+asyncstdlib>=0.3.2
+httpx>=0.27.0,<0.29.0
+websockets>=14.0,<16.0
```

Based on learnings, these versions align with recent stable releases while allowing minor updates.
backends/advanced/src/advanced_omi_backend/utils/audio_utils.py (1)
276-309: Solid defensive validation logic.

The segment validation correctly handles edge cases (invalid ranges, negative timestamps) and provides clear logging. The approach of clamping negative start times to 0.0 while skipping segments with negative end times is reasonable for handling boundary conditions.
One optional consideration: if a segment's start is significantly negative (e.g., < -1.0), it might indicate a data quality issue worth skipping entirely rather than clamping. However, the current approach is acceptable for minor calculation errors.
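The clamp/skip policy described above can be sketched as a pure function (the helper name and the -1.0 cutoff are assumptions for illustration, not the backend's actual code):

```python
def clamp_segment(start: float, end: float, min_start: float = -1.0):
    """Return a (start, end) tuple with minor negative starts clamped to 0.0,
    or None when the segment should be skipped entirely."""
    if end <= 0 or end <= start:
        return None            # negative end or inverted range: skip
    if start < min_start:
        return None            # strongly negative start: likely bad data, skip
    return (max(start, 0.0), end)

print(clamp_segment(-0.05, 2.0))  # minor rounding error → (0.0, 2.0)
print(clamp_segment(-3.0, 2.0))   # strongly negative start → None (skipped)
print(clamp_segment(1.0, 0.5))    # inverted range → None (skipped)
```

Keeping the policy in one small pure function also makes the boundary behaviour trivially unit-testable.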
extras/speaker-recognition/init.py (5)
189-208: Reverse proxy config is still nginx; PR direction says Caddy.

This block generates nginx.conf, which conflicts with the PR’s “replace nginx with Caddyfile” goal. Switch to Caddyfile templating (e.g., Caddyfile.template → Caddyfile) or gate nginx generation behind a flag.
Apply this diff to pivot to Caddyfile templating in-place:
```diff
-        # Generate nginx.conf from template
-        self.console.print("[blue][INFO][/blue] Creating nginx configuration...")
-        nginx_template = Path("nginx.conf.template")
-        if nginx_template.exists():
-            try:
-                with open(nginx_template, 'r') as f:
-                    nginx_content = f.read()
-                # Replace TAILSCALE_IP with server_ip
-                nginx_content = nginx_content.replace('TAILSCALE_IP', server_ip)
-                with open('nginx.conf', 'w') as f:
-                    f.write(nginx_content)
-                self.console.print(f"[green][SUCCESS][/green] nginx.conf created for: {server_ip}")
-            except Exception as e:
-                self.console.print(f"[yellow][WARNING][/yellow] nginx.conf generation failed: {e}")
-        else:
-            self.console.print("[yellow][WARNING][/yellow] nginx.conf.template not found")
+        # Generate Caddyfile from template
+        self.console.print("[blue][INFO][/blue] Creating Caddyfile configuration...")
+        caddy_template = Path("Caddyfile.template")
+        if caddy_template.exists():
+            try:
+                content = caddy_template.read_text()
+                content = content.replace('TAILSCALE_IP', server_ip)
+                Path('Caddyfile').write_text(content)
+                self.console.print(f"[green][SUCCESS][/green] Caddyfile created for: {server_ip}")
+            except Exception as e:
+                self.console.print(f"[yellow][WARNING][/yellow] Caddyfile generation failed: {e}")
+        else:
+            self.console.print("[yellow][WARNING][/yellow] Caddyfile.template not found")
```
175-188: Validate server_ip before passing to subprocess and templating.

Limit input to safe hostname/IP characters to avoid bad values being used in file writes or external calls.
Add this helper (outside this block):
```python
import re

SAFE_HOST_RE = re.compile(r'^[A-Za-z0-9\.\-:]+$')

def _validate_server_name(name: str) -> str:
    if not SAFE_HOST_RE.match(name):
        raise ValueError(f"Invalid server name: {name!r}")
    return name
```

Then apply this diff:
```diff
-        server_ip = self.prompt_value("Server IP/Domain for SSL certificate", "localhost")
+        server_ip = _validate_server_name(
+            self.prompt_value("Server IP/Domain for SSL certificate", "localhost")
+        )
```
225-236: Duplicate env backup logic; call the helper once.

You defined backup_existing_env but also re-implemented backup here. Prefer a single path.
```diff
-        # Backup existing .env if it exists
-        if env_path.exists():
-            timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
-            backup_path = env_path.with_suffix(f'.backup.{timestamp}')
-            shutil.copy2(env_path, backup_path)
-            self.console.print(f"[blue][INFO][/blue] Backed up existing .env to {backup_path.name}")
+        # Backup existing .env if it exists
+        self.backup_existing_env()
```
17-23: Remove unused import.
`rprint` isn’t used.

```diff
-from rich import print as rprint
```
331-333: Over-broad exception; include traceback.

Keep the top-level guard but log details for debugging.
```diff
-        except Exception as e:
-            self.console.print(f"[red][ERROR][/red] Setup failed: {e}")
+        except Exception as e:
+            # Print message plus traceback to help diagnose failures
+            self.console.print(f"[red][ERROR][/red] Setup failed: {e}")
+            import traceback; self.console.print(traceback.format_exc())
             sys.exit(1)
```

CLAUDE.md (1)
168-172: Update reverse proxy reference to Caddy.

Docs still mention nginx; align with the PR’s Caddy move.
```diff
-  - Nginx Proxy: Load balancing and routing
+  - Caddy reverse proxy: TLS and routing
```

extras/local-omi-bt/send_to_adv.py (1)
65-73: Use logger.exception for caught exceptions.

Improves debuggability with tracebacks.
```diff
-    except httpx.TimeoutException:
-        logger.error("Authentication request timed out")
+    except httpx.TimeoutException:
+        logger.exception("Authentication request timed out")
         return None
-    except httpx.RequestError as e:
-        logger.error(f"Authentication request failed: {e}")
+    except httpx.RequestError:
+        logger.exception("Authentication request failed")
         return None
-    except Exception as e:
-        logger.error(f"Unexpected authentication error: {e}")
+    except Exception:
+        logger.exception("Unexpected authentication error")
         return None
```

backends/advanced/src/advanced_omi_backend/controllers/websocket_controller.py (3)
24-24: Unused import.
`process_audio_chunk` isn’t referenced; remove to avoid confusion.

```diff
-from advanced_omi_backend.utils.audio_utils import process_audio_chunk
```
728-731: Remove unused variable.
`version_id` is created but not used.

```diff
-        version_id = str(uuid.uuid4())
```
161-205: Redis SCAN over all sessions can be heavy.

Consider maintaining `sessions_by_client:{client_id}` sets or keying sessions as `audio:session:{client_id}:{session_id}` to avoid full-keyspace scans.

I can draft a small helper to track sessions per client and update cleanup accordingly.
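The per-client index suggested above can be sketched in plain Python (class and method names are hypothetical; the backend version would use Redis SETs via SADD/SMEMBERS/SREM rather than an in-process dict):

```python
class SessionTracker:
    """In-memory sketch of a per-client session index, so cleanup can
    look up a client's sessions directly instead of SCANning every key."""

    def __init__(self) -> None:
        self._by_client: dict[str, set[str]] = {}

    def add(self, client_id: str, session_id: str) -> None:
        self._by_client.setdefault(client_id, set()).add(session_id)

    def sessions_for(self, client_id: str) -> set[str]:
        # Direct lookup; no keyspace scan needed
        return set(self._by_client.get(client_id, set()))

    def clear_client(self, client_id: str) -> set[str]:
        # On disconnect: return the sessions to mark complete, drop the index
        return self._by_client.pop(client_id, set())

tracker = SessionTracker()
tracker.add("omi-dev-1", "sess-a")
tracker.add("omi-dev-1", "sess-b")
print(sorted(tracker.clear_client("omi-dev-1")))  # → ['sess-a', 'sess-b']
```

The same shape translates directly to Redis: one SET per client, removed atomically when the client disconnects.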
extras/local-omi-bt/connect-omi.py (4)
98-100: Remove unused variable binding.
`test_client` isn’t used.

```diff
-    async with BleakClient(OMI_MAC) as test_client:
+    async with BleakClient(OMI_MAC):
```
72-79: Narrow exception and improve logging.

The queue won’t overflow without maxsize; drop try/except, or catch specific exceptions and include a traceback when unexpected.
```diff
-            if decoded_pcm:
-                try:
-                    audio_queue.put_nowait(decoded_pcm)
-                except Exception as e:
-                    logger.error("Queue Error: %s", e)
+            if decoded_pcm:
+                audio_queue.put_nowait(decoded_pcm)
```
27-29: Avoid sys.path mutation for local imports.

Prefer a package or relative import, or run the script as a module (`python -m package.module`). Note that the hyphenated names `local-omi-bt` and `connect-omi` are not valid Python module names, so a module-style invocation would require renaming them with underscores.
133-146: Backend streaming wrapper shutdown.

Consider adding a timeout/heartbeat to avoid indefinite waits if the producer stalls; also handle task cancellation for graceful exit.
Would you like a small cancellation/timeout pattern added here (e.g., `asyncio.wait_for` with periodic sentinel checks)?

backends/advanced/setup-requirements.txt (1)
4-4: Consider pinning the python-dotenv version.

The addition of `python-dotenv` aligns with the new wizard.py-based configuration flow. However, consider pinning to a specific version for reproducible builds:

```diff
-python-dotenv
+python-dotenv>=1.0.0,<2.0.0
```

This ensures consistent behavior across environments while allowing patch updates.
wizard.py (3)
161-165: Ruff S105 false positive on hf_token.

Not a hardcoded secret; it reads from .env. To silence the noise, either rename to `hf_token_value` or add an inline `# noqa: S105`.

```diff
-    hf_token = read_env_value(speaker_env_path, 'HF_TOKEN')
+    hf_token_value = read_env_value(speaker_env_path, 'HF_TOKEN')  # noqa: S105
-    if hf_token and hf_token != 'your_huggingface_token_here':
-        cmd.extend(['--hf-token', hf_token])
+    if hf_token_value and hf_token_value != 'your_huggingface_token_here':
+        cmd.extend(['--hf-token', hf_token_value])
```
187-205: Surface richer failure context and a configurable timeout.

Use an env-driven timeout and include stderr on failures. At minimum, make the timeout configurable; optionally capture stderr to print on CalledProcessError.
```diff
-        result = subprocess.run(
-            cmd,
-            cwd=service['path'],
-            check=True,
-            timeout=300  # 5 minute timeout for service setup
-        )
+        import os
+        setup_timeout = int(os.getenv("FRIEND_SETUP_TIMEOUT", "600"))
+        result = subprocess.run(
+            cmd,
+            cwd=service['path'],
+            check=True,
+            timeout=setup_timeout
+        )
@@
-    except subprocess.CalledProcessError as e:
-        console.print(f"❌ {service_name} setup failed with exit code {e.returncode}")
+    except subprocess.CalledProcessError as e:
+        console.print(f"❌ {service_name} setup failed with exit code {e.returncode}. Check above for script output.")
```
332-332: Remove extraneous f-string.

No interpolation here.
```diff
-console.print(f"[dim]   cd extras/speaker-recognition && uv run --with-requirements setup-requirements.txt python init.py[/dim]")
+console.print("[dim]   cd extras/speaker-recognition && uv run --with-requirements setup-requirements.txt python init.py[/dim]")
```

backends/advanced/Caddyfile.template (1)
16-16: Avoid duplicate host when server_ip is localhost.

The template uses `localhost TAILSCALE_IP`. When server_ip is `localhost`, the generated site label becomes `localhost localhost`. It works but is messy. Prefer generating hosts in init.py based on server_ip. See the suggested change there.

backends/advanced/init.py (3)
340-359: Generate host label without duplicates for Caddyfile.

If server_ip is `localhost` or `127.0.0.1`, replace `localhost TAILSCALE_IP` with `localhost`; otherwise include both.

```diff
-        # Replace TAILSCALE_IP with server_ip
-        caddyfile_content = caddyfile_content.replace('TAILSCALE_IP', server_ip)
+        # Build host label without duplicates
+        hosts = 'localhost' if server_ip in ('localhost', '127.0.0.1') else f'localhost {server_ip}'
+        caddyfile_content = caddyfile_content.replace('localhost TAILSCALE_IP', hosts)
```
356-359: Narrow overly broad exception.

Catching bare Exception hides actionable errors. Limit to OSError for file IO, and surface the message.
```diff
-        except Exception as e:
-            self.console.print(f"[yellow][WARNING][/yellow] Caddyfile generation failed: {e}")
+        except OSError as e:
+            self.console.print(f"[yellow][WARNING][/yellow] Caddyfile generation failed: {e.strerror or e}")
```
203-205: Use UTC consistently for timestamps.

Elsewhere you default to utcnow; keep version timestamps UTC too.
```diff
-            created_at=datetime.now(),
+            created_at=datetime.utcnow(),
@@
-            created_at=datetime.now(),
+            created_at=datetime.utcnow(),
```

Also applies to: 234-237
backends/advanced/src/advanced_omi_backend/models/conversation.py (3)
33-38: Enum defined but no status field; add one or remove the enum.

If status is part of the new lifecycle, persist it; else drop the enum to avoid confusion.
```diff
 class Conversation(Document):
@@
-    class ConversationStatus(str, Enum):
+    class ConversationStatus(str, Enum):
         """Conversation processing status."""
         ACTIVE = "active"        # Has running jobs or open websocket
         COMPLETED = "completed"  # All jobs succeeded
         FAILED = "failed"        # One or more jobs failed
@@
-    # Core identifiers
+    # Core identifiers
     conversation_id: Indexed(str, unique=True) = Field(description="Unique conversation identifier")
     audio_uuid: Indexed(str) = Field(description="Link to audio_chunks collection")
     user_id: Indexed(str) = Field(description="User who owns this conversation")
     client_id: Indexed(str) = Field(description="Client device identifier")
+    status: "Conversation.ConversationStatus" = Field(
+        default=ConversationStatus.ACTIVE,
+        description="Conversation processing status"
+    )
```
76-79: Store only normalized, relative paths.

Since these are relative-to-CHUNK_DIR paths, ensure writers sanitize and normalize them (no leading “/”, no “..”). Add a brief validator or enforce at the controller layer.
203-205: UTC consistency for version timestamps.

Match the created_at default (utcnow). See the diff in the init.py comment; the same change applies here.
Also applies to: 234-237
backends/advanced/webui/src/pages/Conversations.tsx (1)
104-109: Timezone detection is brittle; normalize robustly.

The current check can misclassify ISO strings. Normalize by inserting ‘T’ and appending ‘Z’ only when no timezone is present.
```diff
-    if (typeof timestamp === 'string') {
-      // If the string doesn't include timezone info, append 'Z' to treat as UTC
-      const isoString = timestamp.endsWith('Z') || timestamp.includes('+') || timestamp.includes('T') && timestamp.split('T')[1].includes('-')
-        ? timestamp
-        : timestamp + 'Z'
-      return new Date(isoString).toLocaleString()
-    }
+    if (typeof timestamp === 'string') {
+      const s = timestamp.replace(' ', 'T');
+      const hasTz = /Z$|[+-]\d{2}:\d{2}$/.test(s);
+      const iso = hasTz ? s : `${s}Z`;
+      return new Date(iso).toLocaleString();
+    }
```

quickstart.md (2)
25-36: Use proper headings instead of bold text (fixes MD036).

Convert the “Option A/B” and “Optional Add-ons” labels to real headings for accessibility and markdownlint compliance.
```diff
-**Option A: Cloud Services (Easiest - Recommended for Beginners)**
+#### Option A: Cloud Services (Easiest — Recommended for Beginners)

-**Option B: Local Services (Free but More Complex)**
+#### Option B: Local Services (Free but More Complex)

-**Optional Add-ons (Both Paths)**
+#### Optional Add-ons (Both Paths)
```
76-103: Make provider sections real subheadings (also MD036).

The Deepgram/OpenAI/Hugging Face/Local “No API keys needed!” lines are styled as bold emphasis; promote them to headings.
```diff
-**Deepgram (Speech-to-Text)**
+#### Deepgram (Speech-to-Text)

-**OpenAI (AI Brain)**
+#### OpenAI (AI Brain)

-**Optional: Hugging Face (Speaker Recognition)**
+#### Optional: Hugging Face (Speaker Recognition)

-**No API keys needed!** Everything runs on your computer.
+#### No API keys needed!
+Everything runs on your computer.
```

backends/advanced/webui/src/services/api.ts (2)
201-204: Comment mismatch: not SSE.

sendMessage returns a fetch Promise, not an EventSource. Update the comment or switch to an EventSource/stream reader.
```diff
-  // Streaming chat (returns EventSource for Server-Sent Events)
+  // Chat send (HTTP POST). For streaming responses, use EventSource/ReadableStream if backend supports SSE.
```
16-18: Minor wording update.

The comment says “nginx proxy”; this repo moved to Caddy. Consider a generic “reverse proxy/Ingress” to avoid drift.
```diff
-    // We're being accessed through nginx proxy or Kubernetes Ingress, use same origin
+    // Accessed through a reverse proxy (e.g., Caddy/Nginx) or Kubernetes Ingress — use same origin
```

backends/advanced/src/advanced_omi_backend/workers/conversation_jobs.py (3)
23-24: Type: prefer explicit optional.

PEP 484/modern style: use `str | None` for optional.
```diff
-    speech_job_id: str = None,
+    speech_job_id: str | None = None,
```
84-86: Broadened except hides root cause.

Log the full stack for telemetry; optionally narrow exception types (e.g., NoSuchJobError).
```diff
-    except Exception as e:
-        logger.warning(f"⚠️ Failed to update speech job metadata: {e}")
+    except Exception:
+        logger.exception("⚠️ Failed to update speech job metadata")
```
258-263: Audio wait log likely never triggers (float modulo).

Using `elapsed % 5 == 0` with floats rarely equals zero. Track the last log time instead (initialize `last_log_time = 0.0` before the wait loop):

```diff
-            elapsed = time.time() - wait_start
-            if elapsed % 5 == 0:  # Log every 5 seconds
-                logger.info(f"⏳ Waiting for audio file (conversation {conversation_id[:12]})... ({elapsed:.0f}s elapsed)")
+            elapsed = time.time() - wait_start
+            if elapsed - last_log_time >= 5:  # Log roughly every 5 seconds
+                logger.info(f"⏳ Waiting for audio file (conversation {conversation_id[:12]})... ({elapsed:.0f}s elapsed)")
+                last_log_time = elapsed
```
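The same time-based throttle can be sketched as a small self-contained helper (names and the injected clock are illustrative, not the job's actual code):

```python
import time

def make_throttled_logger(interval: float = 5.0, clock=time.monotonic):
    """Return a predicate that is True at most once per `interval` seconds."""
    last = {"t": float("-inf")}

    def should_log() -> bool:
        now = clock()
        if now - last["t"] >= interval:
            last["t"] = now
            return True
        return False

    return should_log

# Simulated clock shows the behaviour without sleeping
t = {"now": 0.0}
should_log = make_throttled_logger(5.0, clock=lambda: t["now"])
fired = []
for t["now"] in [0.0, 1.0, 4.9, 5.0, 9.0, 10.1]:
    if should_log():
        fired.append(t["now"])
print(fired)  # → [0.0, 5.0, 10.1]
```

Injecting the clock keeps the throttle deterministic under test; production code would use the default `time.monotonic`.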
98-136: Improve error reporting in cancel_job.Use logger.exception and chain the HTTPException for better debuggability.
- except Exception as e: - logger.error(f"Failed to cancel/delete job {job_id}: {e}") - raise HTTPException(status_code=404, detail=f"Job not found or could not be cancelled: {str(e)}") + except Exception as e: + logger.exception("Failed to cancel/delete job %s", job_id) + raise HTTPException(status_code=404, detail="Job not found or could not be cancelled") from e
920-945: Timestamp source.

Prefer `time.time()` to avoid event loop coupling (with `import time` at module top, rather than an inline `__import__`):

```diff
-        "timestamp": asyncio.get_event_loop().time()
+        "timestamp": time.time()
```
223-235: Avoid queue name duplication.

Define the queues once (module-level constant) to keep endpoints in sync.
```diff
+QUEUES = ("default", "transcription", "memory", "audio")
 ...
-    queues = ["default", "transcription", "memory", "audio"]
+    queues = QUEUES
 ...
-    queues = ["default", "transcription", "memory", "audio"]
+    queues = QUEUES
 ...
-    queues = ["default", "transcription", "memory", "audio"]
+    queues = QUEUES
```

Also applies to: 539-546, 732-737, 820-821
backends/advanced/src/advanced_omi_backend/workers/memory_jobs.py (1)
63-71: Avoid duplicate user lookups and improve exception detail.

Reuse the fetched user and log the stack trace when the memory details fetch fails.
```diff
-        user = await get_user_by_id(user_id)
-        if user:
-            user_email = user.email
+        user = await get_user_by_id(user_id)
+        if user:
+            user_email = user.email
@@
-        user = await get_user_by_id(user_id)
-        if user and user.primary_speakers:
+        if user and user.primary_speakers:
```

```diff
-        except Exception as e:
-            logger.warning(f"Failed to fetch memory details for UI: {e}")
+        except Exception:
+            logger.exception("Failed to fetch memory details for UI")
```

Also applies to: 98-103
backends/advanced/src/advanced_omi_backend/workers/transcription_jobs.py (2)
407-411: Remove f-strings without interpolation (Ruff F541).

Several logger calls are f-strings with no placeholders. Drop the `f` prefix to satisfy linters and avoid minor overhead.
```diff
-    logger.info(f"🎤 Calling speaker recognition service...")
+    logger.info("🎤 Calling speaker recognition service...")
-    logger.info(f"🛑 Session ended, exiting")
+    logger.info("🛑 Session ended, exiting")
-    logger.warning(f"⏱️ Max runtime reached, exiting")
+    logger.warning("⏱️ Max runtime reached, exiting")
-    logger.info(f"💬 Meaningful speech detected!")
+    logger.info("💬 Meaningful speech detected!")
-    logger.info(f"🎤 Checking for enrolled speakers...")
+    logger.info("🎤 Checking for enrolled speakers...")
-    logger.info(f"⏭️ No enrolled speakers, continuing to listen...")
+    logger.info("⏭️ No enrolled speakers, continuing to listen...")
-    logger.info(f"✅ Session ended without speech")
+    logger.info("✅ Session ended without speech")
```

Also applies to: 579-579, 583-583, 606-606, 610-610, 623-623, 692-692
182-187: Optional: sample-rate correctness.

You hardcode sample_rate=16000 while reading arbitrary WAVs. If upstream files aren’t guaranteed 16 kHz, read the actual rate from the header or resample before sending to the provider to avoid ASR quality regressions.
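Reading the declared rate from the WAV header is cheap with the standard library; a minimal sketch (the helper name and the in-memory demo file are illustrative):

```python
import io
import struct
import wave

def wav_sample_rate(path_or_file) -> int:
    """Return the sample rate declared in a WAV header instead of assuming 16 kHz."""
    with wave.open(path_or_file, "rb") as wf:
        return wf.getframerate()

# Build a tiny in-memory 8 kHz mono WAV to demonstrate
buf = io.BytesIO()
with wave.open(buf, "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)       # 16-bit PCM
    wf.setframerate(8000)
    wf.writeframes(struct.pack("<4h", 0, 100, -100, 0))
buf.seek(0)
print(wav_sample_rate(buf))  # → 8000
```

`wave.open` accepts both paths and file-like objects, so the same helper works for uploaded buffers and files on disk.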
backends/advanced/src/advanced_omi_backend/workers/audio_jobs.py (3)
342-344: Prefer `logging.exception` for failure paths.

It logs the stack trace automatically and matches Ruff TRY400 guidance.
```diff
-    except Exception as e:
-        logger.error(f"❌ RQ: Audio cropping failed for conversation {conversation_id}: {e}")
+    except Exception:
+        logger.exception(f"❌ RQ: Audio cropping failed for conversation {conversation_id}")
         raise
```
547-549: Rename unused loop variable (Ruff B007).

`stream_name` isn't used; rename it with a leading underscore.
```diff
-    for stream_name, msgs in audio_messages:
+    for _stream_name, msgs in audio_messages:
```
571-576: Minor: throttle log spam safely.

The `total_chunk_count` modulo guard is fine. Optionally add `session_id` and `conversation_id` as fields to structured logs for easier filtering.
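A sketch combining both ideas, modulo throttling plus contextual fields passed via `extra` (the function and field names are illustrative):

```python
import logging

logger = logging.getLogger("audio_stream_demo")

def log_chunk_progress(total_chunk_count: int, session_id: str,
                       conversation_id: str, every: int = 100) -> bool:
    """Log every Nth chunk, attaching IDs as structured fields for filtering.

    Returns True when a log line was emitted, False when throttled.
    """
    if total_chunk_count % every != 0:
        return False
    logger.info(
        "processed %d chunks",
        total_chunk_count,
        extra={"session_id": session_id, "conversation_id": conversation_id},
    )
    return True
```

With a structured formatter (e.g. JSON), the `extra` keys become top-level record attributes that log aggregators can filter on.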
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
- `backends/advanced/uv.lock` is excluded by `!**/*.lock`
- `extras/speaker-recognition/uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (43)
- `.gitignore` (1 hunks)
- `CLAUDE.md` (8 hunks)
- `Docs/getting-started.md` (1 hunks)
- `Docs/init-system.md` (3 hunks)
- `backends/advanced/Caddyfile.template` (1 hunks)
- `backends/advanced/init.py` (2 hunks)
- `backends/advanced/nginx.conf` (0 hunks)
- `backends/advanced/setup-requirements.txt` (1 hunks)
- `backends/advanced/src/advanced_omi_backend/client.py` (1 hunks)
- `backends/advanced/src/advanced_omi_backend/controllers/audio_controller.py` (1 hunks)
- `backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py` (1 hunks)
- `backends/advanced/src/advanced_omi_backend/controllers/websocket_controller.py` (4 hunks)
- `backends/advanced/src/advanced_omi_backend/conversation_manager.py` (0 hunks)
- `backends/advanced/src/advanced_omi_backend/models/conversation.py` (2 hunks)
- `backends/advanced/src/advanced_omi_backend/routers/modules/queue_routes.py` (8 hunks)
- `backends/advanced/src/advanced_omi_backend/services/audio_stream/aggregator.py` (2 hunks)
- `backends/advanced/src/advanced_omi_backend/services/audio_stream/consumer.py` (1 hunks)
- `backends/advanced/src/advanced_omi_backend/services/audio_stream/producer.py` (1 hunks)
- `backends/advanced/src/advanced_omi_backend/services/transcription/deepgram.py` (3 hunks)
- `backends/advanced/src/advanced_omi_backend/speaker_recognition_client.py` (1 hunks)
- `backends/advanced/src/advanced_omi_backend/utils/audio_utils.py` (1 hunks)
- `backends/advanced/src/advanced_omi_backend/utils/conversation_utils.py` (1 hunks)
- `backends/advanced/src/advanced_omi_backend/workers/audio_jobs.py` (9 hunks)
- `backends/advanced/src/advanced_omi_backend/workers/conversation_jobs.py` (8 hunks)
- `backends/advanced/src/advanced_omi_backend/workers/memory_jobs.py` (4 hunks)
- `backends/advanced/src/advanced_omi_backend/workers/rq_worker_entry.py` (1 hunks)
- `backends/advanced/src/advanced_omi_backend/workers/transcription_jobs.py` (6 hunks)
- `backends/advanced/webui/src/pages/Conversations.tsx` (6 hunks)
- `backends/advanced/webui/src/services/api.ts` (1 hunks)
- `extras/local-omi-bt/README.md` (1 hunks)
- `extras/local-omi-bt/connect-omi.py` (1 hunks)
- `extras/local-omi-bt/requirements.txt` (1 hunks)
- `extras/local-omi-bt/send_to_adv.py` (1 hunks)
- `extras/local-omi-bt/start.sh` (1 hunks)
- `extras/speaker-recognition/Dockerfile` (2 hunks)
- `extras/speaker-recognition/docker-compose.yml` (2 hunks)
- `extras/speaker-recognition/init.py` (1 hunks)
- `extras/speaker-recognition/pyproject.toml` (3 hunks)
- `extras/speaker-recognition/setup-requirements.txt` (1 hunks)
- `extras/speaker-recognition/webui/Dockerfile` (1 hunks)
- `quickstart.md` (5 hunks)
- `setup-requirements.txt` (1 hunks)
- `wizard.py` (6 hunks)
💤 Files with no reviewable changes (2)
- backends/advanced/nginx.conf
- backends/advanced/src/advanced_omi_backend/conversation_manager.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-18T01:42:08.562Z
Learnt from: CR
PR: AnkushMalaker/friend-lite#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-18T01:42:08.562Z
Learning: Applies to backends/advanced/src/**/*.py : Use conversation_id (not audio_uuid) for all reprocessing operations and version management
Applied to files:
- `backends/advanced/webui/src/pages/Conversations.tsx`
- `backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py`
- `backends/advanced/src/advanced_omi_backend/workers/conversation_jobs.py`
- `backends/advanced/src/advanced_omi_backend/workers/memory_jobs.py`
- `backends/advanced/src/advanced_omi_backend/models/conversation.py`
- `backends/advanced/src/advanced_omi_backend/workers/transcription_jobs.py`
- `backends/advanced/src/advanced_omi_backend/controllers/websocket_controller.py`
🪛 Gitleaks (8.28.0)
Docs/getting-started.md
[high] 362-368: Discovered a potential authorization token provided in a curl command header, which could compromise the curl accessed resource.
(curl-auth-header)
[high] 226-226: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
CLAUDE.md
[high] 320-326: Discovered a potential authorization token provided in a curl command header, which could compromise the curl accessed resource.
(curl-auth-header)
🪛 LanguageTool
Docs/getting-started.md
[style] ~9-~9: For conciseness, consider replacing this expression with an adverb.
Context: ...ort "AI wearable" agents/functionality. At the moment, the basic functionalities are: - Audio...
(AT_THE_MOMENT)
[grammar] ~317-~317: Use a hyphen to join words.
Context: ...processing - Collection timeout: 1.5 minute collection for optimal online pro...
(QB_NEW_EN_HYPHEN)
[style] ~476-~476: The double modal “needed Encrypted” is nonstandard (only accepted in certain dialects). Consider “to be Encrypted”.
Context: ...x.x.x) - no CORS configuration needed - Encrypted inter-service communication **Service ...
(NEEDS_FIXED)
CLAUDE.md
[grammar] ~7-~7: Use a hyphen to join words.
Context: ...at the very least audio capture, speaker specific transcription, memory extractio...
(QB_NEW_EN_HYPHEN)
🪛 markdownlint-cli2 (0.18.1)
Docs/getting-started.md
46-46: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
530-530: Bare URL used
(MD034, no-bare-urls)
quickstart.md
25-25: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
30-30: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
35-35: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
76-76: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
82-82: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
88-88: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
102-102: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
CLAUDE.md
99-99: Bare URL used
(MD034, no-bare-urls)
🪛 Ruff (0.14.1)
backends/advanced/init.py
356-356: Do not catch blind exception: Exception
(BLE001)
extras/local-omi-bt/send_to_adv.py
60-60: Do not use bare except
(E722)
60-61: try-except-pass detected, consider logging the exception
(S110)
66-66: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
69-69: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
71-71: Do not catch blind exception: Exception
(BLE001)
72-72: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
backends/advanced/src/advanced_omi_backend/workers/conversation_jobs.py
23-23: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
84-84: Do not catch blind exception: Exception
(BLE001)
266-266: f-string without any placeholders
Remove extraneous f prefix
(F541)
backends/advanced/src/advanced_omi_backend/workers/rq_worker_entry.py
1-1: Shebang is present but file is not executable
(EXE001)
backends/advanced/src/advanced_omi_backend/workers/memory_jobs.py
162-162: Do not catch blind exception: Exception
(BLE001)
extras/local-omi-bt/connect-omi.py
72-72: Unused function argument: sender
(ARG001)
77-77: Do not catch blind exception: Exception
(BLE001)
78-78: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
98-98: Local variable test_client is assigned to but never used
Remove assignment to unused variable test_client
(F841)
100-100: Do not catch blind exception: Exception
(BLE001)
101-101: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
102-102: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
backends/advanced/src/advanced_omi_backend/routers/modules/queue_routes.py
101-101: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
111-111: Abstract raise to an inner function
(TRY301)
133-133: Do not catch blind exception: Exception
(BLE001)
134-134: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
135-135: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
135-135: Use explicit conversion flag
Replace with conversion flag
(RUF010)
562-562: Do not catch blind exception: Exception
(BLE001)
568-568: Do not catch blind exception: Exception
(BLE001)
574-574: Do not catch blind exception: Exception
(BLE001)
575-575: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
701-701: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
701-701: Use explicit conversion flag
Replace with conversion flag
(RUF010)
708-708: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
782-782: Do not catch blind exception: Exception
(BLE001)
786-786: Consider moving this statement to an else block
(TRY300)
787-787: Do not catch blind exception: Exception
(BLE001)
788-788: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
795-795: Do not catch blind exception: Exception
(BLE001)
796-796: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
804-804: Do not catch blind exception: Exception
(BLE001)
805-805: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
816-816: Local variable conversation_ids is assigned to but never used
Remove assignment to unused variable conversation_ids
(F841)
853-853: Loop control variable status_name not used within loop body
Rename unused status_name to _status_name
(B007)
891-891: Do not catch blind exception: Exception
(BLE001)
895-895: Consider moving this statement to an else block
(TRY300)
896-896: Do not catch blind exception: Exception
(BLE001)
897-897: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
949-949: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
949-949: Use explicit conversion flag
Replace with conversion flag
(RUF010)
backends/advanced/src/advanced_omi_backend/workers/transcription_jobs.py
159-159: Avoid specifying long messages outside the exception class
(TRY003)
176-176: Avoid specifying long messages outside the exception class
(TRY003)
578-578: f-string without any placeholders
Remove extraneous f prefix
(F541)
582-582: f-string without any placeholders
Remove extraneous f prefix
(F541)
605-605: f-string without any placeholders
Remove extraneous f prefix
(F541)
610-610: f-string without any placeholders
Remove extraneous f prefix
(F541)
623-623: f-string without any placeholders
Remove extraneous f prefix
(F541)
692-692: f-string without any placeholders
Remove extraneous f prefix
(F541)
backends/advanced/src/advanced_omi_backend/controllers/websocket_controller.py
150-150: Do not catch blind exception: Exception
(BLE001)
158-158: Do not catch blind exception: Exception
(BLE001)
168-168: f-string without any placeholders
Remove extraneous f prefix
(F541)
206-206: Do not catch blind exception: Exception
(BLE001)
extras/speaker-recognition/init.py
180-180: subprocess call: check for execution of untrusted input
(S603)
204-204: Do not catch blind exception: Exception
(BLE001)
211-211: f-string without any placeholders
Remove extraneous f prefix
(F541)
212-212: f-string without any placeholders
Remove extraneous f prefix
(F541)
213-213: f-string without any placeholders
Remove extraneous f prefix
(F541)
214-214: f-string without any placeholders
Remove extraneous f prefix
(F541)
265-265: f-string without any placeholders
Remove extraneous f prefix
(F541)
331-331: Do not catch blind exception: Exception
(BLE001)
wizard.py
163-163: Possible hardcoded password assigned to: "hf_token"
(S105)
332-332: f-string without any placeholders
Remove extraneous f prefix
(F541)
backends/advanced/src/advanced_omi_backend/workers/audio_jobs.py
237-237: Unused function argument: redis_client
(ARG001)
268-268: Abstract raise to an inner function
(TRY301)
268-268: Avoid specifying long messages outside the exception class
(TRY003)
343-343: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
350-350: Unused function argument: user_id
(ARG001)
434-434: f-string without any placeholders
Remove extraneous f prefix
(F541)
448-448: Loop control variable stream_name not used within loop body
Rename unused stream_name to _stream_name
(B007)
465-465: Do not catch blind exception: Exception
(BLE001)
🪛 Shellcheck (0.11.0)
extras/local-omi-bt/start.sh
[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.
(SC2148)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: integration-tests
🔇 Additional comments (34)
extras/speaker-recognition/pyproject.toml (3)
41-44: Well-structured CPU extra declaration.

The new `cpu` optional-dependency group properly mirrors the existing CUDA variants (cu121, cu126, cu128) with consistent version constraints. This enables users to opt into CPU-only PyTorch wheels when needed.
64-64: Correct conflict and source routing configuration.

The `cpu` extra is properly added to the `uv.conflicts` array, ensuring mutual exclusivity with the CUDA variants. Torch and torchaudio sources correctly route to the `pytorch-cpu` index when the `cpu` extra is selected, maintaining consistency with the CUDA routing pattern.

Also applies to: 73-73, 79-79
85-88: Valid and secure PyTorch CPU index configuration.

The `pytorch-cpu` index URL (https://download.pytorch.org/whl/cpu) correctly points to the official PyTorch CPU wheels repository. The `explicit = true` flag prevents accidental wheel resolution from other package indices, which is a best practice for dependency integrity.

backends/advanced/src/advanced_omi_backend/services/audio_stream/consumer.py (1)
580-583: LGTM! Appropriate log level adjustment.

Changing this routine operation from info to debug reduces log noise while preserving detailed tracing for debugging. This is consistent with logging best practices, where info level is reserved for significant events.
backends/advanced/src/advanced_omi_backend/services/audio_stream/aggregator.py (2)
71-74: LGTM! Appropriate log level for detailed operational data.

This detailed result information (chunk IDs, text lengths) is useful for debugging but too verbose for info level. Debug is the right level for this granular operational tracing.
146-149: LGTM! Consistent with logging best practices.

Moving this aggregation detail to debug level appropriately reserves info logs for more significant events while maintaining detailed tracing capability.
backends/advanced/src/advanced_omi_backend/services/audio_stream/producer.py (1)
251-255: LGTM! Improved log hygiene for high-frequency operations.

Even with the existing throttling (every 10th chunk), this still generates frequent logs during active streaming. Debug level is more appropriate for this detailed buffer-state information.
extras/speaker-recognition/webui/Dockerfile (1)
19-23: Verify port configuration is passed to the Vite dev server.

The removal of EXPOSE and reliance on the `REACT_UI_PORT` env var is documented, but the CMD does not explicitly pass `--port` to the dev server. Vite requires an explicit `--port` flag, env-var reading in `vite.config.ts`, or npm-script handling to bind to a non-default port.

Please confirm that one of the following is true:

- `vite.config.ts` reads and applies the `REACT_UI_PORT` env var to the dev server configuration, or
- The npm `dev` script in `package.json` extracts and passes `--port ${REACT_UI_PORT}` to Vite, or
- Vite is configured to accept the port via another mechanism.

If none of these are in place, the container will always bind to the default port (5173) regardless of `REACT_UI_PORT`, causing port mapping failures.

extras/speaker-recognition/docker-compose.yml (2)
10-11: CUDA_VERSION build args are well-structured.

The introduction of `CUDA_VERSION` build args with sensible defaults ("cpu" for the CPU profile, "cu121" for the GPU profile) and env-var override support is clean. This enables flexible build-time selection of CUDA dependencies.

Also applies to: 48-49
extras/speaker-recognition/Dockerfile (2)
5-6: ENV capture of CUDA_VERSION is correctly implemented.

Setting `ENV CUDA_VERSION=${CUDA_VERSION}` from the build arg persists the value to runtime, enabling the CMD to reference `${CUDA_VERSION}` for dynamic extra selection. This is the correct pattern.
44-45: Shell-form CMD is appropriately used for runtime variable expansion.

Converting to shell form enables `${CUDA_VERSION}` expansion at runtime, which is necessary given the dynamic extra selection based on the build arg. The comment clearly explains the rationale. The variable is constrained to safe values (cpu, cu121) with no user input, so there is no command-injection risk.

backends/advanced/src/advanced_omi_backend/services/transcription/deepgram.py (2)
56-56: Logging-level change is reasonable for this low-level detail.

Moving this to debug level is appropriate since it logs a technical detail (byte count). Higher-level operation logs (timeout configuration, results) remain at info level, maintaining adequate production observability.
267-267: Appropriate debug-level logging for connection detail.

This change is well-reasoned: the high-level operation "Starting Deepgram streaming" remains at info level (line 233), while the technical connection detail moves to debug. This maintains adequate production visibility for the streaming lifecycle.
.gitignore (1)
76-77: LGTM!

Adding generated configuration files to `.gitignore` is appropriate for the nginx→Caddyfile migration.

extras/speaker-recognition/setup-requirements.txt (1)
1-3: LGTM! Dependency declarations are appropriate, with reasonable version constraints.
extras/local-omi-bt/README.md (1)
1-3: LGTM! Usage instructions are clear and align with the startup script.
backends/advanced/src/advanced_omi_backend/utils/audio_utils.py (2)
270-273: LGTM! The updated docstring clearly explains that segments are already in relative format; good clarification.
311-316: Correctly preserves original timestamps for the database.

Good design: using `validated_segments` for ffmpeg processing while passing the original `speech_segments` to the database preserves the complete audit trail.

backends/advanced/src/advanced_omi_backend/controllers/websocket_controller.py (1)
743-751: Confirm the job chain uses conversation_id as the primary key.

Per prior learning, reprocessing/version operations should key off `conversation_id`. Ensure downstream jobs don't rely on `audio_uuid` as the primary identifier.
Based on learnings
backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py (1)
13-15: LGTM! Import path updated consistently.

The import path change from `advanced_omi_backend.audio_utils` to `advanced_omi_backend.utils.audio_utils` aligns with the broader module restructuring in this PR, where audio utilities are being consolidated under the `utils` namespace.

backends/advanced/src/advanced_omi_backend/speaker_recognition_client.py (1)
478-478: LGTM! Consistent import path update.

The import path change for `write_pcm_to_wav` matches the module restructuring applied throughout this PR.

backends/advanced/src/advanced_omi_backend/controllers/audio_controller.py (1)
15-15: LGTM! Import refactor completed consistently.

The updated import path for audio utilities is consistent with the module reorganization across the codebase.
Docs/init-system.md (1)
15-15: LGTM! Documentation updated to reflect the wizard.py entry point.

The documentation correctly references the new `wizard.py` orchestrator across all sections, maintaining consistency with the architectural changes in this PR.

Also applies to: 25-25, 42-42
backends/advanced/src/advanced_omi_backend/client.py (1)
135-147: LGTM! Architectural simplification aligns with the V2 design.

The simplified `close_current_conversation` method correctly removes the ConversationManager orchestration (which was deleted in this PR) and maintains minimal state cleanup for backward compatibility. The comments clearly document the V1/V2 architectural transition and explain that conversation closure is now handled by websocket controllers using RQ jobs.

Docs/getting-started.md (1)
1-730: Excellent comprehensive documentation!

This new getting-started guide provides thorough coverage of prerequisites, setup workflows, configuration options, and troubleshooting. The structure is clear, with a logical progression from quick start to advanced topics.
backends/advanced/src/advanced_omi_backend/utils/conversation_utils.py (1)
16-46: LGTM! Clean abstraction for speech detection.

The new `is_meaningful_speech` function provides a convenient wrapper around the existing `analyze_speech` logic, making it easier to check for meaningful speech without handling the full analysis result dict. The implementation correctly handles edge cases (empty text) and maintains good separation of concerns by delegating to the existing analysis function.

backends/advanced/webui/src/pages/Conversations.tsx (2)
214-250: Switch to conversation_id for transcript expansion – LGTM.

State keys, fetch-by-id, and UI toggles are consistent and scoped by `conversation_id`.

If the list API can omit segments, ensure `getById` returns segments/transcript for the same id variant.
Also applies to: 565-565, 574-579, 583-584
187-212: No changes needed; the delete operation correctly uses audio_uuid throughout the stack.

The frontend correctly passes `audioUuid` to the delete endpoint, the backend route accepts it as `audio_uuid`, and the controller uses it to fetch and delete the audio chunk. No mismatch exists.

wizard.py (1)
152-170: Update the suggested fix to include target-script modifications; clarify that env vars alone won't work without corresponding changes in init.py and setup.sh.

The security concern is valid: secrets passed via CLI args are visible in process lists (e.g., `ps aux`). However, the suggested diff is incomplete. The target scripts currently parse these as CLI arguments:

- `extras/speaker-recognition/init.py` accepts `--hf-token`, `--compute-mode`, and `--deepgram-api-key` via argparse
- `extras/openmemory-mcp/setup.sh` accepts `--openai-api-key` and writes it to .env

Simply passing secrets via the subprocess `env` dict (without modifying the target scripts) won't work: the receiving scripts must be updated to read from environment variables instead of (or in addition to) CLI args. Additionally, the conditional `env=env if service_name == 'speaker-recognition' else None` in the suggested diff is inconsistent and would break openmemory-mcp.

To proceed:

- Modify `init.py` and `setup.sh` to read secrets from environment variables first, falling back to CLI args for backward compatibility
- Update wizard.py to pass secrets via the subprocess `env` dict
- Consider whether COMPUTE_MODE (non-secret configuration) should remain as a CLI arg
Would you like to coordinate these target script changes, or should we accept the current approach pending security review?
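A sketch of the env-based handoff under those assumptions (the `HF_TOKEN` variable name and the child script's behavior are assumptions for illustration, not wizard.py's current contract):

```python
import os
import subprocess
import sys

def run_service_setup(script_path: str, hf_token: str) -> subprocess.CompletedProcess:
    """Launch a setup script with the secret in its environment, not its argv."""
    env = dict(os.environ)
    env["HF_TOKEN"] = hf_token  # invisible to `ps aux`, unlike a --hf-token argument
    return subprocess.run(
        [sys.executable, script_path],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
```

The child script would then read `os.environ["HF_TOKEN"]` first and fall back to its existing CLI flag for backward compatibility.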
backends/advanced/webui/src/services/api.ts (1)
145-159: No changes required; cleanup endpoints are correctly implemented.

The backend routes `/api/streaming/cleanup` and `/api/streaming/cleanup-sessions` exist at lines 139 and 145 respectively in system_routes.py. The frontend calls are properly aligned with the server-side endpoints. The suggested refactoring is unnecessary.

quickstart.md (5)
116-117: No issues found; the flag syntax is correct.

The `--with-requirements` flag is valid for `uv run`, and the usage in quickstart.md matches the documented syntax. The flag is supported to install packages from a requirements.txt file for that invocation.

backends/advanced/src/advanced_omi_backend/workers/transcription_jobs.py (3)
156-164: Good: explicit conversation lookup and path validation.

Fetching by `conversation_id` and validating the audio file path is solid. Minor: the FileNotFoundError message is fine; no change needed.
Also applies to: 174-177
304-319: Nice: update RQ job meta for the UI.

The meta payload is useful for UI correlation and debugging.
591-604: No import errors detected; analyze_speech is correctly imported and exists.

The verification confirms that `analyze_speech` is defined in `backends/advanced/src/advanced_omi_backend/utils/conversation_utils.py` (line 49) and is correctly imported at line 550 of `transcription_jobs.py`. The function is used as expected in the code snippet. Both `is_meaningful_speech` and `analyze_speech` exist as separate utilities, indicating intentional design rather than naming drift.
```python
conversations = await Conversation.find(Conversation.audio_uuid == session_id).to_list()
conversation_ids = {conv.conversation_id for conv in conversations}

all_jobs = []
processed_job_ids = set()
queues = ["default", "transcription", "memory", "audio"]
```
Unused DB query in dashboard session resolver.

`conversation_ids` is fetched but not used; remove it to save I/O.

```diff
-    # Get conversation IDs for this session
-    conversations = await Conversation.find(Conversation.audio_uuid == session_id).to_list()
-    conversation_ids = {conv.conversation_id for conv in conversations}
+    # Conversation lookup not required for current matching logic
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
all_jobs = []
processed_job_ids = set()
queues = ["default", "transcription", "memory", "audio"]
```
🧰 Tools
🪛 Ruff (0.14.1)
816-816: Local variable conversation_ids is assigned to but never used
Remove assignment to unused variable conversation_ids
(F841)
🤖 Prompt for AI Agents
In backends/advanced/src/advanced_omi_backend/routers/modules/queue_routes.py
around lines 815 to 821, the Conversation.find(...) call and the derived
conversation_ids set are unused and cause unnecessary DB I/O; remove the two
lines that execute the query and create conversation_ids (or replace them with a
no-op/comment if future use is intended), and ensure no subsequent code depends
on conversation_ids—if it does, either use the fetched values where needed or
lazily fetch them only when required.
```python
logger.debug(
    f"Deepgram basic transcription successful: {len(transcript)} characters"
)
```
Inconsistent logging level with the diarized transcription path.

Lines 97-99 log diarized transcription success at info level, but this logs basic transcription success at debug level. Both represent the same logical outcome (a successful transcription) and should use the same log level for consistency.

Consider aligning both to the same level:

```diff
-        logger.debug(
+        logger.info(
             f"Deepgram basic transcription successful: {len(transcript)} characters"
         )
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
logger.info(
    f"Deepgram basic transcription successful: {len(transcript)} characters"
)
```
🤖 Prompt for AI Agents
In backends/advanced/src/advanced_omi_backend/services/transcription/deepgram.py
around lines 131-133, the basic transcription success is logged at debug while
the diarized path logs success at info; change the basic transcription logger
from logger.debug to logger.info so both successful-transcription outcomes use
the same log level, and keep the message text intact (or make it match the
diarized message format) to maintain consistency.
```diff
@@ -0,0 +1,54 @@
+#!/usr/bin/env python3
```
File has a shebang but is not executable.

The file contains a shebang but lacks executable permissions. Either add execute permissions (`chmod +x`) or remove the shebang if the file is not meant to be directly executable.
🧰 Tools
🪛 Ruff (0.14.1)
1-1: Shebang is present but file is not executable
(EXE001)
🤖 Prompt for AI Agents
In backends/advanced/src/advanced_omi_backend/workers/rq_worker_entry.py around
line 1, the file contains a shebang (#!/usr/bin/env python3) but the file is not
executable; either remove the shebang if the module is only imported, or make
the file executable and keep the shebang by running chmod +x
backends/advanced/src/advanced_omi_backend/workers/rq_worker_entry.py and
committing that permission change.
```python
def main():
    """Start RQ worker with proper logging configuration."""
    from rq import Worker
    from redis import Redis

    # Get Redis URL from environment
    redis_url = os.getenv('REDIS_URL', 'redis://localhost:6379/0')

    # Get queue names from command line arguments
    queue_names = sys.argv[1:] if len(sys.argv) > 1 else ['transcription', 'memory', 'default']

    logger.info(f"🚀 Starting RQ worker for queues: {', '.join(queue_names)}")
    logger.info(f"📡 Redis URL: {redis_url}")

    # Create Redis connection
    redis_conn = Redis.from_url(redis_url)

    # Create and start worker
    worker = Worker(
        queue_names,
        connection=redis_conn,
        log_job_description=True
    )

    logger.info("✅ RQ worker ready")

    # This blocks until worker is stopped
    worker.work(logging_level='INFO')
```
Add error handling and graceful shutdown.
The worker lacks error handling for Redis connection failures and graceful shutdown handling for SIGTERM/SIGINT signals, which are essential for production reliability.
Apply this diff to add resilience:
```diff
+import signal
+import sys
+
 def main():
     """Start RQ worker with proper logging configuration."""
     from rq import Worker
     from redis import Redis

     # Get Redis URL from environment
     redis_url = os.getenv('REDIS_URL', 'redis://localhost:6379/0')

     # Get queue names from command line arguments
     queue_names = sys.argv[1:] if len(sys.argv) > 1 else ['transcription', 'memory', 'default']

     logger.info(f"🚀 Starting RQ worker for queues: {', '.join(queue_names)}")
     logger.info(f"📡 Redis URL: {redis_url}")

-    # Create Redis connection
-    redis_conn = Redis.from_url(redis_url)
+    # Create Redis connection with error handling
+    try:
+        redis_conn = Redis.from_url(redis_url)
+        redis_conn.ping()  # Verify connection
+    except Exception as e:
+        logger.error(f"❌ Failed to connect to Redis at {redis_url}: {e}")
+        sys.exit(1)

     # Create and start worker
     worker = Worker(
         queue_names,
         connection=redis_conn,
         log_job_description=True
     )
+
+    # Setup graceful shutdown
+    def signal_handler(signum, frame):
+        logger.info(f"🛑 Received signal {signum}, shutting down gracefully...")
+        worker.request_stop(signum, frame)
+
+    signal.signal(signal.SIGTERM, signal_handler)
+    signal.signal(signal.SIGINT, signal_handler)

     logger.info("✅ RQ worker ready")

-    # This blocks until worker is stopped
-    worker.work(logging_level='INFO')
+    try:
+        # This blocks until worker is stopped
+        worker.work(logging_level='INFO')
+    except Exception as e:
+        logger.error(f"❌ Worker error: {e}", exc_info=True)
+        sys.exit(1)
```
+ sys.exit(1)🤖 Prompt for AI Agents
In backends/advanced/src/advanced_omi_backend/workers/rq_worker_entry.py around
lines 23 to 50, the startup sequence lacks Redis connection error handling and
signal-based graceful shutdown; wrap Redis.from_url and Worker construction in a
try/except that logs the exception with logger.error and exits with a non-zero
code on failure, and register SIGTERM and SIGINT handlers that trigger a clean
worker stop (or set a shutdown Event that calls worker.stop() / worker.close())
so the process can finish current jobs before exiting; ensure handlers are
idempotent, join/wait for worker shutdown if needed, and return/exit cleanly
after shutdown.
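Stripped of RQ specifics, the requested shutdown pattern is: register idempotent handlers that set a stop flag, finish the job in flight, then skip the rest. A stdlib-only sketch of that pattern (the toy `run_worker` loop stands in for RQ's `Worker.work` and is not the backend's actual code):

```python
import signal
import threading

stop_event = threading.Event()

def _signal_handler(signum, frame):
    # Idempotent: setting an Event that is already set is harmless.
    print(f"Received signal {signum}, shutting down gracefully...")
    stop_event.set()

def run_worker(jobs):
    """Run callables in order; stop between jobs once shutdown is requested."""
    signal.signal(signal.SIGTERM, _signal_handler)
    signal.signal(signal.SIGINT, _signal_handler)
    results = []
    for job in jobs:
        if stop_event.is_set():
            break  # the current job already finished; skip the rest
        results.append(job())
    return results

if __name__ == "__main__":
    print(run_worker([lambda: "job-1", lambda: "job-2"]))  # ['job-1', 'job-2']
```

Checking the flag only between jobs is what makes the shutdown "graceful": a signal never interrupts a job mid-way, it only prevents the next one from starting.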
## Project Overview

```diff
-Friend-Lite is at the core an AI-powered personal system - various devices, incuding but not limited to wearables from OMI can be used for at the very least audio capture, speaker specific transcription, memory extraction and retriaval.
+Friend-Lite is at the core an AI-powered personal system - various devices, including but not limited to wearables from OMI can be used for at the very least audio capture, speaker specific transcription, memory extraction and retrieval.
```
Hyphenation for clarity.

Use "speaker-specific" instead of "speaker specific".

```diff
-Friend-Lite is at the core an AI-powered personal system - various devices, including but not limited to wearables from OMI can be used for at the very least audio capture, speaker specific transcription, memory extraction and retrieval.
+Friend-Lite is at the core an AI-powered personal system - various devices, including but not limited to wearables from OMI, can be used for at the very least audio capture, speaker-specific transcription, memory extraction, and retrieval.
```

📝 Committable suggestion

> ‼️ IMPORTANT
> Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
🧰 Tools
🪛 LanguageTool
[grammar] ~7-~7: Use a hyphen to join words.
Context: ...at the very least audio capture, speaker specific transcription, memory extractio...
(QB_NEW_EN_HYPHEN)
🤖 Prompt for AI Agents
In CLAUDE.md around line 7, the phrase "speaker specific" should be hyphenated
for clarity; change it to "speaker-specific" so the sentence reads
"...speaker-specific transcription..." and ensure spacing and punctuation remain
correct.
```python
error_msg = "Invalid credentials"
try:
    error_data = response.json()
    error_msg = error_data.get('detail', error_msg)
except:
    pass
logger.error(f"Authentication failed: {error_msg}")
return None
```
Avoid bare except; preserve server error detail.

Narrow to JSON errors and keep context.

```diff
-    try:
-        error_data = response.json()
-        error_msg = error_data.get('detail', error_msg)
-    except:
-        pass
+    try:
+        error_data = response.json()
+        error_msg = error_data.get('detail', error_msg)
+    except ValueError as parse_err:
+        logger.debug("Non-JSON error response: %s", parse_err)
```

🧰 Tools
🪛 Ruff (0.14.1)
60-60: Do not use bare except
(E722)
60-61: try-except-pass detected, consider logging the exception
(S110)
🤖 Prompt for AI Agents
extras/local-omi-bt/send_to_adv.py around lines 56 to 63: the current bare
except hides JSON parsing errors and loses server error detail; change the
except to only catch JSON decode errors (e.g., ValueError or
json.JSONDecodeError), and on JSON parse failure preserve and use the raw
response text (and/or include response.status_code) as the error message so the
log contains the server response; also avoid swallowing the exception silently —
include the exception context in the log if available.
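The narrowed handler can be exercised without a live server. A minimal sketch, where the `extract_error_detail` helper is hypothetical (the real script works on a `requests` response object, whose `.json()` raises `ValueError` on non-JSON bodies):

```python
import json
import logging

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger(__name__)

def extract_error_detail(body, default="Invalid credentials"):
    """Pull 'detail' from a JSON error body, keeping the raw text when parsing fails."""
    try:
        data = json.loads(body)
        if isinstance(data, dict):
            return data.get("detail", default)
        return default
    except json.JSONDecodeError as parse_err:  # subclass of ValueError
        logger.debug("Non-JSON error response (%s); raw body: %r", parse_err, body)
        return f"{default}: {body}" if body else default

print(extract_error_detail('{"detail": "Bad password"}'))   # Bad password
print(extract_error_detail("<html>502 Bad Gateway</html>"))
```

Note the two improvements the review asks for: only JSON decoding errors are caught, and the raw server body is preserved in both the log and the returned message instead of being silently discarded.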
```python
token = await get_jwt_token(ADMIN_EMAIL, ADMIN_PASSWORD)
if not token:
    logger.error("Failed to get JWT token, cannot stream audio")
    return

# Connect with JWT token as query parameter
uri_with_token = f"{websocket_uri}?token={token}&device_name={DEVICE_NAME}"
```
Guard missing ADMIN_EMAIL/ADMIN_PASSWORD early.

Fail fast with a clear log before attempting auth.

```diff
-    # Get JWT token for authentication
-    token = await get_jwt_token(ADMIN_EMAIL, ADMIN_PASSWORD)
+    # Get JWT token for authentication
+    if not ADMIN_EMAIL or not ADMIN_PASSWORD:
+        logger.error("ADMIN_EMAIL/ADMIN_PASSWORD not set in environment; cannot authenticate")
+        return
+    token = await get_jwt_token(ADMIN_EMAIL, ADMIN_PASSWORD)
```
+ token = await get_jwt_token(ADMIN_EMAIL, ADMIN_PASSWORD)Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In extras/local-omi-bt/send_to_adv.py around lines 84 to 91, the code calls
get_jwt_token without first verifying ADMIN_EMAIL and ADMIN_PASSWORD are set;
add an early guard that checks these variables (e.g., environment or config
values) and if either is missing log a clear error like "ADMIN_EMAIL or
ADMIN_PASSWORD not set, cannot obtain JWT" and return/exit immediately before
attempting authentication; proceed to call get_jwt_token only when both are
present.
```diff
@@ -0,0 +1 @@
+uv run --with-requirements requirements.txt python connect-laptop.py
```
Add a shebang for portability.

The script lacks a shebang line, which can cause portability issues depending on how it's invoked.

Apply this diff:

```diff
+#!/bin/bash
 uv run --with-requirements requirements.txt python connect-laptop.py
```

📝 Committable suggestion

```bash
#!/bin/bash
uv run --with-requirements requirements.txt python connect-laptop.py
```
🧰 Tools
🪛 Shellcheck (0.11.0)
[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.
(SC2148)
🤖 Prompt for AI Agents
In extras/local-omi-bt/start.sh around lines 1 to 1, the script is missing a
shebang which can cause portability issues when executed; add a shebang as the
very first line (for example: use /usr/bin/env bash or /usr/bin/env sh depending
on shell features used) so the system knows which interpreter to invoke, leaving
the existing command line unchanged and ensure the file remains executable.
```python
        self.console.print()
        self.console.print("📋 [bold]HTTPS Mode URLs:[/bold]")
        self.console.print(f"   🌐 HTTPS Access: https://localhost:8444/")
        self.console.print(f"   🌐 HTTP Redirect: http://localhost:8081/ → HTTPS")
        self.console.print(f"   📱 Service API: https://localhost:8444/api/")
        self.console.print(f"   💡 Accept SSL certificate in browser")
    else:
```
Static localhost URLs + extraneous f-strings.

Use the chosen server_ip and drop useless f-prefixes.

```diff
-        self.console.print()
-        self.console.print("📋 [bold]HTTPS Mode URLs:[/bold]")
-        self.console.print(f"   🌐 HTTPS Access: https://localhost:8444/")
-        self.console.print(f"   🌐 HTTP Redirect: http://localhost:8081/ → HTTPS")
-        self.console.print(f"   📱 Service API: https://localhost:8444/api/")
-        self.console.print(f"   💡 Accept SSL certificate in browser")
+        self.console.print()
+        self.console.print("📋 [bold]HTTPS Mode URLs:[/bold]")
+        self.console.print(f"   🌐 HTTPS Access: https://{server_ip}:8444/")
+        self.console.print(f"   🌐 HTTP Redirect: http://{server_ip}:8081/ → HTTPS")
+        self.console.print(f"   📱 Service API: https://{server_ip}:8444/api/")
+        self.console.print("   💡 Accept SSL certificate in browser")
```

🧰 Tools
🪛 Ruff (0.14.1)
211-211: f-string without any placeholders
Remove extraneous f prefix
(F541)
212-212: f-string without any placeholders
Remove extraneous f prefix
(F541)
213-213: f-string without any placeholders
Remove extraneous f prefix
(F541)
214-214: f-string without any placeholders
Remove extraneous f prefix
(F541)
```python
        self.console.print(f"✅ HF Token: {'Configured' if self.config.get('HF_TOKEN') else 'Not configured'}")
        self.console.print(f"✅ Compute Mode: {self.config.get('COMPUTE_MODE', 'Not configured')}")
        self.console.print(f"✅ HTTPS Enabled: {self.config.get('REACT_UI_HTTPS', 'false')}")
        if self.config.get('DEEPGRAM_API_KEY'):
            self.console.print(f"✅ Deepgram API Key: Configured")
```
Extraneous f-string.

No placeholders; remove the f-prefix.

```diff
-        if self.config.get('DEEPGRAM_API_KEY'):
-            self.console.print(f"✅ Deepgram API Key: Configured")
+        if self.config.get('DEEPGRAM_API_KEY'):
+            self.console.print("✅ Deepgram API Key: Configured")
```

🧰 Tools
🪛 Ruff (0.14.1)
265-265: f-string without any placeholders
Remove extraneous f prefix
(F541)
🤖 Prompt for AI Agents
In extras/speaker-recognition/init.py around lines 261 to 266, the line
`self.console.print(f"✅ Deepgram API Key: Configured")` uses an f-string with no
placeholders; remove the `f` prefix so it becomes a plain string literal
(`self.console.print("✅ Deepgram API Key: Configured")`). Leave the neighboring
print calls unchanged; they genuinely interpolate values and need their f-strings.
Summary by CodeRabbit
New Features
Documentation
Improvements