Merged
10 changes: 9 additions & 1 deletion .gitignore
@@ -21,4 +21,12 @@ transcription_results.csv
**/hub/*
*.log
**/speaker_data/**
**/.venv/*
**/.venv/*
**metrics_report**

*.db
**/advanced_omi_backend.egg-info/
**/dist/*
**/build/*

untracked/*
169 changes: 169 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,169 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

Friend-Lite is an AI-powered wearable ecosystem for audio capture, transcription, memory extraction, and action item detection. The system features real-time audio streaming from OMI devices via Bluetooth, intelligent conversation processing, and a comprehensive web dashboard for management.

## Development Commands

### Backend Development (Advanced Backend - Primary)
```bash
cd backends/advanced-backend

# Start full stack with Docker
docker compose up --build -d

# Development with live reload
uv run python src/main.py

# Code formatting and linting
uv run black src/
uv run isort src/

# Run tests
uv run pytest
uv run pytest tests/test_memory_service.py # Single test file
uv run pytest test_endpoints.py # Integration tests
uv run pytest test_failure_recovery.py # Failure recovery tests
uv run pytest test_memory_debug.py # Memory debug tests

# Environment setup
cp .env.template .env # Configure environment variables

# Reset data (development)
sudo rm -rf ./audio_chunks/ ./mongo_data/ ./qdrant_data/
```

### Mobile App Development
```bash
cd friend-lite

# Start Expo development server
npm start

# Platform-specific builds
npm run android
npm run ios
npm run web
```

### Additional Services
```bash
# ASR Services
cd extras/asr-services
docker compose up moonshine # Offline ASR with Moonshine
docker compose up parakeet # Offline ASR with Parakeet

# Speaker Recognition
cd extras/speaker-recognition
docker compose up --build

# HAVPE Relay (ESP32 bridge)
cd extras/havpe-relay
docker compose up --build
```

## Architecture Overview

### Core Structure
- **backends/advanced-backend/**: Primary FastAPI backend with real-time audio processing
- `src/main.py`: Central FastAPI application with WebSocket audio streaming
- `src/auth.py`: Email-based authentication with JWT tokens
- `src/memory/`: LLM-powered conversation memory system using mem0
- `src/failure_recovery/`: Robust processing pipeline with SQLite tracking
- `webui/streamlit_app.py`: Web dashboard for conversation and user management

### Key Components
- **Audio Pipeline**: Real-time Opus/PCM → Deepgram WebSocket transcription → memory extraction
- **Transcription**: Deepgram Nova-3 model with Wyoming ASR fallback, auto-reconnection
- **Authentication**: Email-based login with MongoDB ObjectId user system
- **Client Management**: Auto-generated client IDs as `{user_id_suffix}-{device_name}`, centralized ClientManager
- **Data Storage**: MongoDB (conversations), Qdrant (vector memory), SQLite (failure recovery)
- **Web Interface**: Streamlit dashboard with authentication and real-time monitoring
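The `{user_id_suffix}-{device_name}` client ID format can be sketched as a small helper. This is a minimal illustration, not the backend's actual implementation; the suffix length and the name normalization are assumptions, since the docs only specify the overall format:

```python
def generate_client_id(user_id: str, device_name: str, suffix_len: int = 6) -> str:
    """Build a client ID as {user_id_suffix}-{device_name}.

    suffix_len is an illustrative assumption; the docs specify only
    the suffix-plus-device-name shape, not the suffix length.
    """
    suffix = user_id[-suffix_len:]
    # Normalize the device name so the ID stays URL- and log-friendly.
    safe_device = device_name.strip().lower().replace(" ", "-")
    return f"{suffix}-{safe_device}"


# A MongoDB ObjectId is a 24-character hex string, so the suffix is its tail.
print(generate_client_id("507f1f77bcf86cd799439011", "omi-pendant"))
```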

### Service Dependencies
```yaml
Required:
- MongoDB: User data and conversations
- FastAPI Backend: Core audio processing

Recommended:
- Qdrant: Vector storage for semantic memory
- Ollama: LLM for memory extraction and action items
- Deepgram: Primary transcription service (Nova-3 WebSocket)
- Wyoming ASR: Fallback transcription service (offline)

Optional:
- Speaker Recognition: Voice identification service
- Nginx Proxy: Load balancing and routing
```

## Data Flow Architecture

1. **Audio Ingestion**: OMI devices stream Opus audio via WebSocket with JWT auth
2. **Real-time Processing**: Per-client queues handle transcription and buffering
3. **Conversation Management**: Automatic timeout-based conversation segmentation
4. **Memory Extraction**: LLM processes completed conversations for semantic storage
5. **Action Items**: Automatic task detection with "Simon says" trigger phrases
6. **Audio Optimization**: Speech segment extraction removes silence automatically
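The per-client queueing behind steps 1–2 can be sketched with plain `asyncio`. This is a toy model, not the backend's code: `ClientPipeline`, the chunk names, and the `transcribed:` placeholder all stand in for the real Deepgram/Wyoming transcription path:

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class ClientPipeline:
    """One queue per connected client, drained by a worker task."""
    client_id: str
    queue: asyncio.Queue = field(default_factory=asyncio.Queue)
    transcripts: list = field(default_factory=list)

    async def worker(self):
        # Drain audio chunks until the producer signals end-of-stream with None.
        while (chunk := await self.queue.get()) is not None:
            # Placeholder for the real transcription call.
            self.transcripts.append(f"transcribed:{chunk}")


async def demo():
    pipeline = ClientPipeline(client_id="439011-omi-pendant")
    task = asyncio.create_task(pipeline.worker())
    for chunk in ("opus-frame-1", "opus-frame-2"):
        await pipeline.queue.put(chunk)
    await pipeline.queue.put(None)  # end of stream
    await task
    return pipeline.transcripts


print(asyncio.run(demo()))
```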

## Authentication & Security

- **User System**: Email-based authentication with MongoDB ObjectId user IDs
- **Client Registration**: Automatic `{objectid_suffix}-{device_name}` format
- **Data Isolation**: All data scoped by user_id with efficient permission checking
- **API Security**: JWT tokens required for all endpoints and WebSocket connections
- **Admin Bootstrap**: Automatic admin account creation with ADMIN_EMAIL/ADMIN_PASSWORD

## Configuration

### Required Environment Variables
```bash
AUTH_SECRET_KEY=your-super-secret-jwt-key-here
ADMIN_PASSWORD=your-secure-admin-password
ADMIN_EMAIL=admin@example.com
```
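Since these three variables are hard requirements, a fail-fast startup check is a natural pattern. A minimal sketch (the function name is hypothetical; only the variable names come from the source):

```python
REQUIRED_VARS = ("AUTH_SECRET_KEY", "ADMIN_PASSWORD", "ADMIN_EMAIL")


def check_required_env(env: dict) -> list[str]:
    """Return the names of required variables that are missing or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


# In real code you would pass os.environ; here a dict keeps the demo pure.
missing = check_required_env({"AUTH_SECRET_KEY": "s3cret", "ADMIN_EMAIL": "admin@example.com"})
print(missing)  # ADMIN_PASSWORD is absent
```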

### Optional Service Configuration
```bash
# Transcription (Deepgram primary, Wyoming fallback)
DEEPGRAM_API_KEY=your-deepgram-key-here
OFFLINE_ASR_TCP_URI=tcp://host.docker.internal:8765

# LLM Processing
OLLAMA_BASE_URL=http://ollama:11434

# Vector Storage
QDRANT_BASE_URL=qdrant

# Speaker Recognition
SPEAKER_SERVICE_URL=http://speaker-recognition:8001
```

## Development Notes

### Package Management
- **Backend**: Uses `uv` for Python dependency management (faster than pip)
- **Mobile**: Uses `npm` with React Native and Expo
- **Docker**: Primary deployment method with docker-compose

### Testing Strategy
- **Integration Tests**: `test_endpoints.py` covers API functionality
- **Unit Tests**: Individual service tests in `tests/` directory
- **System Tests**: `test_failure_recovery.py` and `test_memory_debug.py`

### Code Style
- **Python**: Black formatter with 100-character line length, isort for imports
- **TypeScript**: Standard React Native conventions

### Health Monitoring
The system includes comprehensive health checks:
- `/readiness`: Service dependency validation
- `/health`: Basic application status
- Failure recovery system with SQLite tracking
- Memory debug system for transcript processing monitoring
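The `/readiness` dependency validation above amounts to running a probe per service and aggregating the results. A hedged sketch of that aggregation (probe names and the report shape are illustrative assumptions, detached from any web framework):

```python
import asyncio


async def readiness(checks: dict) -> dict:
    """Aggregate async dependency probes into a single readiness report.

    checks maps a service name to an async probe returning True/False.
    """
    results = {name: await probe() for name, probe in checks.items()}
    return {"ready": all(results.values()), "services": results}


async def demo():
    async def mongo_ok():
        return True  # stands in for a successful MongoDB ping

    async def qdrant_down():
        return False  # stands in for a failed Qdrant health probe

    return await readiness({"mongo": mongo_ok, "qdrant": qdrant_down})


print(asyncio.run(demo()))
```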

### Cursor Rule Integration
The project includes `.cursor/rules/always-plan-first.mdc`, which requires understanding the task before writing code. Always explain the task and confirm the approach before implementation.
1 change: 1 addition & 0 deletions README.md
@@ -109,6 +109,7 @@ Choose one based on your needs:
- Requires multiple services (MongoDB, Qdrant, Ollama)
- Higher resource requirements
- Steeper learning curve
- Authentication setup required

---

4 changes: 3 additions & 1 deletion backends/advanced-backend/.dockerignore
@@ -5,4 +5,6 @@
!pyproject.toml
!pyproject.blackwell.toml
!README.md
!src
!src
!.env
!memory_config.yaml
76 changes: 70 additions & 6 deletions backends/advanced-backend/.env.template
@@ -1,6 +1,70 @@
OFFLINE_ASR_TCP_URI=
OLLAMA_BASE_URL=
NGROK_AUTHTOKEN=
HF_TOKEN=
SPEAKER_SERVICE_URL=
MONGODB_URI=
# This key is used to sign your JWT tokens; make it long and random
AUTH_SECRET_KEY=

# This is the password for the admin user
ADMIN_PASSWORD=

# Admin email (defaults to admin@example.com if not set)
ADMIN_EMAIL=admin@example.com

# ========================================
# LLM CONFIGURATION (Choose one)
# ========================================

# LLM Provider: "openai" or "ollama" (default: ollama)
LLM_PROVIDER=openai

# For OpenAI (recommended for best memory extraction)
OPENAI_API_KEY=
OPENAI_MODEL=gpt-4o

# For Ollama (local LLM)
OLLAMA_BASE_URL=http://ollama:11434
# OLLAMA_MODEL=gemma3n:e4b

# ========================================
# SPEECH-TO-TEXT CONFIGURATION (Choose one)
# ========================================

# Option 1: Deepgram (recommended for best transcription quality)
DEEPGRAM_API_KEY=

# Option 2: Local ASR service from extras/asr-services
# OFFLINE_ASR_TCP_URI=tcp://localhost:8765

# ========================================
# DATABASE CONFIGURATION
# ========================================

# MongoDB for conversations and user data (defaults to mongodb://mongo:27017)
MONGODB_URI=mongodb://mongo:27017

# Qdrant for vector memory storage (defaults to qdrant)
QDRANT_BASE_URL=qdrant

# ========================================
# OPTIONAL FEATURES
# ========================================

# Debug directory for troubleshooting
DEBUG_DIR=./debug_dir

# Ngrok for external access (if using ngrok from docker-compose)
# NGROK_AUTHTOKEN=

# Speaker recognition service
# HF_TOKEN=
# SPEAKER_SERVICE_URL=http://speaker-recognition:8001

# Audio processing settings
# NEW_CONVERSATION_TIMEOUT_MINUTES=1.5
# AUDIO_CROPPING_ENABLED=true
# MIN_SPEECH_SEGMENT_DURATION=1.0
# CROPPING_CONTEXT_PADDING=0.1

# Server settings
# HOST=0.0.0.0
# PORT=8000

# Memory settings
# MEM0_TELEMETRY=False
18 changes: 11 additions & 7 deletions backends/advanced-backend/Dockerfile
@@ -16,17 +16,21 @@ COPY --from=ghcr.io/astral-sh/uv:0.6.10 /uv /uvx /bin/
# Set up the working directory
WORKDIR /app

# Copy dependency files
COPY pyproject.toml .
# Copy package structure and dependency files first
COPY pyproject.toml README.md ./
RUN mkdir -p src/advanced_omi_backend
COPY src/advanced_omi_backend/__init__.py src/advanced_omi_backend/

# Install dependencies using uv
# Install dependencies using uv with deepgram extra
RUN --mount=type=cache,target=/root/.cache/uv \
uv sync
uv sync --extra deepgram


# Copy application code
# Copy all application code
COPY . .

# Copy memory config to the expected location
COPY memory_config.yaml src/


# Run the application
CMD ["uv", "run", "python3", "src/main.py"]
CMD ["uv", "run", "python3", "src/advanced_omi_backend/main.py"]
1 change: 1 addition & 0 deletions backends/advanced-backend/Dockerfile.blackwell
@@ -18,6 +18,7 @@ WORKDIR /app

# Copy dependency files
COPY pyproject.blackwell.toml pyproject.toml
COPY README.md .

# Install dependencies using uv
RUN --mount=type=cache,target=/root/.cache/uv \
24 changes: 24 additions & 0 deletions backends/advanced-backend/Dockerfile.webui
@@ -0,0 +1,24 @@
FROM python:3.11-slim

WORKDIR /app

# Install uv
COPY --from=ghcr.io/astral-sh/uv:0.6.10 /uv /uvx /bin/

# Copy pyproject.toml and README.md from the current directory (advanced-backend)
COPY pyproject.toml README.md ./

# Copy the entire src directory to make advanced_omi_backend package available
RUN mkdir -p src/advanced_omi_backend
COPY src/advanced_omi_backend/__init__.py src/advanced_omi_backend/

# Install dependencies using uv with webui extra
RUN --mount=type=cache,target=/root/.cache/uv \
uv sync --extra webui

# Set PYTHONPATH so imports work
COPY src/ /app/src/
ENV PYTHONPATH=/app/src

CMD ["uv", "run", "streamlit", "run", "src/webui/streamlit_app.py", \
"--server.address=0.0.0.0", "--server.port=8501"]