Merged
29 commits
6732501
clean ruff memory
0xrushi Aug 20, 2025
92c6e17
notebooks and memory service
0xrushi Aug 23, 2025
34f1326
Merge branch 'main' into fix/memoryservice
0xrushi Aug 23, 2025
62478ca
readme changes
0xrushi Aug 23, 2025
4203f58
notebook updates
0xrushi Aug 23, 2025
c8d6f7c
Merge branch 'fix/memoryservice' of github.com:0xrushi/friend-lite in…
0xrushi Aug 23, 2025
4188634
update prompt fix
0xrushi Aug 24, 2025
6cd02fb
gpt4o
0xrushi Aug 24, 2025
c5680be
new line
0xrushi Aug 24, 2025
fb6d4f7
restore memory config
0xrushi Aug 24, 2025
cd14630
restore memory config from main
0xrushi Aug 24, 2025
3fef05b
config update
0xrushi Aug 24, 2025
2d4370b
make the langfuse port 3002, because tests run on 3001
0xrushi Aug 24, 2025
c7d0dce
port and template fix
0xrushi Aug 25, 2025
fc89099
Merge branch 'main' into 0xrushi-fix-memoryservice
AnkushMalaker Aug 25, 2025
160aa6a
surface errors and fix bug
AnkushMalaker Aug 25, 2025
2fe095f
fix test and init file
AnkushMalaker Aug 25, 2025
cb50a25
bug fix
AnkushMalaker Aug 25, 2025
a7560b6
fix tiny bug for offline asr
AnkushMalaker Aug 25, 2025
6f59aae
remove deadcode
AnkushMalaker Aug 25, 2025
a3ccc38
open memory mcp
AnkushMalaker Aug 25, 2025
38d8033
Add open memory integration in extras and advanced backend
AnkushMalaker Aug 25, 2025
7751cc0
update please fix this
AnkushMalaker Aug 26, 2025
88fa639
update
AnkushMalaker Aug 26, 2025
44635fe
trigger review
AnkushMalaker Aug 26, 2025
67f72d3
proper end to end tests running
AnkushMalaker Aug 28, 2025
d83c5a0
final test working?
AnkushMalaker Aug 28, 2025
735aef8
cleanup, comment out chat test
AnkushMalaker Aug 28, 2025
bd9d017
fix final
AnkushMalaker Aug 28, 2025
8 changes: 7 additions & 1 deletion .gitignore
@@ -2,6 +2,8 @@
*.wav
**/*.env
!**/.env.template
**/memory_config.yaml
!**/memory_config.yaml.template
Comment on lines +5 to +6
🛠️ Refactor suggestion

⚠️ Potential issue

Repo-wide ignore of memory_config.yaml conflicts with Dockerfile COPY.

Ignoring all memory_config.yaml means it won’t exist in the build context by default. Since backends/advanced/Dockerfile tries to COPY it, builds can fail unless pre-generated. Either:

  • Keep the ignore and update the Dockerfile to derive from memory_config.yaml.template (recommended; see suggested Dockerfile fix), or
  • Stop ignoring the specific path that the Dockerfile copies.

If you prefer the latter, narrow the ignore rule and allow the advanced backend path through:

-**/memory_config.yaml
+**/memory_config.yaml
+!backends/advanced/memory_config.yaml

But the Dockerfile-side fix is cleaner and avoids committing env-specific configs.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-**/memory_config.yaml
-!**/memory_config.yaml.template
+**/memory_config.yaml
+!backends/advanced/memory_config.yaml
+!**/memory_config.yaml.template
🤖 Prompt for AI Agents
In .gitignore around lines 5-6, the repo-wide ignore of memory_config.yaml
conflicts with backends/advanced/Dockerfile which COPYs that file; either update
the Dockerfile to copy the template (COPY memory_config.yaml.template
memory_config.yaml) so builds use the committed template, or narrow the ignore
and un-ignore the exact path the Dockerfile expects (for example un-ignore
backends/advanced/memory_config.yaml) so the file exists in the build context;
pick one approach, update the Dockerfile or .gitignore accordingly, and ensure
the template file is present in the repo.

example/*
**/node_modules/*
**/ollama-data/*
@@ -58,4 +60,8 @@ extras/speaker-recognition/outputs/*
# my backup
backends/advanced/src/_webui_original/*
backends/advanced-backend/data/neo4j_data/*
backends/advanced-backend/data/speaker_model_cache/
backends/advanced-backend/data/speaker_model_cache/

Comment on lines +63 to +64
💡 Verification agent

🧩 Analysis chain

Verify path correctness: “backends/advanced-backend” may be a typo.

Most paths in this repo use backends/advanced. Check whether backends/advanced-backend/* actually exists; if not, this ignore entry does nothing and could indicate drift or a rename.

Run this to confirm:


🏁 Script executed:

#!/bin/bash
fd . -td -a | rg -nP '^backends/(advanced|advanced-backend)\b' -n

Length of output: 67


To ensure we’re accurately detecting whether the directory exists, let’s explicitly list and filter the subdirectories under backends:


🏁 Script executed:

#!/bin/bash
# List all first-level directories under backends
echo "All backends/* directories:"
find backends -maxdepth 1 -type d

# Strip the leading path and filter for ‘advanced’ or ‘advanced-backend’
echo ""
echo "Filtering for ‘advanced’ or ‘advanced-backend’ exactly:"
find backends -maxdepth 1 -type d | sed 's|.*/||' | grep -E '^advanced(-backend)?$' || echo "→ None found"

Length of output: 394


Correct stale .gitignore entry
The ignore path backends/advanced-backend/data/speaker_model_cache/ does not match any existing directory (only backends/advanced exists under backends/), so this entry is effectively a no-op and was likely intended to cover advanced/data/speaker_model_cache/.

• File: .gitignore, lines 63–64
• Action: either remove the unused advanced-backend entry or update it to the correct path, for example:

- backends/advanced-backend/data/speaker_model_cache/
+ backends/advanced/data/speaker_model_cache/

This ensures speaker_model_cache remains properly ignored.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-backends/advanced-backend/data/speaker_model_cache/
+backends/advanced/data/speaker_model_cache/
🤖 Prompt for AI Agents
.gitignore around lines 63-64: the entry
"backends/advanced-backend/data/speaker_model_cache/" is stale and does not
match any existing path (actual directory is under backends/advanced). Fix by
either removing this unused line or updating it to the correct path (e.g.,
"backends/advanced/data/speaker_model_cache/") so the speaker_model_cache
directory is properly ignored; commit the change to .gitignore accordingly.

*.bin
*.sqlite3
*checkpoints
267 changes: 249 additions & 18 deletions CLAUDE.md
@@ -144,14 +144,15 @@ docker compose up --build
- `webui/`: React-based web dashboard for conversation and user management

### Key Components
- **Audio Pipeline**: Real-time Opus/PCM → Application-level processing → Deepgram transcription → memory extraction
- **Audio Pipeline**: Real-time Opus/PCM → Application-level processing → Deepgram/Mistral transcription → memory extraction
- **Wyoming Protocol**: WebSocket communication uses Wyoming protocol (JSONL + binary) for structured audio sessions
- **Application-Level Processing**: Centralized processors for audio, transcription, memory, and cropping
- **Task Management**: BackgroundTaskManager tracks all async tasks to prevent orphaned processes
- **Unified Transcription**: Deepgram transcription with fallback to offline ASR services
- **Unified Transcription**: Deepgram/Mistral transcription with fallback to offline ASR services
- **Memory System**: Pluggable providers (Friend-Lite native or OpenMemory MCP)
- **Authentication**: Email-based login with MongoDB ObjectId user system
- **Client Management**: Auto-generated client IDs as `{user_id_suffix}-{device_name}`, centralized ClientManager
- **Data Storage**: MongoDB (`audio_chunks` collection for conversations), Qdrant (vector memory)
- **Data Storage**: MongoDB (`audio_chunks` collection for conversations), vector storage (Qdrant or OpenMemory)
- **Web Interface**: React-based web dashboard with authentication and real-time monitoring

### Service Dependencies
@@ -162,13 +163,14 @@ Required:
- LLM Service: Memory extraction and action items (OpenAI or Ollama)

Recommended:
- Qdrant: Vector storage for semantic memory
- Deepgram: Primary transcription service (Nova-3 WebSocket)
- Vector Storage: Qdrant (Friend-Lite provider) or OpenMemory MCP server
- Transcription: Deepgram, Mistral, or offline ASR services

Optional:
- Parakeet ASR: Offline transcription service
- Speaker Recognition: Voice identification service
- Nginx Proxy: Load balancing and routing
- OpenMemory MCP: For cross-client memory compatibility
```

## Data Flow Architecture
@@ -178,10 +180,11 @@ Optional:
3. **Application-Level Processing**: Global queues and processors handle all audio/transcription/memory tasks
4. **Conversation Storage**: Transcripts saved to MongoDB `audio_chunks` collection with segments array
5. **Conversation Management**: Session-based conversation segmentation using Wyoming protocol events
6. **Memory Extraction**: Background LLM processing (decoupled from conversation storage)
7. **Action Items**: Automatic task detection with "Simon says" trigger phrases
8. **Audio Optimization**: Speech segment extraction removes silence automatically
9. **Task Tracking**: BackgroundTaskManager ensures proper cleanup of all async operations
6. **Memory Processing**: Pluggable providers (Friend-Lite native with individual facts or OpenMemory MCP delegation)
7. **Memory Storage**: Direct Qdrant (Friend-Lite) or OpenMemory server (MCP provider)
8. **Action Items**: Automatic task detection with "Simon says" trigger phrases
9. **Audio Optimization**: Speech segment extraction removes silence automatically
10. **Task Tracking**: BackgroundTaskManager ensures proper cleanup of all async operations

### Database Schema Details
- **Conversations**: Stored in `audio_chunks` collection (not `conversations`)
@@ -210,13 +213,16 @@ ADMIN_EMAIL=admin@example.com
LLM_PROVIDER=openai # or ollama
OPENAI_API_KEY=your-openai-key-here
OPENAI_BASE_URL=https://api.openai.com/v1
OPENAI_MODEL=gpt-4o
OPENAI_MODEL=gpt-4o-mini

# Speech-to-Text
DEEPGRAM_API_KEY=your-deepgram-key-here
# Optional: PARAKEET_ASR_URL=http://host.docker.internal:8767
# Optional: TRANSCRIPTION_PROVIDER=deepgram

# Memory Provider (New)
MEMORY_PROVIDER=friend_lite # or openmemory_mcp

# Database
MONGODB_URI=mongodb://mongo:27017
QDRANT_BASE_URL=qdrant
@@ -228,16 +234,136 @@ WEBUI_PORT=5173
CORS_ORIGINS=http://localhost:3000,http://localhost:5173
```

### Transcription Provider Configuration
### Memory Provider Configuration

Friend-Lite now supports two pluggable memory backends:

#### Friend-Lite Memory Provider (Default)
```bash
# Primary transcription provider
DEEPGRAM_API_KEY=your-deepgram-key-here # Primary transcription service
# Use Friend-Lite memory provider (default)
MEMORY_PROVIDER=friend_lite

# LLM Processing
OLLAMA_BASE_URL=http://ollama:11434
# LLM Configuration for memory extraction
LLM_PROVIDER=openai
OPENAI_API_KEY=your-openai-key-here
OPENAI_MODEL=gpt-4o-mini

# Vector Storage
QDRANT_BASE_URL=qdrant
```

#### OpenMemory MCP Provider
```bash
# Use OpenMemory MCP provider
MEMORY_PROVIDER=openmemory_mcp

# OpenMemory MCP Server Configuration
OPENMEMORY_MCP_URL=http://host.docker.internal:8765
OPENMEMORY_CLIENT_NAME=friend_lite
OPENMEMORY_USER_ID=openmemory
OPENMEMORY_TIMEOUT=30

# OpenAI key for OpenMemory server
OPENAI_API_KEY=your-openai-key-here
```

#### OpenMemory MCP Interface Patterns

**Important**: OpenMemory MCP stores memories **per-app**, not globally. Understanding this architecture is critical for proper integration.

**App-Based Storage Architecture:**
- All memories are stored under specific "apps" (namespaces)
- Generic endpoints (`/api/v1/memories/`) return empty results
- App-specific endpoints (`/api/v1/apps/{app_id}/memories`) contain the actual memories

**Hardcoded Values and Configuration:**
```bash
# Default app name (configurable via OPENMEMORY_CLIENT_NAME)
Default: "friend_lite"

# Hardcoded metadata (NOT configurable)
"source": "friend_lite" # Always hardcoded in Friend-Lite

# User ID for OpenMemory MCP server
OPENMEMORY_USER_ID=openmemory # Configurable
```

**API Interface Pattern:**
```python
# 1. App Discovery - Find app by client_name
GET /api/v1/apps/
# Response: {"apps": [{"id": "uuid", "name": "friend_lite", ...}]}

# 2. Memory Creation - Uses generic endpoint but assigns to app
POST /api/v1/memories/
{
"user_id": "openmemory",
"text": "memory content",
"app": "friend_lite", # Uses OPENMEMORY_CLIENT_NAME
"metadata": {
"source": "friend_lite", # Hardcoded
"client": "friend_lite" # Uses OPENMEMORY_CLIENT_NAME
}
}

# 3. Memory Retrieval - Must use app-specific endpoint
GET /api/v1/apps/{app_id}/memories?user_id=openmemory&page=1&size=10

# 4. Memory Search - Must use app-specific endpoint with search_query
GET /api/v1/apps/{app_id}/memories?user_id=openmemory&search_query=keyword&page=1&size=10
```

**Friend-Lite Integration Flow:**
1. **App Discovery**: Query `/api/v1/apps/` to find app matching `OPENMEMORY_CLIENT_NAME`
2. **Fallback**: If client app not found, use first available app
3. **Operations**: All memory operations use the app-specific endpoints with discovered `app_id`
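The discovery-plus-fallback logic in steps 1–2 can be sketched as a small pure function. This is a sketch only: the function name is hypothetical, and the response shape follows the `/api/v1/apps/` example above.

```python
def pick_app_id(apps_response: dict, client_name: str = "friend_lite"):
    """Return the id of the app whose name matches client_name.

    Falls back to the first available app (step 2), or None if the
    OpenMemory server has no apps registered yet.
    """
    apps = apps_response.get("apps", [])
    for app in apps:
        if app.get("name") == client_name:
            return app["id"]
    return apps[0]["id"] if apps else None
```

All subsequent memory operations would then target `/api/v1/apps/{app_id}/memories` with the returned id.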

**Testing OpenMemory MCP Integration:**
```bash
# Configure .env file with OpenMemory MCP settings
cp .env.template .env
# Edit .env to set MEMORY_PROVIDER=openmemory_mcp and configure OPENMEMORY_* variables

# Start OpenMemory MCP server
cd extras/openmemory-mcp && docker compose up -d

# Run integration tests (reads configuration from .env file)
cd backends/advanced && ./run-test.sh

# Manual testing - Check app structure
curl -s "http://localhost:8765/api/v1/apps/" | jq

# Test memory creation
curl -X POST "http://localhost:8765/api/v1/memories/" \
-H "Content-Type: application/json" \
-d '{"user_id": "openmemory", "text": "test memory", "app": "friend_lite"}'

# Retrieve memories (replace app_id with actual ID from apps endpoint)
curl -s "http://localhost:8765/api/v1/apps/{app_id}/memories?user_id=openmemory" | jq
```

### Transcription Provider Configuration

Friend-Lite supports multiple transcription services:

```bash
# Option 1: Deepgram (High quality, recommended)
TRANSCRIPTION_PROVIDER=deepgram
DEEPGRAM_API_KEY=your-deepgram-key-here

# Option 2: Mistral (Voxtral models)
TRANSCRIPTION_PROVIDER=mistral
MISTRAL_API_KEY=your-mistral-key-here
MISTRAL_MODEL=voxtral-mini-2507

# Option 3: Local ASR (Parakeet)
PARAKEET_ASR_URL=http://host.docker.internal:8767
```
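A plausible resolution order for these variables, sketched as a helper. The precedence shown (explicit setting first, then whichever credential is present) is an assumption for illustration, not the backend's exact logic.

```python
def resolve_transcription_provider(env: dict) -> str:
    """Pick a provider: an explicit TRANSCRIPTION_PROVIDER wins, otherwise
    infer from which credentials/URLs are configured (illustrative only)."""
    explicit = env.get("TRANSCRIPTION_PROVIDER")
    if explicit:
        return explicit
    if env.get("DEEPGRAM_API_KEY"):
        return "deepgram"
    if env.get("MISTRAL_API_KEY"):
        return "mistral"
    if env.get("PARAKEET_ASR_URL"):
        return "parakeet"
    raise RuntimeError("No transcription provider configured")
```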

### Additional Service Configuration
```bash
# LLM Processing
OLLAMA_BASE_URL=http://ollama:11434

# Speaker Recognition
SPEAKER_SERVICE_URL=http://speaker-recognition:8001
@@ -246,10 +372,11 @@ SPEAKER_SERVICE_URL=http://speaker-recognition:8001
## Transcription Architecture

### Provider System
Friend-Lite uses Deepgram as the primary transcription provider with support for offline ASR services:
Friend-Lite supports multiple transcription providers:

**Online Provider (API-based):**
- **Deepgram**: Primary transcription service using Nova-3 model with real-time streaming
**Online Providers (API-based):**
- **Deepgram**: High-quality transcription using Nova-3 model with real-time streaming
- **Mistral**: Voxtral models for transcription with REST API processing

**Offline Providers (Local processing):**
- **Parakeet**: Local speech recognition service available in extras/asr-services
@@ -341,6 +468,110 @@ websocket.send(JSON.stringify(audioStop) + '\n');
- **Future Extensibility**: Room for additional event types (pause, resume, metadata)
- **Backward Compatibility**: Works with existing raw audio streaming clients

## Memory System Architecture

### Overview
Friend-Lite supports two pluggable memory backends that can be selected via configuration:

#### 1. Friend-Lite Memory Provider (`friend_lite`)
The sophisticated in-house memory implementation with full control and customization:

**Features:**
- Custom LLM-powered memory extraction with enhanced prompts
- Individual fact storage (no JSON blobs)
- Smart deduplication algorithms
- Intelligent memory updates (ADD/UPDATE/DELETE decisions)
- Direct Qdrant vector storage
- Custom memory prompts and processing
- No external dependencies

**Architecture Flow:**
1. **Audio Input** → Transcription via Deepgram/Parakeet
2. **Memory Extraction** → LLM processes transcript using custom prompts
3. **Fact Parsing** → XML/JSON parsing into individual memory entries
4. **Deduplication** → Smart algorithms prevent duplicate memories
5. **Vector Storage** → Direct Qdrant storage with embeddings
6. **Memory Updates** → LLM-driven action proposals (ADD/UPDATE/DELETE)
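Step 6 can be illustrated with a minimal action applier. The `{"event", "id", "text"}` action schema here is assumed for illustration; Friend-Lite's real proposal format may differ.

```python
def apply_memory_actions(store: dict, actions: list) -> dict:
    """Apply LLM-proposed ADD/UPDATE/DELETE actions to a memory store,
    keyed by memory id (illustrative schema, not the exact internal one)."""
    for action in actions:
        event = action["event"]
        if event == "ADD":
            store[action["id"]] = action["text"]
        elif event == "UPDATE" and action["id"] in store:
            store[action["id"]] = action["text"]
        elif event == "DELETE":
            # Deleting an unknown id is a no-op rather than an error
            store.pop(action["id"], None)
    return store
```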

#### 2. OpenMemory MCP Provider (`openmemory_mcp`)
A thin client that delegates all memory processing to an external OpenMemory MCP server:

**Features:**
- Professional memory extraction (handled by OpenMemory)
- Battle-tested deduplication (handled by OpenMemory)
- Semantic vector search (handled by OpenMemory)
- ACL-based user isolation (handled by OpenMemory)
- Cross-client compatibility (Claude Desktop, Cursor, Windsurf)
- Web UI for memory management at http://localhost:8765

**Architecture Flow:**
1. **Audio Input** → Transcription via Deepgram/Parakeet
2. **MCP Delegation** → Send enriched transcript to OpenMemory MCP server
3. **External Processing** → OpenMemory handles extraction, deduplication, storage
4. **Result Mapping** → Convert MCP results to Friend-Lite MemoryEntry format
5. **Client Management** → Automatic user context switching via MCP client
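Step 4 (result mapping) might look like the sketch below. The field names on both the OpenMemory and MemoryEntry sides are assumptions, since the exact schemas are not shown here.

```python
def map_mcp_results(mcp_memories: list, user_id: str) -> list:
    """Convert OpenMemory-style records into MemoryEntry-shaped dicts.

    Tries "content" first, then "text", because the field name returned
    by the server is an assumption in this sketch.
    """
    entries = []
    for mem in mcp_memories:
        entries.append({
            "id": mem.get("id"),
            "memory": mem.get("content") or mem.get("text", ""),
            "user_id": user_id,
            "metadata": mem.get("metadata", {}),
        })
    return entries
```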

### Memory Provider Comparison

| Feature | Friend-Lite | OpenMemory MCP |
|---------|-------------|----------------|
| **Processing** | Custom LLM extraction | Delegates to OpenMemory |
| **Deduplication** | Custom algorithms | OpenMemory handles |
| **Vector Storage** | Direct Qdrant | OpenMemory handles |
| **Dependencies** | Qdrant + MongoDB | External OpenMemory server |
| **Customization** | Full control | Limited to OpenMemory features |
| **Cross-client** | Friend-Lite only | Works with Claude Desktop, Cursor, etc. |
| **Web UI** | Friend-Lite WebUI | OpenMemory UI + Friend-Lite WebUI |
| **Memory Format** | Individual facts | OpenMemory format |
| **Setup Complexity** | Medium | High (external server required) |

### Switching Memory Providers

You can switch providers by changing the `MEMORY_PROVIDER` environment variable:

```bash
# Switch to OpenMemory MCP (note: appending creates a duplicate key;
# editing the existing MEMORY_PROVIDER line in .env is cleaner)
echo "MEMORY_PROVIDER=openmemory_mcp" >> .env

# Switch back to Friend-Lite
echo "MEMORY_PROVIDER=friend_lite" >> .env
```

**Note:** Existing memories are not automatically migrated between providers. Each provider maintains its own memory storage.

### OpenMemory MCP Setup

To use the OpenMemory MCP provider:

```bash
# 1. Start external OpenMemory MCP server
cd extras/openmemory-mcp
docker compose up -d

# 2. Configure Friend-Lite to use OpenMemory MCP
cd backends/advanced
echo "MEMORY_PROVIDER=openmemory_mcp" >> .env

# 3. Start Friend-Lite backend
docker compose up --build -d
```

### When to Use Each Provider

**Use Friend-Lite when:**
- You want full control over memory processing
- You need custom memory extraction logic
- You prefer fewer external dependencies
- You want to customize memory prompts and algorithms
- You need individual fact-based memory storage

**Use OpenMemory MCP when:**
- You want professional, battle-tested memory processing
- You need cross-client compatibility (Claude Desktop, Cursor, etc.)
- You prefer to leverage external expertise rather than maintain custom logic
- You want access to OpenMemory's web interface
- You're already using OpenMemory in other tools

## Development Notes

### Package Management