Skip to content

Conversation

@loyalpartner
Copy link

@loyalpartner loyalpartner commented Dec 15, 2025

Summary

  • Fixed export-memories.ts - was failing to export memories with project filter
  • Fixed import-memories.ts - refactored to use API for consistency
  • Added JSON format support to SearchManager for raw data output
  • Added batch SDK sessions API endpoint (POST /api/sdk-sessions/batch)
  • Added import API endpoint (POST /api/import) with duplicate detection

Problem

The export/import scripts were using direct database access via better-sqlite3, which caused issues:

  • Export script returned zero results when filtering by project (empty query bug)
  • SearchManager always returned formatted text instead of raw JSON data

Solution

Refactored both scripts to use the worker HTTP API, which:

  1. Properly handles project filtering
  2. Returns raw JSON data for export
  3. Provides consistent behavior with the rest of the plugin

Test plan

  • Test export script with project filter - now returns correct results
  • Test import script with test data
  • Verify duplicate detection works (re-importing skips existing records)

🤖 Generated with Claude Code

Export script fix:
- Add format=json parameter to SearchManager for raw data output
- Add getSdkSessionsBySessionIds method to SessionStore
- Add POST /api/sdk-sessions/batch endpoint to DataRoutes
- Refactor export-memories.ts to use HTTP API

Import script fix:
- Add import methods to SessionStore with duplicate detection
- Add POST /api/import endpoint to DataRoutes
- Refactor import-memories.ts to use HTTP API

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@loyalpartner
Copy link
Author

@claude review it

@thedotmack
Copy link
Owner

@claude review

@claude
Copy link

claude bot commented Dec 15, 2025

Claude encountered an error —— View job


I'll analyze this and get back to you.

@thedotmack
Copy link
Owner

@claude review

@claude
Copy link

claude bot commented Dec 15, 2025

Claude finished @thedotmack's task in 1m 41s —— View job


Code Review Complete

I've reviewed the PR and have several observations and suggestions:


Overall Assessment

Strengths:

  • ✅ Good refactor to use API instead of direct database access for consistency
  • ✅ Duplicate detection in import script is well-implemented
  • ✅ JSON format support in SearchManager adds flexibility
  • ✅ New batch API endpoints follow RESTful patterns

Concerns:

  1. ⚠️ Mixed approach - export script still uses direct DB access for sdk_sessions table
  2. ⚠️ Missing API endpoint - no /api/sdk-sessions/batch endpoint implemented in DataRoutes.ts
  3. ⚠️ Import script regression - now requires worker running (was standalone before)
  4. ⚠️ Error handling - import script has reduced error context
  5. ⚠️ Format parameter - undocumented SearchManager feature

Detailed Findings

1. Export Script (scripts/export-memories.ts)

Issues:

Critical: Hybrid Approach Inconsistency (lines 131-158)

// Uses API for observations/summaries/prompts
const searchResponse = await fetch(`${baseUrl}/api/search?${params.toString()}`);

// But then switches to direct DB access for sdk_sessions
const Database = (await import('better-sqlite3')).default;
const db = new Database(dbPath, { readonly: true });

Why this is problematic:

  • Defeats the purpose of the refactor (using API for consistency)
  • Mixes two different data access patterns in the same script
  • Still has better-sqlite3 dependency

Recommendation:
Create /api/sdk-sessions/batch endpoint (mentioned in PR description but not implemented) and use it here:

// After getting sdkSessionIds from search results
const sessionsResponse = await fetch(`${baseUrl}/api/sdk-sessions/batch`, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ ids: Array.from(sdkSessionIds) })
});
const sessions = await sessionsResponse.json();

Minor: Dynamic import location (line 136)

const Database = (await import('better-sqlite3')).default;

This import is inside the function body. While it works, it's unusual and makes the dependency less obvious.


2. Import Script (scripts/import-memories.ts)

Issues:

Critical: Removed API-based import (entire file)
The PR description says "refactored to use API for consistency" but this file still uses direct database access. The new /api/import endpoint mentioned in the PR description is not being used by this script.

Expected pattern:

async function importMemories(inputFile: string) {
  const exportData = JSON.parse(readFileSync(inputFile, 'utf-8'));
  
  const response = await fetch(`${baseUrl}/api/import`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(exportData)
  });
  
  const result = await response.json();
  // Display stats
}

Actual implementation:

  • Still uses better-sqlite3 directly (line 8)
  • Still does manual SQL operations (lines 64-222)

Regression: Worker dependency
If the goal was to use the API, the current implementation creates a worse situation:

  • Old version: Standalone script, works without worker running
  • Current version: Still standalone, but inconsistent with export script approach
  • If fixed to use API: Would require worker running, breaking standalone use case

Recommendation:
Either:

  1. Implement the /api/import endpoint in DataRoutes.ts and update this script to use it
  2. Keep direct DB access BUT document that this script is intentionally standalone
  3. Make worker auto-start if not running (like the export script does via settings)

Minor: Error messages (lines 33-35)

if (!existsSync(dbPath)) {
  console.error(`❌ Database not found at: ${dbPath}`);
  process.exit(1);
}

Missing the helpful hint from export script:

console.error('💡 Has claude-mem been initialized? Try running a session first.');

3. SearchManager Format Parameter (src/services/worker/SearchManager.ts)

Issues:

Undocumented Feature (lines 97-101)

const params = new URLSearchParams({
  query,
  format: 'json',  // ← This parameter is not validated or documented
  limit: '999999'
});

The format parameter is used but:

  • Not mentioned in the search method signature
  • No validation of allowed values
  • No documentation of what formats are supported
  • Not clear where this is actually handled (SearchManager.search doesn't check for it)

Question: Where is format=json actually processed? I don't see it being used in the search method.

Looking at the search method (lines 86-318), it always returns formatted text, not JSON:

return {
  content: [{
    type: 'text' as const,
    text: lines.join('\n')  // Always formatted text
  }]
};

This looks like dead code or incomplete implementation.

Recommendation:
Either:

  1. Implement the format parameter properly:
async search(args: any): Promise<any> {
  const normalized = this.normalizeParams(args);
  const { query, type, format = 'text', ...options } = normalized;
  
  // ... existing search logic ...
  
  if (format === 'json') {
    return {
      observations,
      sessions,
      prompts,
      totalResults: observations.length + sessions.length + prompts.length
    };
  }
  
  // ... existing formatted text output ...
}
  1. Or remove the unused parameter from export script

4. SessionStore Changes (src/services/sqlite/SessionStore.ts)

No issues found in the diff - this file only shows additions for batch API support which look good.

The new methods for batch fetching are well-implemented:

  • getSessionSummariesByIds (lines 1403-1428)
  • getUserPromptsByIds (lines 1435-1464)
  • getObservationsByIds (already existed, being used)

5. DataRoutes.ts (src/services/worker/http/routes/DataRoutes.ts)

Missing Endpoint:
The PR description mentions:

Added batch SDK sessions API endpoint (POST /api/sdk-sessions/batch)

But this endpoint is not implemented in DataRoutes.ts. The file shows:

  • /api/observations/batch (lines 102-128) ✅
  • /api/sdk-sessions/batch ❌ Missing

Recommendation:
Add the missing endpoint:

/**
 * Get SDK sessions by array of IDs
 * POST /api/sdk-sessions/batch
 * Body: { ids: string[] }
 */
private handleGetSessionsByIds = this.wrapHandler((req: Request, res: Response): void => {
  const { ids } = req.body;

  if (!ids || !Array.isArray(ids)) {
    this.badRequest(res, 'ids must be an array of strings');
    return;
  }

  if (ids.length === 0) {
    res.json([]);
    return;
  }

  const store = this.dbManager.getSessionStore();
  const placeholders = ids.map(() => '?').join(',');
  const sessions = store.db.prepare(`
    SELECT * FROM sdk_sessions
    WHERE sdk_session_id IN (${placeholders})
    ORDER BY started_at_epoch DESC
  `).all(...ids);

  res.json(sessions);
});

And register it:

app.post('/api/sdk-sessions/batch', this.handleGetSessionsByIds.bind(this));

Missing /api/import Endpoint:
The PR description also mentions:

Added import API endpoint (POST /api/import) with duplicate detection

This is also not implemented. Should it be added to DataRoutes.ts?


Testing Concerns

From the test plan:

  • Test export script with project filter - now returns correct results
  • Test import script with test data
  • Verify duplicate detection works (re-importing skips existing records)

Questions:

  1. Did the export script test include verifying that SDK sessions are correctly fetched?
  2. Was the import script tested with the worker running vs. not running?
  3. Was the format=json parameter actually tested, or is it unused code?

Recommendations Summary

Must Fix (blocking issues):

  1. Implement /api/sdk-sessions/batch endpoint in DataRoutes.ts
  2. Update export script to use /api/sdk-sessions/batch instead of direct DB access
  3. Decide on import script strategy: full API refactor or document as intentionally standalone
  4. Implement or remove the format=json parameter in SearchManager

Should Fix (code quality):

  1. Add better error messages to import script
  2. Document the format parameter if keeping it
  3. Consider implementing /api/import endpoint if import script will use API

Nice to Have:

  1. Move better-sqlite3 import to top of export script if keeping DB access
  2. Add integration tests for new batch endpoints
  3. Document the architectural decision about when to use API vs. direct DB access

Architecture Question

This PR highlights an important architectural question: When should scripts use the worker API vs. direct database access?

Current state:

  • Export: Hybrid (API for most, DB for sdk_sessions)
  • Import: Direct DB access only
  • Skills/MCP: Always use API

Recommendation:
Define a clear policy:

  • Option A: All external scripts use API (requires worker running)
  • Option B: Standalone scripts use direct DB, integrated features use API
  • Option C: Provide both modes (--use-api flag for scripts)

This should be documented in the codebase to guide future development.


Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the export/import memory scripts to use the worker HTTP API instead of direct database access, fixing issues with project filtering and providing a more consistent architecture.

Key Changes:

  • Added JSON format support to SearchManager for raw data output
  • Created new API endpoints: batch SDK sessions retrieval and bulk import with duplicate detection
  • Refactored both scripts to use HTTP API calls instead of direct better-sqlite3 access

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 12 comments.

Show a summary per file
File Description
src/services/worker/http/routes/DataRoutes.ts Added two new endpoints: POST /api/sdk-sessions/batch for retrieving session metadata and POST /api/import for importing memories with duplicate detection
src/services/worker/SearchManager.ts Added JSON format option to return raw data structures instead of formatted text for programmatic access
src/services/sqlite/SessionStore.ts Added getSdkSessionsBySessionIds method and four import methods with duplicate checking logic
scripts/import-memories.ts Refactored to use worker API instead of direct database access, significantly simplifying the code
scripts/export-memories.ts Replaced direct database queries with API calls to the new batch SDK sessions endpoint

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

this.badRequest(res, 'sdkSessionIds must be an array');
return;
}

Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sdkSessionIds array should be validated to ensure all elements are strings. Currently, if the array contains non-string values, the database query could fail or produce unexpected results.

Suggested change
// Validate all sdkSessionIds are strings
if (!sdkSessionIds.every(id => typeof id === 'string')) {
this.badRequest(res, 'All sdkSessionIds must be strings');
return;
}

Copilot uses AI. Check for mistakes.
Comment on lines +298 to +310
private handleImport = this.wrapHandler((req: Request, res: Response): void => {
const { sessions, summaries, observations, prompts } = req.body;

const stats = {
sessionsImported: 0,
sessionsSkipped: 0,
summariesImported: 0,
summariesSkipped: 0,
observationsImported: 0,
observationsSkipped: 0,
promptsImported: 0,
promptsSkipped: 0
};
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The import endpoint lacks input validation. All four parameters (sessions, summaries, observations, prompts) should be validated to ensure they are arrays before being passed to the import methods. Without validation, non-array values could cause runtime errors in the database operations.

Copilot uses AI. Check for mistakes.
Comment on lines +1905 to +1906
WHERE sdk_session_id = ? AND title = ? AND created_at_epoch = ?
`).get(obs.sdk_session_id, obs.title, obs.created_at_epoch) as { id: number } | undefined;
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The duplicate detection logic is incorrect when title is null. If multiple observations have the same sdk_session_id and created_at_epoch but null titles, they would be incorrectly identified as duplicates. Consider using COALESCE or IS NULL checks in the WHERE clause to handle this edge case properly.

Suggested change
WHERE sdk_session_id = ? AND title = ? AND created_at_epoch = ?
`).get(obs.sdk_session_id, obs.title, obs.created_at_epoch) as { id: number } | undefined;
WHERE sdk_session_id = ?
AND ((title = ?) OR (title IS NULL AND ? IS NULL))
AND created_at_epoch = ?
`).get(obs.sdk_session_id, obs.title, obs.title, obs.created_at_epoch) as { id: number } | undefined;

Copilot uses AI. Check for mistakes.
}

// Read and parse export file
const exportData = JSON.parse(readFileSync(inputFile, 'utf-8'));
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JSON.parse should be wrapped in a try-catch block to handle malformed JSON files gracefully. Currently, if the file contains invalid JSON, the script will crash with an unhelpful error message instead of providing clear feedback to the user.

Suggested change
const exportData = JSON.parse(readFileSync(inputFile, 'utf-8'));
let exportData;
try {
exportData = JSON.parse(readFileSync(inputFile, 'utf-8'));
} catch (err) {
console.error(`❌ Failed to parse JSON in input file: ${inputFile}`);
console.error(' Please ensure the file contains valid JSON.');
process.exit(1);
}

Copilot uses AI. Check for mistakes.
Comment on lines +298 to +366
private handleImport = this.wrapHandler((req: Request, res: Response): void => {
const { sessions, summaries, observations, prompts } = req.body;

const stats = {
sessionsImported: 0,
sessionsSkipped: 0,
summariesImported: 0,
summariesSkipped: 0,
observationsImported: 0,
observationsSkipped: 0,
promptsImported: 0,
promptsSkipped: 0
};

const store = this.dbManager.getSessionStore();

// Import sessions first (dependency for everything else)
if (Array.isArray(sessions)) {
for (const session of sessions) {
const result = store.importSdkSession(session);
if (result.imported) {
stats.sessionsImported++;
} else {
stats.sessionsSkipped++;
}
}
}

// Import summaries (depends on sessions)
if (Array.isArray(summaries)) {
for (const summary of summaries) {
const result = store.importSessionSummary(summary);
if (result.imported) {
stats.summariesImported++;
} else {
stats.summariesSkipped++;
}
}
}

// Import observations (depends on sessions)
if (Array.isArray(observations)) {
for (const obs of observations) {
const result = store.importObservation(obs);
if (result.imported) {
stats.observationsImported++;
} else {
stats.observationsSkipped++;
}
}
}

// Import prompts (depends on sessions)
if (Array.isArray(prompts)) {
for (const prompt of prompts) {
const result = store.importUserPrompt(prompt);
if (result.imported) {
stats.promptsImported++;
} else {
stats.promptsSkipped++;
}
}
}

res.json({
success: true,
stats
});
});
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The import endpoint lacks test coverage. Given the complexity of this endpoint (handling four different types of imports with duplicate detection), comprehensive tests are needed to verify correct behavior including empty arrays, invalid data, duplicate detection, and proper error handling.

Copilot uses AI. Check for mistakes.
Comment on lines +314 to +360
// Import sessions first (dependency for everything else)
if (Array.isArray(sessions)) {
for (const session of sessions) {
const result = store.importSdkSession(session);
if (result.imported) {
stats.sessionsImported++;
} else {
stats.sessionsSkipped++;
}
}
}

// Import summaries (depends on sessions)
if (Array.isArray(summaries)) {
for (const summary of summaries) {
const result = store.importSessionSummary(summary);
if (result.imported) {
stats.summariesImported++;
} else {
stats.summariesSkipped++;
}
}
}

// Import observations (depends on sessions)
if (Array.isArray(observations)) {
for (const obs of observations) {
const result = store.importObservation(obs);
if (result.imported) {
stats.observationsImported++;
} else {
stats.observationsSkipped++;
}
}
}

// Import prompts (depends on sessions)
if (Array.isArray(prompts)) {
for (const prompt of prompts) {
const result = store.importUserPrompt(prompt);
if (result.imported) {
stats.promptsImported++;
} else {
stats.promptsSkipped++;
}
}
}
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Performance concern: Each import operation performs individual database queries for duplicate checking and insertion. For large imports, this could be slow. Consider batching the operations or using prepared statements with batch execution to improve performance, especially when importing thousands of records.

Copilot uses AI. Check for mistakes.
this.badRequest(res, 'sdkSessionIds must be an array');
return;
}

Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Empty array validation should be added. When sdkSessionIds is an empty array, the function should return an empty array immediately without making a database call, for consistency with getSdkSessionsBySessionIds behavior.

Suggested change
if (sdkSessionIds.length === 0) {
res.json([]);
return;
}

Copilot uses AI. Check for mistakes.
Comment on lines +158 to +169
private handleGetSdkSessionsByIds = this.wrapHandler((req: Request, res: Response): void => {
const { sdkSessionIds } = req.body;

if (!Array.isArray(sdkSessionIds)) {
this.badRequest(res, 'sdkSessionIds must be an array');
return;
}

const store = this.dbManager.getSessionStore();
const sessions = store.getSdkSessionsBySessionIds(sdkSessionIds);
res.json(sessions);
});
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new batch SDK sessions endpoint lacks test coverage. Other batch endpoints in this codebase have corresponding tests, and this new endpoint should follow the same pattern to ensure input validation and edge cases (empty arrays, invalid values) are properly tested.

Copilot uses AI. Check for mistakes.
files_edited: string | null;
notes: string | null;
prompt_number: number | null;
discovery_tokens: number;
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type signature inconsistency: discovery_tokens is typed as number but the code uses a fallback (discovery_tokens || 0), suggesting it could be null or undefined. The type should be number | null to match the actual data and the export script's interface definitions.

Suggested change
discovery_tokens: number;
discovery_tokens: number | null;

Copilot uses AI. Check for mistakes.
files_read: string | null;
files_modified: string | null;
prompt_number: number | null;
discovery_tokens: number;
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type signature inconsistency: discovery_tokens is typed as number but the code uses a fallback (discovery_tokens || 0), suggesting it could be null or undefined. The type should be number | null to match the actual data and the export script's interface definitions.

Suggested change
discovery_tokens: number;
discovery_tokens: number | null;

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants