3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

### Added
- Added MCP and API key usage tracking to analytics dashboard. Move audit events from client-side to service functions to capture all API calls (web UI, MCP, and non-MCP). Display MCP requests and API requests on separate charts. [#948](https://github.com/sourcebot-dev/sourcebot/pull/948)

⚠️ Potential issue | 🟡 Minor

Use consistent past tense in the changelog sentence.

Line 11 says “Move audit events…” and “Display MCP requests…” in the present tense, while the entry opens in the past tense (“Added”). This reads awkwardly in release notes.

✏️ Proposed wording tweak
-- Added MCP and API key usage tracking to analytics dashboard. Move audit events from client-side to service functions to capture all API calls (web UI, MCP, and non-MCP). Display MCP requests and API requests on separate charts. [`#948`](https://github.com/sourcebot-dev/sourcebot/pull/948)
+- Added MCP and API key usage tracking to the analytics dashboard. Moved audit events from client-side to service functions to capture all API calls (web UI, MCP, and non-MCP). Displayed MCP request and API request counts on separate charts. [`#948`](https://github.com/sourcebot-dev/sourcebot/pull/948)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@CHANGELOG.md` at line 11, the changelog entry mixes tenses. Change the phrase
"Move audit events from client-side to service functions to capture all API
calls (web UI, MCP, and non-MCP)" to past tense (e.g., "Moved audit events from
client-side to service functions to capture all API calls (web UI, MCP, and
non-MCP)"), and likewise change "Display MCP requests and API requests on
separate charts" to "Displayed ...", so the entire bullet reads consistently in
past tense.


### Fixed
- Fixed search query parser rejecting parenthesized regex alternation in filter values (e.g. `file:(test|spec)`, `-file:(test|spec)`). [#946](https://github.com/sourcebot-dev/sourcebot/pull/946)
- Fixed `content:` filter ignoring the regex toggle. [#947](https://github.com/sourcebot-dev/sourcebot/pull/947)
49 changes: 30 additions & 19 deletions docs/docs/configuration/audit-logs.mdx
@@ -15,6 +15,9 @@ This feature gives security and compliance teams the necessary information to en
## Enabling/Disabling Audit Logs
Audit logs are enabled by default and can be controlled with the `SOURCEBOT_EE_AUDIT_LOGGING_ENABLED` [environment variable](/docs/configuration/environment-variables).

## Retention Policy
By default, audit logs older than 180 days are automatically pruned daily. You can configure the retention period using the `SOURCEBOT_EE_AUDIT_RETENTION_DAYS` [environment variable](/docs/configuration/environment-variables). Set it to `0` to disable automatic pruning and retain logs indefinitely.
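
The retention rule above amounts to a cutoff timestamp: records older than `now - retentionDays` are eligible for pruning, and `0` means no cutoff at all. A minimal sketch of that arithmetic follows; `retentionCutoff` is a hypothetical helper written for illustration, not a function from the Sourcebot codebase.

```typescript
const ONE_DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper (not part of Sourcebot): returns the timestamp before
// which audit records would be pruned, or null when pruning is disabled.
function retentionCutoff(retentionDays: number, now: Date = new Date()): Date | null {
    // A value of 0 (or below) is treated as "retain indefinitely".
    if (retentionDays <= 0) {
        return null;
    }
    return new Date(now.getTime() - retentionDays * ONE_DAY_MS);
}

// Example with the default 180-day retention and a fixed reference time.
const now = new Date("2025-07-01T00:00:00.000Z");
console.log(retentionCutoff(180, now)?.toISOString()); // 2025-01-02T00:00:00.000Z
console.log(retentionCutoff(0, now)); // null
```

Working in milliseconds on UTC timestamps keeps the calculation independent of local time zones and daylight saving changes.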

## Fetching Audit Logs
Audit logs are stored in the [postgres database](/docs/overview#architecture) connected to Sourcebot. To fetch all of the audit logs, you can use the following API:

@@ -110,30 +113,37 @@ curl --request GET '$SOURCEBOT_URL/api/ee/audit' \

| Action | Actor Type | Target Type |
| :------- | :------ | :------|
| `api_key.creation_failed` | `user` | `org` |
| `api_key.created` | `user` | `api_key` |
| `api_key.deletion_failed` | `user` | `org` |
| `api_key.creation_failed` | `user` | `org` |
| `api_key.deleted` | `user` | `api_key` |
| `api_key.deletion_failed` | `user` | `org` |
| `audit.fetch` | `user` | `org` |
| `chat.deleted` | `user` | `chat` |
| `chat.shared_with_users` | `user` | `chat` |
| `chat.unshared_with_user` | `user` | `chat` |
| `chat.visibility_updated` | `user` | `chat` |
| `org.ownership_transfer_failed` | `user` | `org` |
| `org.ownership_transferred` | `user` | `org` |
| `user.created_ask_chat` | `user` | `org` |
| `user.creation_failed` | `user` | `user` |
| `user.owner_created` | `user` | `org` |
| `user.performed_code_search` | `user` | `org` |
| `user.performed_find_references` | `user` | `org` |
| `user.performed_goto_definition` | `user` | `org` |
| `user.created_ask_chat` | `user` | `org` |
| `user.jit_provisioning_failed` | `user` | `org` |
| `user.jit_provisioned` | `user` | `org` |
| `user.join_request_creation_failed` | `user` | `org` |
| `user.join_requested` | `user` | `org` |
| `user.join_request_approve_failed` | `user` | `account_join_request` |
| `user.join_request_approved` | `user` | `account_join_request` |
| `user.invite_failed` | `user` | `org` |
| `user.invites_created` | `user` | `org` |
| `user.delete` | `user` | `user` |
| `user.fetched_file_source` | `user` | `org` |
| `user.fetched_file_tree` | `user` | `org` |
| `user.invite_accept_failed` | `user` | `invite` |
| `user.invite_accepted` | `user` | `invite` |
| `user.invite_failed` | `user` | `org` |
| `user.invites_created` | `user` | `org` |
| `user.join_request_approve_failed` | `user` | `account_join_request` |
| `user.join_request_approved` | `user` | `account_join_request` |
| `user.list` | `user` | `org` |
| `user.listed_repos` | `user` | `org` |
| `user.owner_created` | `user` | `org` |
| `user.performed_code_search` | `user` | `org` |
| `user.performed_find_references` | `user` | `org` |
| `user.performed_goto_definition` | `user` | `org` |
| `user.read` | `user` | `user` |
| `user.signed_in` | `user` | `user` |
| `user.signed_out` | `user` | `user` |
| `org.ownership_transfer_failed` | `user` | `org` |
| `org.ownership_transferred` | `user` | `org` |


## Response schema
@@ -180,7 +190,7 @@ curl --request GET '$SOURCEBOT_URL/api/ee/audit' \
},
"targetType": {
"type": "string",
"enum": ["user", "org", "file", "api_key", "account_join_request", "invite"]
"enum": ["user", "org", "file", "api_key", "account_join_request", "invite", "chat"]
},
"sourcebotVersion": {
"type": "string"
@@ -192,7 +202,8 @@ curl --request GET '$SOURCEBOT_URL/api/ee/audit' \
"properties": {
"message": { "type": "string" },
"api_key": { "type": "string" },
"emails": { "type": "string" }
"emails": { "type": "string" },
"source": { "type": "string" }
},
"additionalProperties": false
},
1 change: 1 addition & 0 deletions docs/docs/configuration/environment-variables.mdx
@@ -42,6 +42,7 @@ The following environment variables allow you to configure your Sourcebot deploy
| `HTTPS_PROXY` | - | <p>HTTPS proxy URL for routing SSL requests through a proxy server (e.g., `http://proxy.company.com:8080`). Requires `NODE_USE_ENV_PROXY=1`.</p> |
| `NO_PROXY` | - | <p>Comma-separated list of hostnames or domains that should bypass the proxy (e.g., `localhost,127.0.0.1,.internal.domain`). Requires `NODE_USE_ENV_PROXY=1`.</p> |
| `SOURCEBOT_EE_AUDIT_LOGGING_ENABLED` | `true` | <p>Enables/disables audit logging</p> |
| `SOURCEBOT_EE_AUDIT_RETENTION_DAYS` | `180` | <p>The number of days to retain audit logs. Audit log records older than this will be automatically pruned daily. Set to `0` to disable pruning and retain logs indefinitely.</p> |
| `AUTH_EE_GCP_IAP_ENABLED` | `false` | <p>When enabled, allows Sourcebot to automatically register/login from a successful GCP IAP redirect</p> |
| `AUTH_EE_GCP_IAP_AUDIENCE` | - | <p>The GCP IAP audience to use when verifying JWT tokens. Must be set to enable GCP IAP JIT provisioning</p> |
| `EXPERIMENT_EE_PERMISSION_SYNC_ENABLED` | `false` | <p>Enables [permission syncing](/docs/features/permission-syncing).</p> |
28 changes: 28 additions & 0 deletions docs/docs/deployment/sizing-guide.mdx
@@ -45,6 +45,34 @@ If your instance is resource-constrained, you can reduce the concurrency of back

Lowering these values reduces peak resource usage at the cost of slower initial indexing.

## Audit log storage

<Info>
Audit logging is an enterprise feature and is only available with an [enterprise license](/docs/overview#license-key). If you are not on an enterprise plan, audit logs are not stored and this section does not apply.
</Info>

[Audit logs](/docs/configuration/audit-logs) are stored in the Postgres database connected to your Sourcebot deployment. Each audit record captures the action performed, the actor, the target, a timestamp, and optional metadata (e.g., request source). There are three database indexes on the audit table to support analytics and lookup queries.

**Estimated storage per audit event: ~350 bytes** (including row data and indexes).

<Info>
The table below assumes 50 events per user per day. The actual number depends on usage patterns — each user action (code search, file view, navigation, Ask chat, etc.) creates one audit event. Users who interact via [MCP](/docs/features/mcp-server) or the [API](/docs/api-reference/search) tend to generate significantly more events than web-only users, so your real usage may vary.
</Info>

| Team size | Avg events / user / day | Daily events | Monthly storage | 6-month storage |
|---|---|---|---|---|
| 10 users | 50 | 500 | ~5 MB | ~30 MB |
| 50 users | 50 | 2,500 | ~25 MB | ~150 MB |
| 100 users | 50 | 5,000 | ~50 MB | ~300 MB |
| 500 users | 50 | 25,000 | ~250 MB | ~1.5 GB |
| 1,000 users | 50 | 50,000 | ~500 MB | ~3 GB |

### Retention policy

By default, audit logs older than **180 days** are automatically pruned daily by a background job. You can adjust this with the `SOURCEBOT_EE_AUDIT_RETENTION_DAYS` [environment variable](/docs/configuration/environment-variables). Set it to `0` to disable pruning and retain logs indefinitely.

For most deployments, the default 180-day retention keeps database size manageable. If you have a large team with heavy MCP/API usage and need longer retention, plan your Postgres disk allocation accordingly using the estimates above.
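
The estimates in the table above follow from a single multiplication: users × events per user per day × days × bytes per event. A rough sketch for plugging in your own numbers is shown below; `estimateAuditStorageBytes` and `toMB` are hypothetical helpers, and the 350-byte default is simply the per-event estimate quoted in this section, not a measured value for any particular deployment.

```typescript
// Per-event estimate quoted in this section (row data plus indexes).
const BYTES_PER_EVENT = 350;

// Hypothetical helper: total estimated audit storage over a number of days.
function estimateAuditStorageBytes(
    users: number,
    eventsPerUserPerDay: number,
    days: number,
    bytesPerEvent: number = BYTES_PER_EVENT,
): number {
    return users * eventsPerUserPerDay * days * bytesPerEvent;
}

// Decimal megabytes (1 MB = 1,000,000 bytes).
const toMB = (bytes: number): number => bytes / 1_000_000;

// 10 users at 50 events/user/day over a 30-day month.
const monthlyBytes = estimateAuditStorageBytes(10, 50, 30);
console.log(`${toMB(monthlyBytes).toFixed(2)} MB`); // 5.25 MB

// 1,000 users over the default 180-day retention window.
const retainedBytes = estimateAuditStorageBytes(1000, 50, 180);
console.log(`${toMB(retainedBytes).toFixed(0)} MB`); // 3150 MB
```

Teams with heavy MCP or API usage can raise `eventsPerUserPerDay` to see how quickly the retained volume grows before choosing a longer retention period.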

## Monitoring

We recommend monitoring the following metrics after deployment to validate your sizing:
71 changes: 71 additions & 0 deletions packages/backend/src/ee/auditLogPruner.ts
@@ -0,0 +1,71 @@
import { PrismaClient } from "@sourcebot/db";
import { createLogger, env } from "@sourcebot/shared";
import { setIntervalAsync } from "../utils.js";

const BATCH_SIZE = 10_000;
const ONE_DAY_MS = 24 * 60 * 60 * 1000;

const logger = createLogger('audit-log-pruner');

export class AuditLogPruner {
    private interval?: NodeJS.Timeout;

    constructor(private db: PrismaClient) {}

    startScheduler() {
        if (env.SOURCEBOT_EE_AUDIT_LOGGING_ENABLED !== 'true') {
            logger.info('Audit logging is disabled, skipping audit log pruner.');
            return;
        }

        if (env.SOURCEBOT_EE_AUDIT_RETENTION_DAYS <= 0) {
            logger.info('SOURCEBOT_EE_AUDIT_RETENTION_DAYS is 0, audit log pruning is disabled.');
            return;
        }

        logger.info(`Audit log pruner started. Retaining logs for ${env.SOURCEBOT_EE_AUDIT_RETENTION_DAYS} days.`);

        // Run immediately on startup, then every 24 hours
        this.pruneOldAuditLogs();
        this.interval = setIntervalAsync(() => this.pruneOldAuditLogs(), ONE_DAY_MS);
    }

    async dispose() {
        if (this.interval) {
            clearInterval(this.interval);
            this.interval = undefined;
        }
    }

    private async pruneOldAuditLogs() {
        const cutoff = new Date(Date.now() - env.SOURCEBOT_EE_AUDIT_RETENTION_DAYS * ONE_DAY_MS);
        let totalDeleted = 0;

        logger.info(`Pruning audit logs older than ${cutoff.toISOString()}...`);

        // Delete in batches to avoid long-running transactions
        while (true) {
            const batch = await this.db.audit.findMany({
                where: { timestamp: { lt: cutoff } },
                select: { id: true },
                take: BATCH_SIZE,
            });

            if (batch.length === 0) break;

            const result = await this.db.audit.deleteMany({
                where: { id: { in: batch.map(r => r.id) } },
            });

            totalDeleted += result.count;

            if (batch.length < BATCH_SIZE) break;
        }

        if (totalDeleted > 0) {
            logger.info(`Pruned ${totalDeleted} audit log records.`);
        } else {
            logger.info('No audit log records to prune.');
        }
    }
}
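
The batched-delete loop in `pruneOldAuditLogs` can be exercised without a database by abstracting the two queries it relies on. The sketch below is an illustration under assumptions, not code from this PR: `AuditStore`, `pruneInBatches`, and `InMemoryAuditStore` are hypothetical names, and the in-memory store only stands in for the Prisma `findMany` / `deleteMany` calls used above.

```typescript
// Hypothetical minimal interface covering the two operations the loop needs.
interface AuditStore {
    findExpiredIds(cutoff: Date, limit: number): Promise<string[]>;
    deleteByIds(ids: string[]): Promise<number>;
}

// Same control flow as the pruner: fetch up to batchSize expired ids, delete
// them, and stop once a batch comes back empty or short.
async function pruneInBatches(store: AuditStore, cutoff: Date, batchSize: number): Promise<number> {
    let totalDeleted = 0;
    while (true) {
        const ids = await store.findExpiredIds(cutoff, batchSize);
        if (ids.length === 0) break;

        totalDeleted += await store.deleteByIds(ids);

        // A short batch means nothing older than the cutoff remains.
        if (ids.length < batchSize) break;
    }
    return totalDeleted;
}

interface AuditRow {
    id: string;
    timestamp: Date;
}

// In-memory stand-in for the audit table, for illustration and testing only.
class InMemoryAuditStore implements AuditStore {
    rows: AuditRow[];

    constructor(rows: AuditRow[]) {
        this.rows = rows;
    }

    async findExpiredIds(cutoff: Date, limit: number): Promise<string[]> {
        return this.rows
            .filter(r => r.timestamp.getTime() < cutoff.getTime())
            .slice(0, limit)
            .map(r => r.id);
    }

    async deleteByIds(ids: string[]): Promise<number> {
        const doomed = new Set(ids);
        const before = this.rows.length;
        this.rows = this.rows.filter(r => !doomed.has(r.id));
        return before - this.rows.length;
    }
}
```

Because `pruneInBatches` returns a promise, a caller that starts it without awaiting (as a fire-and-forget startup run would) may want to attach a `.catch` handler so that a failed query is logged rather than surfacing as an unhandled rejection.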
4 changes: 4 additions & 0 deletions packages/backend/src/index.ts
@@ -12,6 +12,7 @@ import { ConfigManager } from "./configManager.js";
import { ConnectionManager } from './connectionManager.js';
import { INDEX_CACHE_DIR, REPOS_CACHE_DIR, SHUTDOWN_SIGNALS } from './constants.js';
import { AccountPermissionSyncer } from "./ee/accountPermissionSyncer.js";
import { AuditLogPruner } from "./ee/auditLogPruner.js";
import { GithubAppManager } from "./ee/githubAppManager.js";
import { RepoPermissionSyncer } from './ee/repoPermissionSyncer.js';
import { shutdownPosthog } from "./posthog.js";
@@ -64,9 +65,11 @@ const repoPermissionSyncer = new RepoPermissionSyncer(prisma, settings, redis);
const accountPermissionSyncer = new AccountPermissionSyncer(prisma, settings, redis);
const repoIndexManager = new RepoIndexManager(prisma, settings, redis, promClient);
const configManager = new ConfigManager(prisma, connectionManager, env.CONFIG_PATH);
const auditLogPruner = new AuditLogPruner(prisma);

connectionManager.startScheduler();
repoIndexManager.startScheduler();
auditLogPruner.startScheduler();

if (env.EXPERIMENT_EE_PERMISSION_SYNC_ENABLED === 'true' && !hasEntitlement('permission-syncing')) {
logger.error('Permission syncing is not supported in current plan. Please contact team@sourcebot.dev for assistance.');
@@ -105,6 +108,7 @@ const listenToShutdownSignals = () => {
await connectionManager.dispose()
await repoPermissionSyncer.dispose()
await accountPermissionSyncer.dispose()
await auditLogPruner.dispose()
await configManager.dispose()

await prisma.$disconnect();