2 changes: 2 additions & 0 deletions packages/shared/src/env.server.ts
@@ -231,6 +231,8 @@ export const env = createEnv({

SOURCEBOT_CHAT_MODEL_TEMPERATURE: numberSchema.default(0.3),
SOURCEBOT_CHAT_MAX_STEP_COUNT: numberSchema.default(20),
SOURCEBOT_CHAT_FILE_MAX_CHARACTERS: numberSchema.default(100_000),
SOURCEBOT_CHAT_MAX_MESSAGE_HISTORY: numberSchema.default(50),
Comment on lines +234 to +235

⚠️ Potential issue | 🟡 Minor

Missing lower-bound validation; SOURCEBOT_CHAT_MAX_MESSAGE_HISTORY=0 silently disables trimming

numberSchema is z.coerce.number() with no minimum. For SOURCEBOT_CHAT_MAX_MESSAGE_HISTORY, a value of 0 produces a JavaScript gotcha: Array.prototype.slice(-0) is identical to slice(0), returning the full array — so history trimming is silently skipped rather than capping to zero messages. Negative values are similarly surprising: with maxMessages = -5, slice(-maxMessages) becomes slice(5), which drops the five oldest messages instead of capping the history.

For SOURCEBOT_CHAT_FILE_MAX_CHARACTERS, 0 would truncate every file to an empty body with just the notice.

🛡️ Proposed fix
- SOURCEBOT_CHAT_FILE_MAX_CHARACTERS: numberSchema.default(100_000),
- SOURCEBOT_CHAT_MAX_MESSAGE_HISTORY: numberSchema.default(50),
+ SOURCEBOT_CHAT_FILE_MAX_CHARACTERS: numberSchema.min(1).default(100_000),
+ SOURCEBOT_CHAT_MAX_MESSAGE_HISTORY: numberSchema.min(1).default(50),

Alternatively, guard against maxMessages <= 0 in route.ts before performing the slice.


DEBUG_WRITE_CHAT_MESSAGES_TO_FILE: booleanSchema.default('false'),

25 changes: 16 additions & 9 deletions packages/web/src/app/api/(server)/chat/route.ts
@@ -2,7 +2,7 @@ import { sew } from "@/actions";
import { _getConfiguredLanguageModelsFull, _getAISDKLanguageModelAndOptions, _updateChatMessages, _isOwnerOfChat } from "@/features/chat/actions";
import { createAgentStream } from "@/features/chat/agent";
import { additionalChatRequestParamsSchema, LanguageModelInfo, SBChatMessage, SearchScope } from "@/features/chat/types";
import { getAnswerPartFromAssistantMessage, getLanguageModelKey } from "@/features/chat/utils";
import { getAnswerPartFromAssistantMessage, getLanguageModelKey, isContextWindowError, CONTEXT_WINDOW_USER_MESSAGE } from "@/features/chat/utils";
import { apiHandler } from "@/lib/apiHandler";
import { ErrorCode } from "@/lib/errorCodes";
import { notFound, requestBodySchemaValidationError, ServiceError, serviceErrorResponse } from "@/lib/serviceError";
@@ -11,7 +11,7 @@ import { withOptionalAuthV2 } from "@/withAuthV2";
import { LanguageModelV2 as AISDKLanguageModelV2 } from "@ai-sdk/provider";
import * as Sentry from "@sentry/nextjs";
import { PrismaClient } from "@sourcebot/db";
import { createLogger } from "@sourcebot/shared";
import { createLogger, env } from "@sourcebot/shared";
import { captureEvent } from "@/lib/posthog";
import {
createUIMessageStream,
@@ -114,15 +114,17 @@ export const POST = apiHandler(async (req: NextRequest) => {
return 'unknown error';
}

if (typeof error === 'string') {
return error;
}
const errorMessage = (() => {
if (typeof error === 'string') return error;
if (error instanceof Error) return error.message;
return JSON.stringify(error);
})();

if (error instanceof Error) {
return error.message;
if (isContextWindowError(errorMessage)) {
return CONTEXT_WINDOW_USER_MESSAGE;
}

return JSON.stringify(error);
return errorMessage;
}
});

@@ -203,6 +205,11 @@ export const createMessageStream = async ({
}
}).filter(message => message !== undefined);

const maxMessages = env.SOURCEBOT_CHAT_MAX_MESSAGE_HISTORY;
const trimmedMessageHistory = messageHistory.length > maxMessages
? messageHistory.slice(-maxMessages)
: messageHistory;
Comment on lines +208 to +211

⚠️ Potential issue | 🟠 Major

slice(-maxMessages) can produce a history that starts with an assistant message

messageHistory follows the pattern [user₁, assistant₁, user₂, assistant₂, …, userₙ]. When messageHistory.length > maxMessages and maxMessages is even (e.g. 50), slice(-50) drops the oldest user message and the resulting history begins with an orphaned assistant turn. Providers like Anthropic's Messages API require the first turn to be "user" and will reject the request with an error when this constraint is violated.

🐛 Proposed fix — ensure trimmed history always starts with a user message
  const maxMessages = env.SOURCEBOT_CHAT_MAX_MESSAGE_HISTORY;
- const trimmedMessageHistory = messageHistory.length > maxMessages
-     ? messageHistory.slice(-maxMessages)
-     : messageHistory;
+ let trimmedMessageHistory = messageHistory.length > maxMessages
+     ? messageHistory.slice(-maxMessages)
+     : messageHistory;
+ // Providers (e.g., Anthropic) require the first message to be from the user.
+ // If trimming produced an assistant-first sequence, drop the leading assistant turn.
+ if (trimmedMessageHistory.length > 0 && trimmedMessageHistory[0].role === 'assistant') {
+     trimmedMessageHistory = trimmedMessageHistory.slice(1);
+ }
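To make the failure mode concrete, a minimal standalone sketch (the Turn shape and sample history are illustrative, not the PR's actual types):

```typescript
// Minimal sketch of how an even-sized window over a user-first,
// alternating history yields an assistant-first result.
type Turn = { role: "user" | "assistant"; text: string };

const history: Turn[] = [
    { role: "user", text: "u1" },
    { role: "assistant", text: "a1" },
    { role: "user", text: "u2" },
    { role: "assistant", text: "a2" },
    { role: "user", text: "u3" },
];

// With an even maxMessages (here 2), the window starts on an assistant turn.
const naive = history.slice(-2);
console.log(naive[0].role); // "assistant": user-first providers reject this

// Dropping the orphaned leading assistant turn restores a valid sequence.
const trimmed = naive[0]?.role === "assistant" ? naive.slice(1) : naive;
console.log(trimmed[0].role); // "user"
```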

const stream = createUIMessageStream<SBChatMessage>({
execute: async ({ writer }) => {
writer.write({
@@ -238,7 +245,7 @@
const researchStream = await createAgentStream({
model,
providerOptions: modelProviderOptions,
inputMessages: messageHistory,
inputMessages: trimmedMessageHistory,
inputSources: sources,
selectedRepos: expandedRepos,
onWriteSource: (source) => {
18 changes: 16 additions & 2 deletions packages/web/src/features/chat/agent.ts
@@ -7,7 +7,7 @@ import { LanguageModel, ModelMessage, StopCondition, streamText } from "ai";
import { ANSWER_TAG, FILE_REFERENCE_PREFIX, toolNames } from "./constants";
import { createCodeSearchTool, findSymbolDefinitionsTool, findSymbolReferencesTool, listReposTool, listCommitsTool, readFilesTool } from "./tools";
import { Source } from "./types";
import { addLineNumbers, fileReferenceToString } from "./utils";
import { addLineNumbers, fileReferenceToString, truncateFileContent } from "./utils";
import _dedent from "dedent";

const dedent = _dedent.withOptions({ alignValues: true });
@@ -60,9 +60,20 @@ export const createAgentStream = async ({
}))
).filter((source) => source !== undefined);

const maxChars = env.SOURCEBOT_CHAT_FILE_MAX_CHARACTERS;
let anyFileTruncated = false;
const truncatedFileSources = resolvedFileSources.map((file) => {
const { content, wasTruncated } = truncateFileContent(file.source, maxChars);
if (wasTruncated) {
anyFileTruncated = true;
}
return { ...file, source: content };
});

const systemPrompt = createPrompt({
repos: selectedRepos,
files: resolvedFileSources,
files: truncatedFileSources,
filesWereTruncated: anyFileTruncated,
});

const stream = streamText({
@@ -148,6 +159,7 @@ const createPrompt = ({
const createPrompt = ({
files,
repos,
filesWereTruncated,
}: {
files?: {
path: string;
@@ -157,6 +169,7 @@ const createPrompt = ({
revision: string;
}[],
repos: string[],
filesWereTruncated?: boolean,
}) => {
return dedent`
You are a powerful agentic AI code assistant built into Sourcebot, the world's best code-intelligence platform. Your job is to help developers understand and navigate their large codebases.
@@ -189,6 +202,7 @@

${(files && files.length > 0) ? dedent`
<files>
${filesWereTruncated ? `**Note:** Some files were truncated because they exceeded the character limit. Use the readFiles tool to retrieve specific sections if needed.` : ''}
The user has mentioned the following files, which are automatically included for analysis.

${files?.map(file => `<file path="${file.path}" repository="${file.repo}" language="${file.language}" revision="${file.revision}">
@@ -2,6 +2,7 @@

import { Button } from '@/components/ui/button';
import { serviceErrorSchema } from '@/lib/serviceError';
import { CONTEXT_WINDOW_USER_MESSAGE } from '@/features/chat/utils';
import { AlertCircle, X } from "lucide-react";
import { useMemo } from 'react';

@@ -33,7 +34,7 @@ export const ErrorBanner = ({ error, isVisible, onClose }: ErrorBannerProps) =>
<div className="flex items-center gap-2">
<AlertCircle className="h-4 w-4 text-red-600 dark:text-red-400" />
<span className="text-sm font-medium text-red-800 dark:text-red-200">
Error occurred
{errorMessage === CONTEXT_WINDOW_USER_MESSAGE ? 'Context limit exceeded' : 'Error occurred'}
</span>
<span className="text-sm text-red-600 dark:text-red-400">
{errorMessage}
20 changes: 12 additions & 8 deletions packages/web/src/features/chat/tools.ts
@@ -4,7 +4,8 @@ import { InferToolInput, InferToolOutput, InferUITool, tool, ToolUIPart } from "
import { isServiceError } from "@/lib/utils";
import { FileSourceResponse, getFileSource, listCommits } from '@/features/git';
import { findSearchBasedSymbolDefinitions, findSearchBasedSymbolReferences } from "../codeNav/api";
import { addLineNumbers } from "./utils";
import { addLineNumbers, truncateFileContent } from "./utils";
import { env } from "@sourcebot/shared";
import { toolNames } from "./constants";
import { listReposQueryParamsSchema } from "@/lib/schemas";
import { ListReposQueryParams } from "@/lib/types";
@@ -123,13 +124,16 @@ export const readFilesTool = tool({
return firstError!;
}

return (responses as FileSourceResponse[]).map((response) => ({
path: response.path,
repository: response.repo,
language: response.language,
source: addLineNumbers(response.source),
revision,
}));
return (responses as FileSourceResponse[]).map((response) => {
const { content } = truncateFileContent(response.source, env.SOURCEBOT_CHAT_FILE_MAX_CHARACTERS);
return {
path: response.path,
repository: response.repo,
language: response.language,
source: addLineNumbers(content),
revision,
};
});
}
});

72 changes: 71 additions & 1 deletion packages/web/src/features/chat/utils.test.ts
@@ -1,5 +1,5 @@
import { expect, test, vi } from 'vitest'
import { fileReferenceToString, getAnswerPartFromAssistantMessage, groupMessageIntoSteps, repairReferences } from './utils'
import { fileReferenceToString, getAnswerPartFromAssistantMessage, groupMessageIntoSteps, repairReferences, truncateFileContent, isContextWindowError, CONTEXT_WINDOW_USER_MESSAGE } from './utils'
import { FILE_REFERENCE_REGEX, ANSWER_TAG } from './constants';
import { SBChatMessage, SBChatMessagePart } from './types';

@@ -351,3 +351,73 @@ test('repairReferences handles malformed inline code blocks', () => {
const expected = 'See @file:{github.com/sourcebot-dev/sourcebot::packages/web/src/auth.ts} for details.';
expect(repairReferences(input)).toBe(expected);
});

// truncateFileContent tests

test('truncateFileContent returns content unchanged when under limit', () => {
const source = 'line 1\nline 2\nline 3';
const result = truncateFileContent(source, 100);
expect(result.content).toBe(source);
expect(result.wasTruncated).toBe(false);
});

test('truncateFileContent returns content unchanged when exactly at limit', () => {
const source = 'abcde';
const result = truncateFileContent(source, 5);
expect(result.content).toBe(source);
expect(result.wasTruncated).toBe(false);
});

test('truncateFileContent truncates at line boundary when over limit', () => {
const source = 'line 1\nline 2\nline 3\nline 4\nline 5';
// Limit of 20 characters: "line 1\nline 2\nline 3" is 20 chars
const result = truncateFileContent(source, 15);
Comment on lines +373 to +374

⚠️ Potential issue | 🟡 Minor

Stale comment — limit is 15, not 20

The inline comment says "Limit of 20 characters" but the call uses truncateFileContent(source, 15). The string 'line 1\nline 2\nline 3' is indeed 20 chars (describing the full third-line boundary), but the test limit is 15. The comment is misleading.

📝 Suggested fix
-    // Limit of 20 characters: "line 1\nline 2\nline 3" is 20 chars
+    // Limit of 15 characters: last newline before index 15 is at index 13 ("line 1\nline 2")
     const result = truncateFileContent(source, 15);

expect(result.wasTruncated).toBe(true);
expect(result.content).toContain('line 1\nline 2');
expect(result.content).toContain('... [truncated:');
expect(result.content).not.toContain('line 4');
});

test('truncateFileContent includes line count information', () => {
const source = 'a\nb\nc\nd\ne';
const result = truncateFileContent(source, 3);
expect(result.wasTruncated).toBe(true);
expect(result.content).toMatch(/showing \d+ of 5 lines/);
});

// isContextWindowError tests

test('isContextWindowError detects OpenAI context length error', () => {
expect(isContextWindowError('This model\'s maximum context length is 128000 tokens')).toBe(true);
});

test('isContextWindowError detects Anthropic prompt too long error', () => {
expect(isContextWindowError('prompt is too long: 150000 tokens > 100000 maximum')).toBe(true);
});

test('isContextWindowError detects context_length_exceeded error', () => {
expect(isContextWindowError('context_length_exceeded')).toBe(true);
});

test('isContextWindowError detects token limit error', () => {
expect(isContextWindowError('Request exceeds the maximum number of tokens')).toBe(true);
});

test('isContextWindowError detects reduce the length error', () => {
expect(isContextWindowError('Please reduce the length of the messages')).toBe(true);
});

test('isContextWindowError detects request too large error', () => {
expect(isContextWindowError('request too large')).toBe(true);
});

test('isContextWindowError returns false for unrelated errors', () => {
expect(isContextWindowError('Internal server error')).toBe(false);
expect(isContextWindowError('Rate limit exceeded')).toBe(false);
expect(isContextWindowError('Invalid API key')).toBe(false);
});

test('CONTEXT_WINDOW_USER_MESSAGE is a non-empty string', () => {
expect(typeof CONTEXT_WINDOW_USER_MESSAGE).toBe('string');
expect(CONTEXT_WINDOW_USER_MESSAGE.length).toBeGreaterThan(0);
});
42 changes: 42 additions & 0 deletions packages/web/src/features/chat/utils.ts
@@ -176,6 +176,48 @@ export const addLineNumbers = (source: string, lineOffset = 1) => {
return source.split('\n').map((line, index) => `${index + lineOffset}:${line}`).join('\n');
}

export const truncateFileContent = (
source: string,
maxCharacters: number,
): { content: string; wasTruncated: boolean } => {
if (source.length <= maxCharacters) {
return { content: source, wasTruncated: false };
}

const cutoff = source.lastIndexOf('\n', maxCharacters);
const effectiveCutoff = cutoff > 0 ? cutoff : maxCharacters;
const truncated = source.substring(0, effectiveCutoff);

const totalLines = source.split('\n').length;
const includedLines = truncated.split('\n').length;

return {
content: truncated + `\n\n... [truncated: showing ${includedLines} of ${totalLines} lines]`,
wasTruncated: true,
};
};

const CONTEXT_WINDOW_ERROR_PATTERNS = [
/maximum context length/i,
/prompt is too long/i,
/context.?length.?exceeded/i,
/exceeds? the maximum.*tokens?/i,
/token.?limit/i,
/request.?too.?large/i,
/input.?too.?long/i,
/request payload size exceeds/i,
/max_tokens/i,
/reduce the length/i,
];

export const isContextWindowError = (errorMessage: string): boolean => {
return CONTEXT_WINDOW_ERROR_PATTERNS.some((pattern) => pattern.test(errorMessage));
};

export const CONTEXT_WINDOW_USER_MESSAGE =
'The conversation exceeded the model\'s context window limit. ' +
'Try removing some attached files, starting a new conversation, or switching to a model with a larger context window.';

export const createUIMessage = (text: string, mentions: MentionData[], selectedSearchScopes: SearchScope[]): CreateUIMessage<SBChatMessage> => {
// Converts applicable mentions into sources.
const sources: Source[] = mentions