LLM benchmark tool updates #4413

Merged: clockwork-labs-bot merged 5 commits into master from llm-benchmark-only on Mar 1, 2026

@bradleyshep
Contributor

Description of Changes

LLM benchmark updates for local development:

  • Local SDK paths: Templates use relative paths to workspace crates (crates/bindings, crates/bindings-csharp, crates/bindings-typescript) instead of published packages, so the bench runs against local SDK changes.
  • NODEJS_DIR support: On Windows (e.g. nvm4w), if pnpm is not on PATH, the bench uses NODEJS_DIR to locate pnpm and prepends it to PATH for subprocesses.
  • Refactor: Extracted relative_to_workspace() in templates.rs and removed noisy NODEJS_DIR logging in publishers.rs.
  • Benchmark results: Updated docs/llms/llm-comparison-details.json and docs/llms/llm-comparison-summary.json.
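For reviewers who want a feel for the local SDK path idea, here is a minimal sketch of computing a relative path from a generated project directory to a workspace crate. The function name `relative_to` and the example directories are assumptions for illustration; the actual signature and behavior of `relative_to_workspace()` in templates.rs are not shown in this PR description.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper (not the PR's `relative_to_workspace()`): express
/// `target` relative to `base_dir`. Both paths are assumed to be absolute
/// and already normalized (no `.` or `..` components).
fn relative_to(base_dir: &Path, target: &Path) -> PathBuf {
    let base: Vec<_> = base_dir.components().collect();
    let tgt: Vec<_> = target.components().collect();

    // Number of leading components the two paths share.
    let common = base
        .iter()
        .zip(tgt.iter())
        .take_while(|(a, b)| a == b)
        .count();

    let mut out = PathBuf::new();
    // Climb out of the remaining part of `base_dir`...
    for _ in common..base.len() {
        out.push("..");
    }
    // ...then descend into the remaining part of `target`.
    for component in &tgt[common..] {
        out.push(component.as_os_str());
    }
    out
}
```

With a hypothetical layout where a generated project lives in `/ws/tools/bench/out`, calling `relative_to` with `/ws/crates/bindings` as the target yields `../../../crates/bindings`, which is the kind of value a template could place in a path dependency.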

API and ABI breaking changes

None.

Expected complexity level and risk

2 — Local-only changes to the benchmark tool. Templates now require local SDKs to be built (especially TypeScript: pnpm build in crates/bindings-typescript). No impact on published SDKs or runtime.

Testing

  • Run cargo llm run --lang rust --modes docs --providers openai from repo root
  • Run TypeScript benchmarks with pnpm build in crates/bindings-typescript first
  • On Windows with nvm4w, set NODEJS_DIR if pnpm is not on PATH and run TypeScript benchmarks
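The NODEJS_DIR step can be pictured roughly as below. This is a sketch under assumptions, not the code in publishers.rs: the helper names `prepend_to_path` and `apply_node_dir` are hypothetical, and for brevity it prepends whenever NODEJS_DIR is set rather than first checking whether pnpm is already on PATH.

```rust
use std::env;
use std::ffi::{OsStr, OsString};
use std::path::Path;
use std::process::Command;

/// Hypothetical helper: build a PATH value with `dir` as the first entry,
/// followed by the entries of `current` (if any).
fn prepend_to_path(
    dir: &Path,
    current: Option<&OsStr>,
) -> Result<OsString, env::JoinPathsError> {
    let mut entries = vec![dir.to_path_buf()];
    if let Some(existing) = current {
        // `split_paths` understands the platform's separator (`;` or `:`).
        entries.extend(env::split_paths(existing));
    }
    env::join_paths(entries)
}

/// Hypothetical helper: if NODEJS_DIR is set, give the subprocess a PATH
/// that starts with it. The parent process environment is left untouched.
fn apply_node_dir(cmd: &mut Command) {
    if let Some(dir) = env::var_os("NODEJS_DIR") {
        let current = env::var_os("PATH");
        if let Ok(path) = prepend_to_path(Path::new(&dir), current.as_deref()) {
            cmd.env("PATH", path);
        }
    }
}
```

Setting PATH on the `Command` rather than with `env::set_var` keeps the change scoped to the spawned subprocess, which matches the "for subprocesses" wording in the description.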

Co-authored-by: Cursor <cursoragent@cursor.com>
bradleyshep and others added 2 commits February 23, 2026 16:18
Co-authored-by: Cursor <cursoragent@cursor.com>
Contributor

@cloutiertyler cloutiertyler left a comment

Very low risk because it only touches xtask-llm-benchmark

clockwork-labs-bot and others added 2 commits February 23, 2026 22:32
- map_or(false, ...) → is_some_and(...) in runner.rs and hashing.rs
- repeat(..).take(n) → repeat_n(.., n) in templates.rs
- Add missing `dotenvy = "0.15"` dependency to Cargo.toml
- Allow `clippy::type_complexity` and `clippy::enum_variant_names` at
  module level (benchmark tool, not library code)
- Fix `map_or(false, ...)` → `is_some_and(...)`
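For context, the two mechanical rewrites named in these commits look roughly like this. The functions `is_positive` and `indent` are made-up illustrations, not code from runner.rs, hashing.rs, or templates.rs; the sketch assumes a toolchain of Rust 1.82 or newer, where `std::iter::repeat_n` is stable.

```rust
use std::iter;

/// Before: `opt.map_or(false, |n| n > 0)`.
/// After: `Option::is_some_and` says the same thing more directly.
fn is_positive(opt: Option<i32>) -> bool {
    opt.is_some_and(|n| n > 0)
}

/// Before: `iter::repeat(' ').take(width).collect()`.
/// After: `iter::repeat_n` yields the value exactly `width` times.
fn indent(width: usize) -> String {
    iter::repeat_n(' ', width).collect()
}
```

Both rewrites preserve behavior: `is_some_and` returns `false` for `None`, just as `map_or(false, ...)` did, and `repeat_n(x, n)` produces the same `n` items as `repeat(x).take(n)`.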
@clockwork-labs-bot clockwork-labs-bot added this pull request to the merge queue Mar 1, 2026
Merged via the queue into master with commit efa6f38 Mar 1, 2026
32 of 33 checks passed

3 participants