LLM benchmark tool updates by bradleyshep · Pull Request #4413 · clockworklabs/SpacetimeDB

bradleyshep · 2026-02-23T21:15:09Z

Description of Changes

LLM benchmark updates for local development:

Local SDK paths: Templates use relative paths to workspace crates (crates/bindings, crates/bindings-csharp, crates/bindings-typescript) instead of published packages, so the bench runs against local SDK changes.
NODEJS_DIR support: On Windows (e.g. nvm4w), if pnpm is not on PATH, the bench uses NODEJS_DIR to locate pnpm and prepends it to PATH for subprocesses.
Refactor: Extracted relative_to_workspace() in templates.rs and removed noisy NODEJS_DIR logging in publishers.rs.
Benchmark results: Updated docs/llms/llm-comparison-details.json and docs/llms/llm-comparison-summary.json.
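
The local SDK path change can be pictured with a small sketch. It uses only the standard library and is not the actual `relative_to_workspace()` from templates.rs, whose signature does not appear in this description. `relative_to`, `cargo_path_dep`, and the directory layout in the usage note are hypothetical.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical helper: express `target` relative to `base`.
/// Assumes both paths are absolute and already normalized (no `.` or `..`).
fn relative_to(base: &Path, target: &Path) -> PathBuf {
    let base: Vec<Component<'_>> = base.components().collect();
    let target: Vec<Component<'_>> = target.components().collect();

    // Length of the shared leading prefix.
    let common = base
        .iter()
        .zip(target.iter())
        .take_while(|(a, b)| a == b)
        .count();

    let mut rel = PathBuf::new();
    // Climb out of what is left of `base`...
    for _ in common..base.len() {
        rel.push("..");
    }
    // ...then descend into what is left of `target`.
    for comp in &target[common..] {
        rel.push(comp.as_os_str());
    }
    rel
}

/// Hypothetical helper: render a Cargo dependency line for a local crate.
/// Components are joined with forward slashes so the line reads the same on Windows.
fn cargo_path_dep(name: &str, rel: &Path) -> String {
    let path = rel
        .components()
        .map(|c| c.as_os_str().to_string_lossy().into_owned())
        .collect::<Vec<_>>()
        .join("/");
    format!("{name} = {{ path = \"{path}\" }}")
}
```

For a generated project three directories below the workspace root, `relative_to` would yield `../../../crates/bindings`, and `cargo_path_dep("spacetimedb", ...)` would render a path dependency on it (assuming the crate in crates/bindings is named `spacetimedb`).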

API and ABI breaking changes

None.

Expected complexity level and risk

2 — Local-only changes to the benchmark tool. Templates now require local SDKs to be built (especially TypeScript: pnpm build in crates/bindings-typescript). No impact on published SDKs or runtime.

Testing

Run `cargo llm run --lang rust --modes docs --providers openai` from repo root
Run TypeScript benchmarks with `pnpm build` in crates/bindings-typescript first
On Windows with nvm4w, set `NODEJS_DIR` if pnpm is not on PATH and run TypeScript benchmarks
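
The NODEJS_DIR fallback exercised by the last step might look roughly like the sketch below. It is an assumption about the approach, not the code in publishers.rs: `on_path`, `prepend_to_path`, and `pnpm_command` are hypothetical names, and resolving `pnpm.cmd` itself on Windows is left out.

```rust
use std::env;
use std::ffi::{OsStr, OsString};
use std::path::Path;
use std::process::Command;

/// True if any directory listed in `path_var` contains a file with one of `names`.
fn on_path(path_var: &OsStr, names: &[&str]) -> bool {
    env::split_paths(path_var).any(|dir| names.iter().any(|n| dir.join(n).is_file()))
}

/// Return `path_var` with `dir` placed in front of the existing entries.
fn prepend_to_path(dir: &Path, path_var: &OsStr) -> OsString {
    let dirs = std::iter::once(dir.to_path_buf()).chain(env::split_paths(path_var));
    env::join_paths(dirs).expect("directory names must not contain the PATH separator")
}

/// Build a `pnpm` command. If pnpm is not found on PATH and NODEJS_DIR is set,
/// the child process gets a PATH with NODEJS_DIR prepended.
/// (Hypothetical: how the executable is resolved on Windows is out of scope here.)
fn pnpm_command() -> Command {
    let path_var = env::var_os("PATH").unwrap_or_default();
    let mut cmd = Command::new("pnpm");
    if !on_path(&path_var, &["pnpm", "pnpm.cmd", "pnpm.exe"]) {
        if let Some(dir) = env::var_os("NODEJS_DIR") {
            cmd.env("PATH", prepend_to_path(Path::new(&dir), &path_var));
        }
    }
    cmd
}
```

Only the subprocess environment is changed here, so the parent process PATH stays untouched.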

Co-authored-by: Cursor <cursoragent@cursor.com>

cloutiertyler

Very low risk because it only touches xtask-llm-benchmark

LLM benchmark: split from llm-benchmark-updates

02f0c9c

Co-authored-by: Cursor <cursoragent@cursor.com>

bradleyshep requested a review from cloutiertyler February 23, 2026 21:15

bradleyshep and others added 2 commits February 23, 2026 16:18

Add Prerequisites section to DEVELOP.md

0bec2aa

Co-authored-by: Cursor <cursoragent@cursor.com>

fmt

a14050b

cloutiertyler added release-2.0-nice-to-have release-2.0 labels Feb 24, 2026

cloutiertyler approved these changes Feb 24, 2026

clockwork-labs-bot enabled auto-merge February 24, 2026 03:14

clockwork-labs-bot and others added 2 commits February 23, 2026 22:32

fix: resolve clippy lints in xtask-llm-benchmark

22cd5e7

- map_or(false, ...) → is_some_and(...) in runner.rs and hashing.rs
- repeat(..).take(n) → repeat_n(.., n) in templates.rs
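
As an illustration of these two rewrites, here is a small sketch. The functions `is_json` and `parent_prefix` are hypothetical and do not come from runner.rs, hashing.rs, or templates.rs. `Option::is_some_and` has been stable since Rust 1.70 and `std::iter::repeat_n` since Rust 1.82.

```rust
use std::iter;
use std::path::Path;

/// `Option::is_some_and` replaces `map_or(false, ...)` with the same result.
fn is_json(path: &Path) -> bool {
    // Before: path.extension().map_or(false, |ext| ext == "json")
    path.extension().is_some_and(|ext| ext == "json")
}

/// `iter::repeat_n` replaces `repeat(..).take(n)` with the same items.
fn parent_prefix(depth: usize) -> String {
    // Before: iter::repeat("..").take(depth).collect::<Vec<_>>().join("/")
    iter::repeat_n("..", depth).collect::<Vec<_>>().join("/")
}
```

Both forms behave identically to the code they replace; the change only satisfies the clippy lints.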

Fix lint failures: add dotenvy dep, suppress clippy warnings

9c61499

- Add missing `dotenvy = "0.15"` dependency to Cargo.toml
- Allow `clippy::type_complexity` and `clippy::enum_variant_names` at module level (benchmark tool, not library code)
- Fix `map_or(false, ...)` → `is_some_and(...)`

clockwork-labs-bot added this pull request to the merge queue Mar 1, 2026

Merged via the queue into master with commit efa6f38 Mar 1, 2026
32 of 33 checks passed

clockwork-labs-bot merged 5 commits into master from llm-benchmark-only
