Test-time local caching of embedding computations. It tremendously speeds up test suites for projects that make heavy use of embeddings.
EmbeddingCache intercepts embedding requests during your test suite, records the real API responses to YAML fixture files, and replays them on subsequent runs — so your tests stay fast, deterministic, and free of network dependencies.
The main reasons to use EmbeddingCache over the VCR gem are efficiency and convenience.
The embedding cassettes are stored in compressed form, and if the same embedding is computed multiple times for the same model, it only stores one copy of the value.
EmbeddingCache acts as a memoization layer that returns a previously-computed embedding value when requested again. VCR enforces a linear replay of events: it raises exceptions when the order varies or a request is skipped. That might be what you want! But if what you want is fast, deterministic, offline tests, EmbeddingCache is the right tool.
It's perfectly reasonable to use a single EmbeddingCache cassette for the entire test suite; it will consolidate and compress all of the embeddings that any of the tests use.
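To make the contrast with linear replay concrete, here is a minimal sketch of a memoization layer keyed by model and text. The class name `MemoizedEmbeddings` and the `fake_api` lambda are hypothetical and only illustrate the idea; this is not EmbeddingCache's actual implementation.

```ruby
# Hypothetical sketch; not EmbeddingCache's actual implementation.
class MemoizedEmbeddings
  attr_reader :misses

  def initialize
    @store = {}  # [model, text] => embedding vector
    @misses = 0  # how many times the block (the "real" call) ran
  end

  # Returns the stored vector for (model, text), or runs the block once
  # and remembers its result.
  def fetch(model, text)
    key = [model, text]
    return @store[key] if @store.key?(key)

    @misses += 1
    @store[key] = yield
  end

  def size
    @store.size
  end
end

memo = MemoizedEmbeddings.new
fake_api = ->(text) { [text.length.to_f, 0.5] } # stand-in for a network call

# Request order does not matter, and repeats are served from the store.
memo.fetch('text-embedding-3-small', 'bye')         { fake_api.call('bye') }
memo.fetch('text-embedding-3-small', 'hello world') { fake_api.call('hello world') }
memo.fetch('text-embedding-3-small', 'bye')         { fake_api.call('bye') }
```

After these three requests the store holds two entries and the stand-in API was called twice, which is the deduplication described above.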
Add to your Gemfile:
gem 'embedding_cache'
Then run bundle install.
Wrap any test that makes embedding calls in an EmbeddingCache.use block:
EmbeddingCache.use('my_fixture') do
  RubyLLM::Embedding.embed('hello world', model: 'text-embedding-3-small')
end
On the first-ever run, EmbeddingCache makes the real API call and records the response. On subsequent runs, it returns the recorded embedding without hitting the network.
To wrap the entire suite in a single cassette, for RSpec:
# in spec_helper.rb
RSpec.configure do |config|
  config.around(:each) do |example|
    EmbeddingCache.use("global", record: :new_embeddings) do
      example.run
    end
  end
end
It is possible to nest use blocks; the inner block simply overrides the outer block until it exits.
EmbeddingCache.use('outer') do
  RubyLLM::Embedding.embed('hello world', model: 'text-embedding-3-small')
  EmbeddingCache.use('override') do
    RubyLLM::Embedding.embed('bye', model: 'text-embedding-3-small')
  end
  RubyLLM::Embedding.embed('final', model: 'text-embedding-3-small')
end
In the above example, the embedding for "bye" would go into "override.yaml", and the other two embeddings would go into "outer.yaml".
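One way to picture this routing is a stack of active fixture names, where the innermost name wins. The sketch below is an assumption about how such scoping could work, using a hypothetical `FixtureScope` module; it is not taken from the gem's source.

```ruby
# Hypothetical sketch of nested fixture scoping; names are illustrative.
module FixtureScope
  # Thread.current[] storage is fiber-local, so each thread sees its own
  # stack. The last entry is the innermost active fixture.
  def self.stack
    Thread.current[:fixture_scope_stack] ||= []
  end

  # Activates +name+ for the duration of the block, then restores the
  # previous fixture even if the block raises.
  def self.use(name)
    stack.push(name)
    yield
  ensure
    stack.pop
  end

  def self.current
    stack.last
  end
end

routed = {} # text => fixture name that was active when it was requested

FixtureScope.use('outer') do
  routed['hello world'] = FixtureScope.current
  FixtureScope.use('override') do
    routed['bye'] = FixtureScope.current
  end
  routed['final'] = FixtureScope.current
end
```

Here `routed` maps "bye" to "override" and the other two texts to "outer", mirroring the example above.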
EmbeddingCache auto-detects which embedding libraries are loaded and patches them automatically at test time:
- ruby_llm: intercepts RubyLLM::Embedding.embed
- ollama-ai: intercepts Ollama::Controllers::Client#embeddings
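A common Ruby technique for this kind of interception is to check whether the library's constant is defined and then prepend a module in front of the method. The sketch below shows that general pattern against a made-up `FakeLib::Embedding` class; both `FakeLib` and `EmbedInterceptor` are hypothetical, and the gem's real patching code may differ.

```ruby
# Hypothetical stand-in for a third-party embedding library.
module FakeLib
  class Embedding
    def self.embed(text, model:)
      [text.length.to_f] # pretend this is an expensive network call
    end
  end
end

# Illustrative interceptor; not the gem's real patch.
module EmbedInterceptor
  CACHE = {} # [model, text] => embedding vector

  # Serves a cached vector when present; otherwise calls the original
  # method via super and stores its result.
  def embed(text, model:)
    CACHE[[model, text]] ||= super
  end
end

# Patch only if the library is actually loaded.
if defined?(FakeLib::Embedding)
  FakeLib::Embedding.singleton_class.prepend(EmbedInterceptor)
end

first  = FakeLib::Embedding.embed('hello', model: 'demo-model')
second = FakeLib::Embedding.embed('hello', model: 'demo-model')
```

The second call returns the very same array object as the first, because it never reaches the original method.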
Control how fixtures are recorded with the record: option:
EmbeddingCache.use('my_fixture', record: :once) do
  # ...
end
| Mode | Behavior |
|---|---|
| :once (default) | Records on the first run when no fixture exists. Replays on subsequent runs. Raises an error if a new (unrecorded) embedding is requested against an existing fixture. This behavior matches VCR's. |
| :redo | Clears the existing fixture and re-records everything from scratch unconditionally. |
| :new_embeddings | Replays existing recordings and records any new embedding requests, merging them into the fixture. |
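The three modes can be summarized as a small decision procedure. The following is a simplified model under stated assumptions: the `run_with_mode` helper and the `UnrecordedEmbedding` error are hypothetical, a fixture is represented as a plain Hash, and the "API call" is a stand-in. It is meant to illustrate the semantics in the table, not to mirror the gem's code.

```ruby
# Hypothetical sketch of the three record modes; not the gem's real code.
class UnrecordedEmbedding < StandardError; end

# fixture:  Hash of text => vector, or nil when no fixture file exists yet.
# requests: texts asked for during the run.
# Returns the fixture contents as they would be after the run.
def run_with_mode(mode, fixture, requests)
  existed = !fixture.nil?
  # :redo always starts empty; otherwise start from the existing recordings.
  store = (mode == :redo || !existed) ? {} : fixture.dup

  requests.each do |text|
    next if store.key?(text) # replay a recorded value

    if mode == :once && existed
      raise UnrecordedEmbedding, "no recording for #{text.inspect}"
    end

    store[text] = [text.length.to_f] # stand-in for the real API call
  end
  store
end

existing = { 'hello' => [9.9] }

first_run  = run_with_mode(:once, nil, ['hello'])                      # records
merged     = run_with_mode(:new_embeddings, existing, ['hello', 'bye']) # merges
rerecorded = run_with_mode(:redo, existing, ['hello'])                  # replaces
```

Calling `run_with_mode(:once, existing, ['bye'])` raises in this model, because the fixture exists but has no recording for "bye".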
You can also set a global default:
EmbeddingCache.default_record = :new_embeddings
By default, fixtures are stored in spec/fixtures/embeddings/. Override this per call:
EmbeddingCache.use('my_fixture', fixture_dir: 'test/cassettes') do
  # ...
end
Fixtures are stored as YAML files at <fixture_dir>/<fixture_name>.yaml. Make sure to check these files in with your tests.
Each file contains an array of recorded embeddings:
---
- text: hello world
  model: text-embedding-3-small
  adapter: ruby_llm
  embedding: 1eJyLNtAz1DHQM4oFAAoLAgQ=
Embedding vectors are compressed (zlib + base64) to keep fixture files compact. The 1 prefix is a format version marker.
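As an illustration of this encoding, the sketch below assumes the payload is a JSON array of floats that is zlib-deflated, Base64-encoded, and prefixed with the version marker. The JSON step and the helper names `encode_embedding` and `decode_embedding` are assumptions, not documented API; under that assumption the sample value above decodes to [0.1, 0.2].

```ruby
require 'json'
require 'zlib'

# Assumed layout: "1" version marker + Base64(zlib(JSON array of floats)).
FORMAT_VERSION = '1'

# pack('m0') / unpack1('m0') are Ruby's built-in strict Base64 codecs,
# so no extra library is needed for that step.
def encode_embedding(vector)
  compressed = Zlib::Deflate.deflate(JSON.generate(vector))
  FORMAT_VERSION + [compressed].pack('m0')
end

def decode_embedding(encoded)
  version = encoded[0]
  unless version == FORMAT_VERSION
    raise ArgumentError, "unknown format version #{version.inspect}"
  end

  compressed = encoded[1..].unpack1('m0')
  JSON.parse(Zlib::Inflate.inflate(compressed))
end

sample   = decode_embedding('1eJyLNtAz1DHQM4oFAAoLAgQ=')
vector   = [0.25, -1.5, 3.0]
restored = decode_embedding(encode_embedding(vector))
```

The exact bytes produced by `encode_embedding` can vary with the zlib version and compression level, so only the round trip should be relied on, not byte-for-byte equality with a stored fixture.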
In :once mode, when a fixture exists but doesn't contain a requested embedding, EmbeddingCache raises EmbeddingCache::UnhandledEmbedRequestError with details about the missing embedding and the fixture file path.
EmbeddingCache is intended to be safe to use in multi-threaded test environments. Please report any bugs found.
This project uses mise-en-place to manage Ruby versions and run small tasks. Install it first for your platform, then run:
mise install
mise setup
mise rspec
Note that the tests really do make requests to Gemini and to a local Ollama server. For the full suite to run, put a GEMINI_API_KEY in a .env file and have Ollama running.