EmbeddingCache

EmbeddingCache caches embedding computations locally at test time. It speeds up test suites for projects that make heavy use of embeddings by replaying recorded results instead of calling the embedding API on every run.

EmbeddingCache intercepts embedding requests during your test suite, records the real API responses to YAML fixture files, and replays them on subsequent runs, so your tests stay fast, deterministic, and free of network dependencies.

Compared to VCR

The main reasons to use EmbeddingCache over the VCR gem are efficiency and convenience.

Embedding cassettes are stored in compressed form, and if the same text is embedded multiple times with the same model, the cassette stores only one copy of the value.

EmbeddingCache acts as a memoization layer: it returns a previously computed embedding whenever the same request is made again. VCR, by contrast, enforces a linear replay of events and raises exceptions when the order varies or a request is skipped. That might be what you want! But if what you want is fast, deterministic, offline tests, EmbeddingCache is the right tool.

It's perfectly reasonable to have one EmbeddingCache cassette for the entire test suite and it will consolidate and compress all of the embeddings any of the tests use.
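
The memoization idea can be illustrated with a small, self-contained sketch. `MemoStore` and `fake_embed` are hypothetical names invented for this example and are not part of EmbeddingCache; the sketch only assumes that entries are keyed by model and text, so request order and repetition do not matter.

```ruby
# Hypothetical illustration only: not EmbeddingCache's actual internals.
class MemoStore
  attr_reader :misses

  def initialize
    @entries = {} # [model, text] => embedding vector
    @misses = 0   # how many times the "real" computation ran
  end

  # Returns the stored vector for (model, text), running the block only
  # the first time that pair is seen.
  def fetch(model, text)
    @entries.fetch([model, text]) do
      @misses += 1
      @entries[[model, text]] = yield
    end
  end

  def size
    @entries.size
  end
end

store = MemoStore.new
fake_embed = ->(text) { [text.length.to_f, 0.0] } # stand-in for a real API call

# Requests may arrive in any order and repeat; each pair is computed once.
a = store.fetch('text-embedding-3-small', 'hello world') { fake_embed.call('hello world') }
b = store.fetch('text-embedding-3-small', 'bye') { fake_embed.call('bye') }
c = store.fetch('text-embedding-3-small', 'hello world') { raise 'should not recompute' }
```

Here the third request is answered from the store without running its block, and only two entries are kept for three requests.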

Installation

Add to your Gemfile:

gem 'embedding_cache'

Then run bundle install.

Usage

Wrap any test that makes embedding calls in an EmbeddingCache.use block:

EmbeddingCache.use('my_fixture') do
  RubyLLM::Embedding.embed('hello world', model: 'text-embedding-3-small')
end

On the first-ever run, EmbeddingCache makes the real API call and records the response. On subsequent runs, it returns the recorded embedding without hitting the network.

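To wrap the entire suite in a single cassette with RSpec:

# in spec_helper.rb
RSpec.configure do |config|
  # Every example shares the "global" cassette; new embeddings are merged in.
  config.around(:each) do |example|
    EmbeddingCache.use('global', record: :new_embeddings) do
      example.run
    end
  end
end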

Nested Blocks

It is possible to nest use blocks; the inner block simply overrides the outer block until it exits.

EmbeddingCache.use('outer') do
  RubyLLM::Embedding.embed('hello world', model: 'text-embedding-3-small')
  EmbeddingCache.use('override') do
    RubyLLM::Embedding.embed('bye', model: 'text-embedding-3-small')
  end
  RubyLLM::Embedding.embed('final', model: 'text-embedding-3-small')
end

In the above example, the embedding for "bye" would go into "override.yaml", and the other two embeddings would go into "outer.yaml".
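
One way to picture the "innermost block wins" rule is a stack of active fixture names. The sketch below is a hypothetical model of that behavior, not EmbeddingCache's implementation; `NestedFixtures`, `record`, and `writes` are names made up for illustration.

```ruby
# Hypothetical sketch of nested-block resolution; names are illustrative.
module NestedFixtures
  @stack = []                              # active fixture names, innermost last
  @writes = Hash.new { |h, k| h[k] = [] }  # file name => texts recorded into it

  class << self
    attr_reader :writes

    # Activates a fixture for the duration of the block, then restores the
    # previous one even if the block raises.
    def use(name)
      @stack.push(name)
      yield
    ensure
      @stack.pop
    end

    # Records text under whichever fixture is currently innermost.
    def record(text)
      @writes["#{@stack.last}.yaml"] << text
    end
  end
end

NestedFixtures.use('outer') do
  NestedFixtures.record('hello world')
  NestedFixtures.use('override') do
    NestedFixtures.record('bye')
  end
  NestedFixtures.record('final')
end
```

After this runs, `NestedFixtures.writes` maps "outer.yaml" to "hello world" and "final", and "override.yaml" to "bye", mirroring the example above.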

Supported Libraries

EmbeddingCache detects which supported embedding libraries are loaded and patches them automatically at test time:

ruby_llm — intercepts RubyLLM::Embedding.embed
ollama-ai — intercepts Ollama::Controllers::Client#embeddings
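
As a rough illustration of what "detect and patch" can look like in Ruby, the sketch below intercepts a class method with `Module#prepend`. Whether EmbeddingCache uses this exact mechanism is an assumption; `FakeLLM` and `EmbedInterceptor` are hypothetical stand-ins so the example runs without any gems.

```ruby
# Stand-in for a third-party library (hypothetical, so the sketch runs alone).
module FakeLLM
  class Embedding
    def self.embed(text, model:)
      raise "network call for #{text.inspect} (#{model})"
    end
  end
end

# Hypothetical interceptor: answers from a hash of recorded values and falls
# through to the original method (via super) on a miss.
module EmbedInterceptor
  RECORDED = { ['text-embedding-3-small', 'hello world'] => [0.1, 0.2] }.freeze

  def embed(text, model:)
    RECORDED.fetch([model, text]) { super }
  end
end

# Patch the library only if it is actually loaded.
FakeLLM::Embedding.singleton_class.prepend(EmbedInterceptor) if defined?(FakeLLM::Embedding)

hit = FakeLLM::Embedding.embed('hello world', model: 'text-embedding-3-small')
```

The recorded request returns without touching the original method, while an unrecorded one would still reach it.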

Recording Modes

Control how fixtures are recorded with the record: option:

EmbeddingCache.use('my_fixture', record: :once) do
  # ...
end

Mode	Behavior
`:once` (default)	Records on the first run when no fixture exists. Replays on subsequent runs. Raises an error if a new (unrecorded) embedding is requested against an existing fixture. This behavior matches VCR's.
`:redo`	Clears the existing fixture and re-records everything from scratch unconditionally.
`:new_embeddings`	Replays existing recordings and records any new embedding requests, merging them into the fixture.

You can also set a global default:

EmbeddingCache.default_record = :new_embeddings
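
The mode table can be read as a small decision function. The helper below is a hypothetical restatement of that table for illustration, not part of the EmbeddingCache API; `action_for`, `fixture_exists`, and `recorded` are assumed names.

```ruby
# Hypothetical decision helper restating the mode table; not gem API.
# fixture_exists: whether the fixture file was present when the block started
# recorded:       whether the requested embedding is already in that fixture
def action_for(mode, fixture_exists:, recorded:)
  case mode
  when :once
    if !fixture_exists then :record   # first run: record everything
    elsif recorded then :replay
    else :raise                       # new request against an existing fixture
    end
  when :redo
    :record                           # old fixture is discarded up front
  when :new_embeddings
    recorded ? :replay : :record      # merge new requests into the fixture
  else
    raise ArgumentError, "unknown record mode: #{mode.inspect}"
  end
end

first_run = action_for(:once, fixture_exists: false, recorded: false)
strict_miss = action_for(:once, fixture_exists: true, recorded: false)
merged_miss = action_for(:new_embeddings, fixture_exists: true, recorded: false)
```

The only case that raises is `:once` with an existing fixture and an unrecorded request, which is why `:new_embeddings` is the convenient choice for a suite-wide cassette.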

Custom Fixture Directory

By default, fixtures are stored in spec/fixtures/embeddings/. Override this per call:

EmbeddingCache.use('my_fixture', fixture_dir: 'test/cassettes') do
  # ...
end

Fixture Storage

Fixtures are stored as YAML files at <fixture_dir>/<fixture_name>.yaml. Commit these files to version control alongside your tests.

Each file contains an array of recorded embeddings:

---
- text: hello world
  model: text-embedding-3-small
  adapter: ruby_llm
  embedding: 1eJyLNtAz1DHQM4oFAAoLAgQ=

Embedding vectors are compressed (zlib + base64) to keep fixture files compact. The 1 prefix is a format version marker.
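
To inspect a stored value by hand, the encoding can be reversed with Ruby's standard library. This is a sketch rather than the gem's own decoder: `decode_embedding` is a hypothetical helper, and parsing the inflated payload as JSON is an assumption based on the sample value above, which inflates to a plain numeric array.

```ruby
require 'json'
require 'zlib'

# Hypothetical helper: reverses "version prefix + base64 + zlib".
# Treating the inflated payload as JSON is an assumption drawn from the
# sample fixture value, not a documented guarantee.
def decode_embedding(encoded)
  version = encoded[0]
  raise ArgumentError, "unsupported format version: #{version}" unless version == '1'

  compressed = encoded[1..].unpack1('m0') # strict base64 decode
  JSON.parse(Zlib::Inflate.inflate(compressed))
end

vector = decode_embedding('1eJyLNtAz1DHQM4oFAAoLAgQ=')
# => [0.1, 0.2]
```

The sample fixture entry therefore holds a two-element vector, which is convenient for checking a fixture without running the test suite.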

Once Mode

In :once mode, when a fixture exists but doesn't contain a requested embedding, EmbeddingCache raises EmbeddingCache::UnhandledEmbedRequestError with details about the missing embedding and the fixture file path.

Thread Safety

EmbeddingCache is intended to be safe to use in multi-threaded test environments. Please report any bugs found.

Development

This project uses mise-en-place to manage Ruby versions and run small tasks. Install it first for your platform, then run:

mise install
mise setup
mise rspec

Note that the tests make real requests to Gemini and to a local Ollama instance. For the full suite to run, put a GEMINI_API_KEY in a .env file and start Ollama.
