rdw/embedding_cache

EmbeddingCache

Test-time local caching of embedding computations. It speeds up test suites for projects that make heavy use of embeddings by replacing repeated API calls with local lookups.

EmbeddingCache intercepts embedding requests during your test suite, records the real API responses to YAML fixture files, and replays them on subsequent runs — so your tests stay fast, deterministic, and free of network dependencies.

Compared to VCR

The main reasons to use EmbeddingCache over the VCR gem are efficiency and convenience.

Embedding cassettes are stored in compressed form, and each embedding is stored only once per model, even if tests compute it many times.

EmbeddingCache acts as a memoization layer that returns a previously-computed embedding value when requested again. VCR enforces a linear replay of events: it raises exceptions when the order varies or a request is skipped. That might be what you want! But if what you want is fast, deterministic, offline tests, EmbeddingCache is the right tool.

It is perfectly reasonable to use a single EmbeddingCache cassette for the entire test suite; it consolidates and compresses every embedding that any test uses.
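The memoization behavior described above can be sketched roughly as follows. This is an illustration only, not the gem's implementation: MemoStore, its fetch method, and the fake_embed lambda are hypothetical names. The point is that lookups are keyed by model and text, so request order does not matter and each value is kept once.

```ruby
# Hypothetical illustration of order-independent memoization.
# MemoStore is NOT part of EmbeddingCache's API.
class MemoStore
  attr_reader :misses

  def initialize
    @entries = {} # [model, text] => embedding vector
    @misses = 0   # how many times the block (the "real" call) ran
  end

  # Returns the stored vector for (model, text), computing it once via the block.
  def fetch(model, text)
    key = [model, text]
    return @entries[key] if @entries.key?(key)

    @misses += 1
    @entries[key] = yield
  end

  def size
    @entries.size
  end
end

store = MemoStore.new
fake_embed = ->(text) { [text.length.to_f, 0.5] } # stand-in for a real API call

# Requests arrive in an arbitrary order, with one repeat.
store.fetch('text-embedding-3-small', 'bye')         { fake_embed.call('bye') }
store.fetch('text-embedding-3-small', 'hello world') { fake_embed.call('hello world') }
store.fetch('text-embedding-3-small', 'bye')         { fake_embed.call('bye') }
```

A linear-replay tool would treat the repeated or reordered request as a mismatch; a keyed store just returns the value it already has.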

Installation

Add to your Gemfile:

gem 'embedding_cache'

Then run bundle install.

Usage

Wrap any test that makes embedding calls in an EmbeddingCache.use block:

EmbeddingCache.use('my_fixture') do
  RubyLLM::Embedding.embed('hello world', model: 'text-embedding-3-small')
end

On the first run, EmbeddingCache makes the real API call and records the response. On subsequent runs, it returns the recorded embedding without hitting the network.

To wrap an entire RSpec suite in a single cassette:

# in spec_helper.rb
RSpec.configure do |config|
  config.around(:each) do |example|
    EmbeddingCache.use("global", record: :new_embeddings) do
      example.run
    end
  end
end

Nested Blocks

You can nest use blocks; the inner block overrides the outer one until the inner block exits.

EmbeddingCache.use('outer') do
  RubyLLM::Embedding.embed('hello world', model: 'text-embedding-3-small')
  EmbeddingCache.use('override') do
    RubyLLM::Embedding.embed('bye', model: 'text-embedding-3-small')
  end
  RubyLLM::Embedding.embed('final', model: 'text-embedding-3-small')
end

In the above example, the embedding for "bye" would go into "override.yaml", and the other two embeddings would go into "outer.yaml".
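One way to picture the override behavior is a stack of active fixture names, where the innermost name wins and is popped when its block exits. The sketch below is a hypothetical model, not EmbeddingCache's code: FixtureScope and the thread-local key are invented for illustration, and keeping the stack per thread is an assumption.

```ruby
# Hypothetical model of nested fixture scopes; not EmbeddingCache's real code.
module FixtureScope
  # Pushes a fixture name for the duration of the block, then restores the previous one.
  def self.use(name)
    stack = (Thread.current[:fixture_scope_stack] ||= [])
    stack.push(name)
    yield
  ensure
    stack.pop if stack
  end

  # The innermost active fixture name, or nil outside any block.
  def self.current
    (Thread.current[:fixture_scope_stack] || []).last
  end
end

recorded = [] # pairs of [text, fixture name that would receive it]
FixtureScope.use('outer') do
  recorded << ['hello world', FixtureScope.current]
  FixtureScope.use('override') do
    recorded << ['bye', FixtureScope.current]
  end
  recorded << ['final', FixtureScope.current]
end
```

The ensure clause restores the outer fixture even if the inner block raises, which is what lets the outer block carry on with 'final'.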

Supported Libraries

EmbeddingCache auto-detects which embedding libraries are loaded and patches them automatically at test time:

  • ruby_llm — intercepts RubyLLM::Embedding.embed
  • ollama-ai — intercepts Ollama::Controllers::Client#embeddings
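The README does not say how the patching is done. A common Ruby technique for intercepting a method is Module#prepend, sketched below with stand-in classes. FakeEmbeddingClient and CachingInterceptor are hypothetical names, and the defined? guard only mimics the idea of patching a library when it is loaded.

```ruby
# Hypothetical client standing in for a real embedding library.
class FakeEmbeddingClient
  def embed(text, model:)
    raise 'network call attempted' # a real client would hit the API here
  end
end

# Hypothetical interceptor. Prepending places it ahead of the original method
# in the lookup chain, so it sees every call first.
module CachingInterceptor
  CACHE = { ['text-embedding-3-small', 'hello world'] => [0.1, 0.2] }.freeze

  def embed(text, model:)
    # Return the cached vector, or fall through to the original method on a miss.
    CACHE.fetch([model, text]) { super(text, model: model) }
  end
end

# Patch only when the target library is actually loaded.
FakeEmbeddingClient.prepend(CachingInterceptor) if defined?(FakeEmbeddingClient)

client = FakeEmbeddingClient.new
hit = client.embed('hello world', model: 'text-embedding-3-small')
```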

Recording Modes

Control how fixtures are recorded with the record: option:

EmbeddingCache.use('my_fixture', record: :once) do
  # ...
end

Mode | Behavior
:once (default) | Records on the first run when no fixture exists. Replays on subsequent runs. Raises an error if a new (unrecorded) embedding is requested against an existing fixture. This behavior matches VCR's.
:redo | Clears the existing fixture and re-records everything from scratch unconditionally.
:new_embeddings | Replays existing recordings and records any new embedding requests, merging them into the fixture.
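The three modes can be read as a small decision table. The sketch below restates the table above in code. It is not the gem's implementation: action_for, its keyword arguments, and the UnhandledRequest class are hypothetical, and recorded means "the fixture on disk already held this embedding before the run".

```ruby
# Hypothetical stand-in for EmbeddingCache::UnhandledEmbedRequestError.
class UnhandledRequest < StandardError; end

# Returns :replay or :record for one embedding request, or raises.
# Illustrative only; the names and arguments are assumptions.
def action_for(mode, fixture_exists:, recorded:)
  case mode
  when :redo
    :record # the fixture is cleared first, so everything is re-recorded
  when :new_embeddings
    recorded ? :replay : :record # replay hits, record and merge misses
  when :once
    return :replay if recorded
    return :record unless fixture_exists # first run: nothing on disk yet

    raise UnhandledRequest, 'embedding not found in existing fixture'
  else
    raise ArgumentError, "unknown record mode: #{mode.inspect}"
  end
end

first_run = action_for(:once, fixture_exists: false, recorded: false)
later_run = action_for(:once, fixture_exists: true, recorded: true)
```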

You can also set a global default:

EmbeddingCache.default_record = :new_embeddings

Custom Fixture Directory

By default, fixtures are stored in spec/fixtures/embeddings/. Override this per call:

EmbeddingCache.use('my_fixture', fixture_dir: 'test/cassettes') do
  # ...
end

Fixture Storage

Fixtures are stored as YAML files at <fixture_dir>/<fixture_name>.yaml. Commit these files to version control alongside your tests.

Each file contains an array of recorded embeddings:

---
- text: hello world
  model: text-embedding-3-small
  adapter: ruby_llm
  embedding: 1eJyLNtAz1DHQM4oFAAoLAgQ=

Embedding vectors are compressed (zlib + base64) to keep fixture files compact. The 1 prefix is a format version marker.
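To inspect a fixture by hand, the encoding can be reversed with Ruby's standard library. The sketch below is a best guess rather than the gem's own decoder: decode_embedding is a hypothetical helper, and it assumes the inflated payload is a JSON array, which is an inference from the sample value above and not something the README states.

```ruby
require 'json'
require 'yaml'
require 'zlib'

# Decodes a fixture `embedding` string: a version marker, then base64 of zlib data.
# Assumption: the inflated payload is a JSON array of floats.
def decode_embedding(encoded)
  version = encoded[0]
  raise ArgumentError, "unsupported format version: #{version}" unless version == '1'

  compressed = encoded[1..].unpack1('m') # 'm' decodes base64
  JSON.parse(Zlib::Inflate.inflate(compressed))
end

fixture = YAML.safe_load(<<~YAML)
  ---
  - text: hello world
    model: text-embedding-3-small
    adapter: ruby_llm
    embedding: 1eJyLNtAz1DHQM4oFAAoLAgQ=
YAML

vector = decode_embedding(fixture.first['embedding'])
```

The sample value is clearly a shortened placeholder: real embedding vectors have hundreds or thousands of dimensions.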

Once Mode

In :once mode, when a fixture exists but doesn't contain a requested embedding, EmbeddingCache raises EmbeddingCache::UnhandledEmbedRequestError with details about the missing embedding and the fixture file path.

Thread Safety

EmbeddingCache is intended to be safe to use in multi-threaded test environments. Please report any bugs found.

Development

This project uses mise-en-place to manage Ruby versions and run small tasks. Install it first for your platform, then run:

mise install
mise setup
mise rspec

Note that the tests make real requests to Gemini and to a local Ollama instance. To run the full suite, put a GEMINI_API_KEY in a .env file and start Ollama.
