-
Notifications
You must be signed in to change notification settings - Fork 3
feat: add memory profiling support #39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Merging this PR will degrade performance by 39.4%
Performance Changes
Comparing |
7b0a7b4 to
1b97510
Compare
When running the warmup, we already call `start/stop_benchmark` which creates two memory profiles per benchmark, which breaks our assumptions in the backend.
b158911 to
003a1bf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds memory profiling support to the benchmarking framework by introducing a new "memory" mode alongside the existing "simulation" and "walltime" modes. The changes unify the preprocessor directives and add the necessary configuration and infrastructure for memory profiling.
Changes:
- Renamed
CODSPEED_SIMULATIONpreprocessor macro toCODSPEED_ANALYSISto better represent both simulation and memory profiling modes - Added "memory" as a new valid mode in CMake and Bazel build configurations
- Removed warmup execution logic from simulation mode to reduce flakiness
- Added memory benchmark examples demonstrating various allocation patterns
- Updated CI workflows to test the new memory profiling mode with memtrack CLI installation
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| google_benchmark/src/benchmark_runner.cc | Updated preprocessor directive from CODSPEED_SIMULATION to CODSPEED_ANALYSIS |
| google_benchmark/src/benchmark_api_internal.h | Updated preprocessor directives and removed warmup execution logic |
| google_benchmark/src/benchmark_api_internal.cc | Removed warmup repetition code from simulation mode |
| google_benchmark/src/benchmark.cc | Updated preprocessor directives and console output message |
| google_benchmark/include/benchmark/benchmark.h | Updated preprocessor directives throughout the State class |
| examples/google_benchmark_cmake/memory_bench.hpp | Added new memory profiling benchmark examples with RLE encoding/decoding and allocation patterns |
| examples/google_benchmark_cmake/main.cpp | Added include for new memory benchmark header |
| examples/google_benchmark_bazel/memory_bench.hpp | Added symlink to cmake memory benchmark file |
| core/instrument-hooks | Updated submodule commit reference |
| core/CMakeLists.txt | Added "memory" to allowed modes and updated preprocessor definitions |
| core/BUILD | Added memory_mode configuration and updated preprocessor defines |
| .github/workflows/ci.yml | Added memory mode to test matrix and memtrack CLI installation steps |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| #ifdef CODSPEED_ANALYSIS | ||
| State BenchmarkInstance::RunSimulation( | ||
| codspeed::CodSpeed* codspeed, internal::ThreadTimer* timer, | ||
| internal::ThreadManager* manager, | ||
| internal::PerfCountersMeasurement* perf_counters_measurement, | ||
| ProfilerManager* profiler_manager) const { |
Copilot
AI
Jan 15, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function name 'RunSimulation' is now misleading since CODSPEED_ANALYSIS covers both simulation and memory modes. Consider renaming to 'RunAnalysis' or a more generic name that reflects its broader purpose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be worth it to rename this indeed
| // Determine the width of the name field using a minimum width of 10. | ||
| #ifdef CODSPEED_SIMULATION | ||
| #ifdef CODSPEED_ANALYSIS | ||
| Err << "Codspeed mode: simulation" << "\n"; |
Copilot
AI
Jan 15, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message still says 'simulation' but this code path is now used for both simulation and memory modes under CODSPEED_ANALYSIS. The message should differentiate between these modes or be more generic.
| Err << "Codspeed mode: simulation" << "\n"; | |
| Err << "Codspeed mode: analysis" << "\n"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if printing analysis is the best UX, since we said it should be only the internal build mode. I guess we could forward the real measurement mode and then print that, while also enabling CODSPEED_WALLTIME/CODSPEED_ANALYSIS. wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should stick to never displaying analysis in a user facing way.
Best way to proceed would be to either
- Read from env var, which is what interpreted languages do anyway
- Define a
CODSPEED_MODE_DISPLAYpre proc variable that we use here, and remove this cascading if elif
Most likely the best way to do this is actually the env var, but I'm fine with both
|
@not-matthias the ci fails, can you re-request a review once fixed ? 🙏 |
GuillaumeLagrange
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Olgtm, just wanted input from @art049 regarding the removal of the warmup run in analysis mode
| // Determine the width of the name field using a minimum width of 10. | ||
| #ifdef CODSPEED_SIMULATION | ||
| #ifdef CODSPEED_ANALYSIS | ||
| Err << "Codspeed mode: simulation" << "\n"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should stick to never displaying analysis in a user facing way.
Best way to proceed would be to either
- Read from env var, which is what interpreted languages do anyway
- Define a
CODSPEED_MODE_DISPLAYpre proc variable that we use here, and remove this cascading if elif
Most likely the best way to do this is actually the env var, but I'm fine with both
| #ifdef CODSPEED_ANALYSIS | ||
| State BenchmarkInstance::RunSimulation( | ||
| codspeed::CodSpeed* codspeed, internal::ThreadTimer* timer, | ||
| internal::ThreadManager* manager, | ||
| internal::PerfCountersMeasurement* perf_counters_measurement, | ||
| ProfilerManager* profiler_manager) const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be worth it to rename this indeed
| @@ -0,0 +1 @@ | |||
| ../google_benchmark_cmake/memory_bench.hpp No newline at end of file | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no newline at EOF
| internal::ThreadManager* manager, | ||
| internal::PerfCountersMeasurement* perf_counters_measurement, | ||
| ProfilerManager* profiler_manager) const { | ||
| // Do one repetition to avoid flakiness due to inconcistencies in CPU cache |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@art049 a warmup run is something we do in interpreted language for jit reasons.
We imported this behavior here, since there may a point in unifying the cpu caches state and whatnot, but we currently do not do it in our rust integrations
Would you rather we introduce this in our rust integration, or remove this here ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm yes we should not do it. But this makes this a breaking change
TODO: REmove the custom memtrack installation before merging