Dkorzekwa/anymodel subblock stats#1085
Conversation
- Add converter, model_descriptor, puzzformer, and llama model support
- Selective merge of anymodel functionality

Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
…s merged) Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
📝 Walkthrough

The PR refactors teacher configuration loading and memory/parameter computation by introducing a descriptor-driven approach.
Sequence Diagram

```mermaid
sequenceDiagram
    actor Caller
    participant sweep.py
    participant ModelDescriptorFactory
    participant load_model_config
    participant OmegaConf as OmegaConf
    participant filter_fn as filter_subblock_stats_by_args
    participant json as subblock_stats.json
    Caller->>sweep.py: _load_teacher_subblock_stats(hydra_cfg)
    sweep.py->>ModelDescriptorFactory: get_model_descriptor(teacher_name)
    ModelDescriptorFactory-->>sweep.py: descriptor
    sweep.py->>load_model_config: load_model_config(descriptor)
    load_model_config-->>sweep.py: model_config (hidden_size)
    sweep.py->>OmegaConf: to_container(subblock_stats_args[0])
    OmegaConf-->>sweep.py: normalized_args
    sweep.py->>sweep.py: inject n_embd=hidden_size into normalized_args
    sweep.py->>json: load subblock_stats_list
    json-->>sweep.py: subblock_stats_list
    sweep.py->>filter_fn: filter_subblock_stats_by_args(list, normalized_args)
    filter_fn-->>sweep.py: filtered_subblock_stats (unique match)
    sweep.py-->>Caller: (subblock_stats, model_config)
```
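As a rough Python sketch of the loading flow in the diagram (function names, signatures, and the filter semantics here are illustrative assumptions, not the actual `sweep.py` API):

```python
import json


def filter_subblock_stats_by_args(stats_list, args):
    # Hypothetical filter: keep entries whose shared keys all match `args`.
    return [
        stats
        for stats in stats_list
        if all(stats[key] == value for key, value in args.items() if key in stats)
    ]


def load_teacher_subblock_stats(stats_path, subblock_stats_args, hidden_size):
    # Illustrative counterpart of _load_teacher_subblock_stats: normalize the
    # scenario args, inject the teacher's hidden size, and require a unique match.
    normalized_args = dict(subblock_stats_args)
    normalized_args["n_embd"] = hidden_size
    with open(stats_path) as f:
        subblock_stats_list = json.load(f)
    filtered = filter_subblock_stats_by_args(subblock_stats_list, normalized_args)
    if len(filtered) != 1:
        raise ValueError(f"Expected exactly one matching stats entry, got {len(filtered)}")
    return filtered[0]
```

This mirrors the "unique match" step in the diagram: anything other than exactly one surviving entry is treated as an error rather than silently picking a row.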
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ❌ 1 failed (inconclusive) | ✅ 3 passed
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@modelopt/torch/puzzletron/mip/sweep.py`:
- Around line 94-95: The code currently defaults subblock_stats.get("non_block",
{}).get("memory_mib", 0.0) (and similar for num_params) to zero which masks
malformed subblock_stats; change the logic in sweep.py to fail fast: check that
subblock_stats contains the "non_block" key and that "memory_mib" and
"num_params" exist (raise a clear ValueError or RuntimeError including subblock
identifier and the offending dict when missing) before computing
total_memory/teacher_memory, and apply the same existence checks for per-layer
accesses of block_stats[...] so malformed JSON surfaces immediately rather than
silently using zeros.
- Around line 47-48: The code assumes hydra_cfg.mip.subblock_stats_args is a
singleton but always uses index [0] (used by the teacher-total helpers and
run_mip_sweep), so add a fail-fast guard that asserts this is length 1 before
converting: check len(hydra_cfg.mip.subblock_stats_args) == 1 and raise a clear
ValueError (or AssertionError) if not, so the code in sweep.py (where
mip_subblock_args and subblock_stats_args are derived) explicitly enforces the
singleton assumption and prevents silently using the wrong scenario.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 176ee5b2-5938-4ce8-8d70-e6b89168b1a4
📒 Files selected for processing (2)
- modelopt/torch/puzzletron/mip/sweep.py
- tests/gpu/torch/puzzletron/test_puzzletron.py
```python
mip_subblock_args = hydra_cfg.mip.subblock_stats_args[0]
batch_size = mip_subblock_args["batch_size"]
weights_dtype = str(mip_subblock_args["weights_dtype"])
activations_dtype = str(mip_subblock_args["activations_dtype"])
kv_cache_dtype = str(mip_subblock_args["kv_cache_dtype"])
subblock_stats_args = OmegaConf.to_container(mip_subblock_args, resolve=True)
```
🛠️ Refactor suggestion | 🟠 Major
Assert the singleton subblock_stats_args assumption here.
Both teacher-total helpers always read hydra_cfg.mip.subblock_stats_args[0]. If a config ever carries more than one measurement scenario, these totals come from the wrong row and run_mip_sweep() will derive the wrong teacher baseline. Either thread the intended scenario through explicitly or fail fast unless this list is a singleton.
♻️ Suggested guard

```diff
-mip_subblock_args = hydra_cfg.mip.subblock_stats_args[0]
+if len(hydra_cfg.mip.subblock_stats_args) != 1:
+    raise ValueError(
+        "Expected exactly one mip.subblock_stats_args entry; pass the desired scenario explicitly."
+    )
+mip_subblock_args = hydra_cfg.mip.subblock_stats_args[0]
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
if len(hydra_cfg.mip.subblock_stats_args) != 1:
    raise ValueError(
        "Expected exactly one mip.subblock_stats_args entry; pass the desired scenario explicitly."
    )
mip_subblock_args = hydra_cfg.mip.subblock_stats_args[0]
subblock_stats_args = OmegaConf.to_container(mip_subblock_args, resolve=True)
```
```python
total_memory = subblock_stats.get("non_block", {}).get("memory_mib", 0.0)
```
Fail fast on missing non_block metrics instead of treating them as zero.
teacher_memory drives the sweep target, so defaulting non_block.memory_mib / non_block.num_params to 0 turns malformed subblock_stats.json into a smaller baseline instead of surfacing the schema error. These keys look required here, just like the per-layer block_stats[...] accesses below.
♻️ Suggested guard

```diff
-total_memory = subblock_stats.get("non_block", {}).get("memory_mib", 0.0)
+try:
+    total_memory = subblock_stats["non_block"]["memory_mib"]
+except KeyError as e:
+    raise ValueError("Missing non_block.memory_mib in subblock_stats.json") from e
```

```diff
-total_params = subblock_stats.get("non_block", {}).get("num_params", 0)
+try:
+    total_params = subblock_stats["non_block"]["num_params"]
+except KeyError as e:
+    raise ValueError("Missing non_block.num_params in subblock_stats.json") from e
```

Also applies to: 117-118
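One way to implement the fail-fast behavior this comment asks for is a small lookup helper; this is a sketch under an assumed `subblock_stats` schema, not the actual `sweep.py` code:

```python
def require_stat(subblock_stats: dict, section: str, key: str, ident: str = "?"):
    # Return subblock_stats[section][key], raising a descriptive error instead
    # of silently defaulting to zero when either level of the schema is missing.
    if section not in subblock_stats:
        raise ValueError(f"subblock {ident}: missing '{section}' section")
    if key not in subblock_stats[section]:
        raise ValueError(
            f"subblock {ident}: missing '{section}.{key}' in {subblock_stats[section]!r}"
        )
    return subblock_stats[section][key]


# Hypothetical call sites replacing the .get(..., 0.0) defaults:
# total_memory = require_stat(subblock_stats, "non_block", "memory_mib", ident=name)
# total_params = require_stat(subblock_stats, "non_block", "num_params", ident=name)
```

Including the subblock identifier and the offending dict in the error message makes a malformed `subblock_stats.json` surface immediately at the point of use.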
Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files:

```
@@           Coverage Diff            @@
##  feature/puzzletron   #1085  +/-  ##
===================================================
  Coverage    72.12%    72.12%
===================================================
  Files          209       209
  Lines        23628     23628
===================================================
  Hits         17042     17042
  Misses        6586      6586
```

☔ View full report in Codecov by Sentry.
What does this PR do?
Integration tests for subblock stats (memory + num_of_params)
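A hedged sketch of what such an integration test might assert; the aggregation helper and the stats layout below are assumptions for illustration, not the code in tests/gpu/torch/puzzletron/test_puzzletron.py:

```python
def compute_totals(subblock_stats):
    # Hypothetical aggregation: non-block totals plus per-block contributions.
    non_block = subblock_stats["non_block"]  # fail fast if the section is missing
    total_memory = non_block["memory_mib"]
    total_params = non_block["num_params"]
    for block in subblock_stats.get("block_stats", []):
        total_memory += block["memory_mib"]
        total_params += block["num_params"]
    return total_memory, total_params


def test_subblock_stats_totals():
    stats = {
        "non_block": {"memory_mib": 100.0, "num_params": 1_000},
        "block_stats": [
            {"memory_mib": 50.0, "num_params": 500},
            {"memory_mib": 50.0, "num_params": 500},
        ],
    }
    total_memory, total_params = compute_totals(stats)
    assert total_memory == 200.0
    assert total_params == 2_000
```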
Summary by CodeRabbit
- Refactor
- Tests