E2E test for the experimental compress algorithm based on https://arxiv.org/abs/2411.19146 #464
Merged: kevalmorabia97 merged 19 commits into feature/compress from dkorzekwa/e2e_compression_test on Oct 28, 2025.

Commits (19):
- c758ad5 (danielkorzekwa): The main compression function for a model
- 8af9903 (danielkorzekwa): Code formatting
- 5ba6c27 (danielkorzekwa): Model search space configuration used by test_compress.py test
- 0bc5d84 (danielkorzekwa): Tokenizer used by test_compress.py test
- 87d4fa5 (danielkorzekwa): Tokenizer utility used by test_compress.py test
- ced1e99 (danielkorzekwa): e2e tests for compress.py
- 800414c (danielkorzekwa): Remove unused bypass distillation config files
- 16abcc9 (danielkorzekwa): Moving integration tests to tests/experimental to not trigger CICD
- a5ba1c7 (danielkorzekwa): update docs
- 1bda391 (danielkorzekwa): Replace mprint with print and replace osp.join with path1 / path2 not…
- bb38401 (danielkorzekwa): Refactor file checking assertions to use .is_file() and .exists()
- d4ffc91 (kevalmorabia97): Merge branch 'feature/compress' into dkorzekwa/e2e_compression_test
- 6f28e4a (kevalmorabia97): Fix: Add missing LICENSE headers
- 016fb63 (danielkorzekwa): Use spawn_multiprocess_job for test_compress test (to be able to use …
- 0ccf1c4 (danielkorzekwa): Add comments
- 58439ca (danielkorzekwa): Add _save_dummy_dataset to the test_compress.py
- 2e5f776 (danielkorzekwa): Refactoring: Move torch distributed env variables to dist_utils.py
- 6274db5 (danielkorzekwa): Refactoring: move torch distributed variables to dist_utils
- d942e0a (danielkorzekwa): Move os.environ["WANDB_DISABLED"] = "true" to dist_utils.py
New file (@@ -0,0 +1,3 @@):

```
Experimental model compression algorithm based on a Local Neural Architecture Search.
Based on the Puzzle paper: <https://arxiv.org/abs/2411.19146>
PoC for Llama 3.1 model.
```
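The `compress` function added in this PR resolves its hydra config with override strings such as `f"puzzle_dir={puzzle_dir}"` and `f"dataset_path={dataset_path}"`. As a rough illustration of how dotted `key=value` override strings update a nested config, here is a plain-Python sketch (`apply_overrides` is a hypothetical helper for illustration, not Hydra's actual implementation):

```python
def apply_overrides(cfg: dict, overrides: list[str]) -> dict:
    """Apply 'dotted.key=value' override strings to a nested config dict."""
    for item in overrides:
        key, _, value = item.partition("=")
        node = cfg
        parts = key.split(".")
        # Walk (and create) intermediate nodes, then set the leaf value.
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return cfg


cfg = {"puzzle_dir": None, "dataset_path": None}
apply_overrides(cfg, ["puzzle_dir=/tmp/puzzle", "dataset_path=/tmp/data"])
print(cfg["puzzle_dir"])  # /tmp/puzzle
```

Hydra's real override grammar is much richer (additions, deletions, sweeps), but the `puzzle_dir=...` / `dataset_path=...` overrides used here are the simple assignment case.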
New file (@@ -0,0 +1,82 @@):

```python
# SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""
This module provides the main compression function for a model
using a MIP-based NAS search algorithm.
"""

import build_library_and_stats
import mip_and_realize_models
import pruning_ckpts
import score_pruning_activations
import scoring
from omegaconf import DictConfig
from puzzle_tools.runtime import IRuntime

# TODO: Move initialize_hydra_config_for_dir from tests to main
from tests.utils.test_utils import initialize_hydra_config_for_dir


def compress(
    hydra_config_dir: str, hydra_config: str, puzzle_dir: str, dataset_path: str, runtime: IRuntime
) -> DictConfig:
    """Compress a puzzletron model using the MIP-based NAS search algorithm.

    Args:
        hydra_config_dir (str): path to the hydra config directory that defines the search space
        hydra_config (str): name of the corresponding hydra config file
        puzzle_dir (str): directory with a puzzletron model to compress
        dataset_path (str): dataset used for scoring and distillation
        runtime: distributed runtime used to run the compression steps, e.g.,
            NativeDdpRuntime(dtype=torch.bfloat16, torch_distributed_timeout=datetime.timedelta(10))

    Returns:
        Hydra config object after compressing the model.
        The same hydra configuration object is used across all compression steps.
        TODO: Investigate whether this config object is immutable across steps and clarify.
    """
    # Step 0: Load puzzletron hydra config
    hydra_cfg = initialize_hydra_config_for_dir(
        config_dir=hydra_config_dir,
        config_name=hydra_config,
        overrides=[
            f"puzzle_dir={puzzle_dir}",
            f"dataset_path={dataset_path}",
        ],
    )

    # Step 1: score_pruning_activations (distributed processing)
    score_pruning_activations.launch_score_activations(hydra_cfg, runtime)

    # Step 2: pruning_ckpts (single process)
    if runtime.global_rank == 0:
        pruning_ckpts.launch_prune_ckpt(hydra_cfg)
    runtime.wait_for_everyone()

    # Step 4: build_library_and_stats (single process)
    if runtime.global_rank == 0:
        build_library_and_stats.launch_build_library_and_stats(hydra_cfg)
    runtime.wait_for_everyone()

    # Step 5: calc_one_block_scores (distributed processing)
    scoring.launch_scoring(hydra_cfg, runtime)

    # Step 6: mip_and_realize_models (distributed processing)
    mip_and_realize_models.launch_mip_and_realize_model(hydra_cfg, runtime)

    return hydra_cfg
```
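The single-process steps gate their work behind a rank check and then synchronize: only global rank 0 runs the step, and every rank waits at a barrier afterwards so later distributed steps see the finished checkpoints. A minimal sketch of that pattern with a stand-in runtime (`DummyRuntime` and `run_on_rank0_then_sync` are illustrative names, not part of this PR):

```python
class DummyRuntime:
    """Stand-in for IRuntime: mimics a distributed runtime in one process."""

    def __init__(self, global_rank: int = 0, world_size: int = 1):
        self.global_rank = global_rank
        self.world_size = world_size

    def wait_for_everyone(self) -> None:
        # A real runtime would issue a collective barrier here (e.g.
        # torch.distributed.barrier()); with one process it is a no-op.
        pass


def run_on_rank0_then_sync(runtime, step):
    """Run `step` on global rank 0 only, then synchronize all ranks."""
    if runtime.global_rank == 0:
        step()
    runtime.wait_for_everyone()


calls = []
run_on_rank0_then_sync(DummyRuntime(global_rank=0), lambda: calls.append("prune_ckpt"))
run_on_rank0_then_sync(DummyRuntime(global_rank=1), lambda: calls.append("prune_ckpt"))
print(calls)  # ['prune_ckpt']  (only the rank-0 runtime ran the step)
```

Without the barrier, non-zero ranks could race ahead into the next distributed step before rank 0 finished writing its outputs, which is why both Step 2 and Step 4 end with `runtime.wait_for_everyone()`.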