E2E test for the experimental compress algorithm based on https://arxiv.org/abs/2411.19146 (#464)
using MIP-based NAS search algorithm. Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Codecov Report ✅ All modified and coverable lines are covered by tests.

@@          Coverage Diff           @@
##    feature/compress    #464   +/- ##
=======================================
  Coverage    73.40%   73.40%
  Files          180      180
  Lines        18077    18077
=======================================
  Hits         13270    13270
  Misses        4807     4807
tests/gpu/torch/_compress/resources/configs/bypass/bypass_distillation_defaults.yaml
tests/gpu/torch/_compress/resources/configs/bypass/llama-3_1-8b_bypass.yaml
Is resources/tokenizer used as a toy tokenizer for testing instead of the original Llama tokenizer?
We could instead reuse the toy models and tokenizers used in other tests. See the comment below in the GPU test file.
Created an internal issue to address this in the next MR: issues/12
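For illustration only, here is a minimal sketch of what truncating a tokenizer vocabulary for a toy test fixture could look like. This is hypothetical and not the repo's actual truncate_tokenizer.py, which may instead operate on a full Hugging Face tokenizer.json; the vocabulary and the `truncate_vocab` helper below are invented for the example.

```python
import json

def truncate_vocab(vocab: dict, keep: int) -> dict:
    """Keep only the first `keep` tokens (by id) of a vocabulary.

    Hypothetical helper: the real script may also rewrite merges,
    special tokens, and the model config to match the smaller vocab.
    """
    return {tok: idx for tok, idx in vocab.items() if idx < keep}

# Toy example: a 6-token vocab truncated to 4 tokens.
vocab = {"<unk>": 0, "<s>": 1, "</s>": 2, "the": 3, "quick": 4, "fox": 5}
small = truncate_vocab(vocab, keep=4)
print(json.dumps(small))
```

A fixture built this way stays a few kilobytes instead of the multi-megabyte original tokenizer, which keeps GPU test setup fast.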
tests/experimental/torch/_compress/resources/tokenizer/truncate_tokenizer.py
Unrelated to this PR, but do we also plan to simplify the YAML files as part of the roadmap? Currently too many settings are spread across too many YAML files. We could move most of them into one common base YAML hidden from users, and only require the user to provide the 4-5 most important inputs to keep things simpler.
This is captured in the NVIDIA internal roadmap.
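The consolidation suggested above could be sketched as a simple overlay of user inputs on a hidden base config. This is only an illustration of the idea; the config keys (`distillation`, `search`, `model_path`, etc.) and the `merge_configs` helper are hypothetical, not the project's actual schema.

```python
def merge_configs(base: dict, overrides: dict) -> dict:
    """Recursively overlay user-provided values on a common base config."""
    merged = dict(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_configs(merged[key], value)
        else:
            merged[key] = value
    return merged

# Hypothetical keys: a hidden base config plus the handful of user inputs.
base = {
    "distillation": {"epochs": 2, "temperature": 1.0},
    "search": {"algorithm": "mip", "budget": 100},
}
user = {"model_path": "llama-3_1-8b", "search": {"budget": 10}}
config = merge_configs(base, user)
print(config)
```

With this shape, the many per-stage YAML files would collapse into one base document that users never edit, and the user-facing YAML would hold only the few overrides they care about.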
…ation. Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
…tmp_path. Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
kevalmorabia97 left a comment: Looks good to merge. Thanks for addressing my comments.
What does this PR do?
Type of change: new feature
Overview: E2E test for the experimental compress algorithm based on https://arxiv.org/abs/2411.19146
Usage
See tests/gpu/torch/_compress/test_compress.py
Testing
See tests/gpu/torch/_compress/test_compress.py
Before your PR is "Ready for review"
Additional Information