
add: HF PTQ support and modelopt_recipes mount in launcher #1089

Merged

ChenhanYu merged 3 commits into main from chenhany/hf-ptq-launcher on Mar 25, 2026

Conversation

@ChenhanYu
Collaborator

@ChenhanYu ChenhanYu commented Mar 20, 2026

  • Add common/hf_ptq/hf_ptq.sh script for running hf_ptq.py directly
  • Add Qwen3-8B hf_ptq_local.yaml example config
  • Mount modelopt_recipes alongside modelopt in both Slurm and Docker executors so modelopt.recipe imports work with the overlay
  • Update default container to tensorrt-llm/release:1.3.0rc2

What does this PR do?

Type of change: ?

Usage

# Add a code snippet demonstrating how to use this

Testing

Before your PR is "Ready for review"

Make sure you read and follow Contributor guidelines and your commits are signed (git commit -s -S).

Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded trust_remote_code=True, torch.load(..., weights_only=False), pickle, etc.).

  • Is this change backward compatible?: ✅ / ❌ / N/A
  • If you copied code from any other sources or added a new PIP dependency, did you follow guidance in CONTRIBUTING.md: ✅ / ❌ / N/A
  • Did you write any new necessary tests?: ✅ / ❌ / N/A
  • Did you update Changelog?: ✅ / ❌ / N/A

Additional Information

Summary by CodeRabbit

  • New Features

    • Added a Hugging Face PTQ workflow to run post-training quantization for models.
    • Added a local single-GPU pipeline for Qwen3-8B with fp8 quantization and export support.
    • Added a small command-line launcher to invoke the PTQ workflow with configurable model, quantization, calibration, and export options.
  • Chores

    • Updated the default runtime container image to a newer release.
    • Included model-optimization recipe files in container mounts.

@coderabbitai
Contributor

coderabbitai bot commented Mar 20, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a9ce2a5f-7e90-4a3e-a44f-3edbea52a85d

📥 Commits

Reviewing files that changed from the base of the PR and between f32d70d and 2d1b093.

📒 Files selected for processing (4)
  • tools/launcher/common/hf_ptq/hf_ptq.sh
  • tools/launcher/core.py
  • tools/launcher/examples/Qwen/Qwen3-8B/hf_ptq_local.yaml
  • tools/launcher/slurm_config.py
✅ Files skipped from review due to trivial changes (2)
  • tools/launcher/core.py
  • tools/launcher/examples/Qwen/Qwen3-8B/hf_ptq_local.yaml
🚧 Files skipped from review as they are similar to previous changes (2)
  • tools/launcher/slurm_config.py
  • tools/launcher/common/hf_ptq/hf_ptq.sh

📝 Walkthrough

Walkthrough

Adds a new HuggingFace PTQ Bash launcher and example job YAML, mounts model-optimizer recipes into Slurm/Docker executors, and updates the default Slurm container image tag.

Changes

Cohort / File(s) — Summary

  • HF PTQ Launcher — tools/launcher/common/hf_ptq/hf_ptq.sh, tools/launcher/examples/Qwen/Qwen3-8B/hf_ptq_local.yaml
    New executable Bash launcher that sources utilities, registers error/exit handlers, sets HF PTQ defaults (HF_MODEL, QFORMAT, CALIB_SIZE, EXPORT_PATH), and runs hf_ptq.py, forwarding arguments. Adds a local job YAML that invokes the script for Qwen3-8B (single-node, single-GPU).
  • Container Mounts (Slurm & Docker) — tools/launcher/core.py
    Executor builders now compute a modelopt_recipes_dst from slurm_config.modelopt_install_path, derive the corresponding source path in the per-job code tree, and append a src:dst bind mount/volume to the container mounts for model-optimizer recipes.
  • Default Container Image — tools/launcher/slurm_config.py
    Updated the slurm_factory default container image from nvcr.io/nvidia/tensorrt-llm/release:1.2.0 to nvcr.io/nvidia/tensorrt-llm/release:1.3.0rc8.
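The mount construction described above can be sketched as follows. This is a minimal reconstruction, not the launcher's actual code: the function name and argument names are assumptions, and only the derivation of the src:dst pair is taken from the walkthrough (including the normpath-before-dirname fix suggested later in the review).

```python
import os

def add_modelopt_recipes_mount(mounts, modelopt_install_path, modelopt_src_path):
    # Hypothetical sketch of the mount logic summarized above.
    # normpath first, so a trailing slash on the install path does not
    # make dirname() return the path itself instead of its parent.
    install_parent = os.path.dirname(os.path.normpath(modelopt_install_path))
    modelopt_recipes_dst = os.path.join(install_parent, "modelopt_recipes")
    # The source directory sits next to the per-job modelopt checkout.
    modelopt_recipes_src = os.path.join(
        os.path.dirname(os.path.normpath(modelopt_src_path)), "modelopt_recipes"
    )
    mounts.append(f"{modelopt_recipes_src}:{modelopt_recipes_dst}")
    return mounts
```

With an install path of `/opt/modelopt/` and a job checkout at `/jobs/exp1/modelopt`, this yields the bind mount `/jobs/exp1/modelopt_recipes:/opt/modelopt_recipes`, which is what makes `modelopt.recipe` imports resolve inside the container.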

Sequence Diagram(s)

sequenceDiagram
    participant Scheduler as Slurm/Docker Scheduler
    participant Job as Job Task (hf_ptq_local)
    participant Launcher as hf_ptq.sh
    participant Python as hf_ptq.py (in container)
    participant FS as Filesystem (HF model, modelopt_recipes, export)

    Scheduler->>Job: start task
    Job->>Launcher: execute `common/hf_ptq/hf_ptq.sh`
    Launcher->>Launcher: source utils, set env defaults, register traps
    Launcher->>FS: rely on modelopt_recipes bind-mounted into container
    Launcher->>Python: run `PYTHONPATH=${HF_PTQ_DIR} python hf_ptq.py` with args
    Python->>FS: read HF model, load recipes, run calibration
    Python->>FS: write exported model to EXPORT_PATH
    Python-->>Launcher: exit status
    Launcher->>Scheduler: report_result "PASS/FAIL"

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'add: HF PTQ support and modelopt_recipes mount in launcher' accurately summarizes the main changes: introducing HF PTQ support and adding modelopt_recipes mount configuration to the launcher.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Security Anti-Patterns ✅ Passed The pull request introduces no security anti-patterns in Python changes. Safe path construction and version updates are used without critical security issues.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@ChenhanYu ChenhanYu requested a review from cjluo-nv March 20, 2026 23:36
@ChenhanYu ChenhanYu self-assigned this Mar 20, 2026
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🧹 Nitpick comments (1)
tools/launcher/core.py (1)

302-311: Consider adding existence check for modelopt_recipes directory.

The Docker executor derives modelopt_recipes_src_path from the parent of modelopt_src_path, but doesn't verify the directory exists. If modelopt_recipes is missing locally, the container will fail to start with a confusing mount error.

🛡️ Proposed defensive check
     modelopt_recipes_src_path = os.path.join(os.path.dirname(modelopt_src_path), "modelopt_recipes")
+    if not os.path.isdir(modelopt_recipes_src_path):
+        raise FileNotFoundError(
+            f"modelopt_recipes directory not found at {modelopt_recipes_src_path}. "
+            "Ensure the Model-Optimizer submodule is fully initialized."
+        )
     exp_title_src = os.path.join(job_dir, experiment_title)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/launcher/core.py` around lines 302 - 311, The code builds
modelopt_recipes_src_path and unconditionally adds a mount for
modelopt_recipes_dst which will break container startup if the directory is
missing; update the logic around modelopt_recipes_src_path/modelopt_recipes_dst
to check os.path.exists(modelopt_recipes_src_path) (or create it with
os.makedirs(..., exist_ok=True) if desired) and only append the
f"{modelopt_recipes_src_path}:{modelopt_recipes_dst}" entry to container_mounts
when the directory exists (or after creating it), leaving other mounts
unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tools/launcher/common/hf_ptq/hf_ptq.sh`:
- Line 19: The source command uses an unquoted variable expansion (SCRIPT_DIR)
which can break when the path contains spaces; update the invocation in
hf_ptq.sh to quote the expansion (use " ${SCRIPT_DIR}/../service_utils.sh " with
proper quoting around the whole path) so word splitting is prevented and the
script reliably sources service_utils.sh even if SCRIPT_DIR contains spaces.
- Around line 32-37: The command invoking hf_ptq.py in hf_ptq.sh uses unquoted
variable expansions (${HF_MODEL}, ${QFORMAT}, ${CALIB_SIZE}, ${EXPORT_PATH})
which can break when paths or values contain spaces; update the python
invocation to quote each expansion (e.g., "${HF_MODEL}", "${QFORMAT}",
"${CALIB_SIZE}", "${EXPORT_PATH}") while keeping the already-quoted "$@" so all
arguments are passed intact; modify the hf_ptq.sh call that runs hf_ptq.py to
use these quoted variables.
- Line 25: The script hf_ptq.sh currently runs `cd
modules/Model-Optimizer/examples/llm_ptq` without checking failure; update the
script to detect a failed cd and abort before running hf_ptq.py by testing the
exit status and printing a clear error message (e.g., "Failed to change
directory to modules/Model-Optimizer/examples/llm_ptq") and exiting non‑zero;
ensure this check occurs immediately after the cd so subsequent commands (like
invoking hf_ptq.py) never run in the wrong directory.
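The word-splitting hazard behind these quoting comments is easy to demonstrate. This is illustrative only: `count_args` and the sample path are made up, not part of hf_ptq.sh.

```shell
#!/bin/sh
# Shows why unquoted expansions break when a value contains spaces.
count_args() { echo "$#"; }

HF_MODEL="/models/Qwen3 8B"   # hypothetical model path containing a space

count_args ${HF_MODEL}        # unquoted: word-splits into 2 arguments
count_args "${HF_MODEL}"      # quoted: passed through as 1 argument
```

Unquoted, `hf_ptq.py` would receive `/models/Qwen3` and `8B` as two separate arguments; quoting each expansion (while keeping the already-quoted `"$@"`) preserves every value intact.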

In `@tools/launcher/slurm_config.py`:
- Line 56: The default container variable container currently points to the RC
image "nvcr.io/nvidia/tensorrt-llm/release:1.3.0rc2"; either update this to the
stable image (e.g., "nvcr.io/nvidia/tensorrt-llm/release:1.0.0") or add a
concise inline comment next to the container assignment explaining exactly why
the 1.3.0rc2 release candidate is required (mention feature/bugfix and
compatibility constraints), and ensure the comment references the container
string and any related runtime dependency so reviewers can verify the rationale.

---

Nitpick comments:
In `@tools/launcher/core.py`:
- Around line 302-311: The code builds modelopt_recipes_src_path and
unconditionally adds a mount for modelopt_recipes_dst which will break container
startup if the directory is missing; update the logic around
modelopt_recipes_src_path/modelopt_recipes_dst to check
os.path.exists(modelopt_recipes_src_path) (or create it with os.makedirs(...,
exist_ok=True) if desired) and only append the
f"{modelopt_recipes_src_path}:{modelopt_recipes_dst}" entry to container_mounts
when the directory exists (or after creating it), leaving other mounts
unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d1574a92-50f8-4a7f-b078-e06f67ccf17d

📥 Commits

Reviewing files that changed from the base of the PR and between c3a3975 and 6df6d3d38425beb841406c96eea0abe1d261029d.

📒 Files selected for processing (4)
  • tools/launcher/common/hf_ptq/hf_ptq.sh
  • tools/launcher/core.py
  • tools/launcher/examples/Qwen/Qwen3-8B/hf_ptq_local.yaml
  • tools/launcher/slurm_config.py

trap 'exit_handler' EXIT
###################################################################################################

cd modules/Model-Optimizer/examples/llm_ptq

⚠️ Potential issue | 🟡 Minor

Handle cd failure to avoid running commands in the wrong directory.

If the cd fails (e.g., directory doesn't exist), the script continues execution in the current directory, which could lead to running hf_ptq.py from an unexpected location or failing with a confusing error.

🔧 Proposed fix
-cd modules/Model-Optimizer/examples/llm_ptq
+cd modules/Model-Optimizer/examples/llm_ptq || exit 1
🧰 Tools
🪛 Shellcheck (0.11.0)

[warning] 25-25: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.

(SC2164)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/launcher/common/hf_ptq/hf_ptq.sh` at line 25, The script hf_ptq.sh
currently runs `cd modules/Model-Optimizer/examples/llm_ptq` without checking
failure; update the script to detect a failed cd and abort before running
hf_ptq.py by testing the exit status and printing a clear error message (e.g.,
"Failed to change directory to modules/Model-Optimizer/examples/llm_ptq") and
exiting non‑zero; ensure this check occurs immediately after the cd so
subsequent commands (like invoking hf_ptq.py) never run in the wrong directory.

@codecov

codecov bot commented Mar 20, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 70.21%. Comparing base (d698864) to head (2d1b093).
⚠️ Report is 4 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1089      +/-   ##
==========================================
+ Coverage   70.18%   70.21%   +0.02%     
==========================================
  Files         228      228              
  Lines       25952    25952              
==========================================
+ Hits        18215    18221       +6     
+ Misses       7737     7731       -6     

☔ View full report in Codecov by Sentry.

@ChenhanYu ChenhanYu force-pushed the chenhany/hf-ptq-launcher branch from 6df6d3d to e6f066f on March 21, 2026 00:24
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tools/launcher/core.py`:
- Around line 238-240: The path for modelopt_recipes_dst is computed using
os.path.dirname(slurm_config.modelopt_install_path) which can return an
incorrect parent when modelopt_install_path ends with a trailing slash;
normalize the path first by calling an appropriate normalization function (e.g.,
os.path.normpath or similar) on slurm_config.modelopt_install_path before
passing it to os.path.dirname so modelopt_recipes_dst is always the intended
parent directory (apply the same normalization wherever modelopt_install_path is
used to construct mount destinations such as the other occurrence around
modelopt_recipes).
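The trailing-slash behavior this comment warns about can be verified with the standard library alone (no launcher code involved; the sample path is made up):

```python
import os

p = "/workspace/modelopt/"   # note the trailing slash

# Without normalization, dirname() strips only the empty basename,
# returning the path itself rather than its parent.
print(os.path.dirname(p))                     # /workspace/modelopt

# Normalizing first removes the trailing slash, so dirname() yields
# the intended parent directory.
print(os.path.dirname(os.path.normpath(p)))   # /workspace
```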

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 16976c0a-e4ed-48a9-87f8-7436754c49ac

📥 Commits

Reviewing files that changed from the base of the PR and between 6df6d3d38425beb841406c96eea0abe1d261029d and e6f066f.

📒 Files selected for processing (4)
  • tools/launcher/common/hf_ptq/hf_ptq.sh
  • tools/launcher/core.py
  • tools/launcher/examples/Qwen/Qwen3-8B/hf_ptq_local.yaml
  • tools/launcher/slurm_config.py
✅ Files skipped from review due to trivial changes (1)
  • tools/launcher/examples/Qwen/Qwen3-8B/hf_ptq_local.yaml
🚧 Files skipped from review as they are similar to previous changes (2)
  • tools/launcher/slurm_config.py
  • tools/launcher/common/hf_ptq/hf_ptq.sh

Collaborator

@kevalmorabia97 kevalmorabia97 left a comment


Left 2 minor comments. Otherwise LGTM

@ChenhanYu
Collaborator Author

/ok to test f32d70d

@ChenhanYu ChenhanYu force-pushed the chenhany/hf-ptq-launcher branch from e6f066f to f32d70d on March 24, 2026 22:47
@copy-pr-bot

copy-pr-bot bot commented Mar 24, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions
Contributor

github-actions bot commented Mar 24, 2026

PR Preview Action v1.8.1
Preview removed because the pull request was closed.
2026-03-25 03:14 UTC

ChenhanYu and others added 3 commits March 24, 2026 16:05
- Add common/hf_ptq/hf_ptq.sh script for running hf_ptq.py directly
- Add Qwen3-8B hf_ptq_local.yaml example config
- Mount modelopt_recipes alongside modelopt in both Slurm and Docker
  executors so modelopt.recipe imports work with the overlay
- Update default container to tensorrt-llm/release:1.3.0rc2

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Chenhan Yu <chenhany@nvidia.com>
Signed-off-by: Chenhan Yu <chenhany@nvidia.com>
- Quote variable expansion in hf_ptq.sh source command
- Use os.path.normpath() before os.path.dirname() to handle trailing slashes in modelopt_install_path
- Upgrade default container from 1.3.0rc2 to 1.3.0rc8

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Chenhan Yu <chenhany@nvidia.com>
@ChenhanYu ChenhanYu force-pushed the chenhany/hf-ptq-launcher branch from f32d70d to 2d1b093 on March 24, 2026 23:07
@ChenhanYu ChenhanYu merged commit ae965a9 into main Mar 25, 2026
38 checks passed
@ChenhanYu ChenhanYu deleted the chenhany/hf-ptq-launcher branch March 25, 2026 03:13
RinZ27 pushed a commit to RinZ27/Model-Optimizer that referenced this pull request Mar 27, 2026