
add: HF PTQ support and modelopt_recipes mount in launcher #1089

Merged

ChenhanYu merged 3 commits into main from chenhany/hf-ptq-launcher on Mar 25, 2026

Conversation

@ChenhanYu
Collaborator

@ChenhanYu ChenhanYu commented Mar 20, 2026

  • Add common/hf_ptq/hf_ptq.sh script for running hf_ptq.py directly
  • Add Qwen3-8B hf_ptq_local.yaml example config
  • Mount modelopt_recipes alongside modelopt in both Slurm and Docker executors so modelopt.recipe imports work with the overlay
  • Update default container to tensorrt-llm/release:1.3.0rc2

What does this PR do?

Type of change: ?

Usage

# Add a code snippet demonstrating how to use this

Testing

Before your PR is "Ready for review"

Make sure you read and follow Contributor guidelines and your commits are signed (git commit -s -S).

Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded trust_remote_code=True, torch.load(..., weights_only=False), pickle, etc.).

  • Is this change backward compatible?: ✅ / ❌ / N/A
  • If you copied code from any other sources or added a new PIP dependency, did you follow guidance in CONTRIBUTING.md: ✅ / ❌ / N/A
  • Did you write any new necessary tests?: ✅ / ❌ / N/A
  • Did you update Changelog?: ✅ / ❌ / N/A

Additional Information

Summary by CodeRabbit

  • New Features

    • Added a Hugging Face PTQ workflow to run post-training quantization for models.
    • Added a local single-GPU pipeline for Qwen3-8B with fp8 quantization and export support.
    • Added a small command-line launcher to invoke the PTQ workflow with configurable model, quantization, calibration, and export options.
  • Chores

    • Updated the default runtime container image to a newer release.
    • Included model-optimization recipe files in container mounts.

@coderabbitai
Contributor

coderabbitai bot commented Mar 20, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a9ce2a5f-7e90-4a3e-a44f-3edbea52a85d

📥 Commits

Reviewing files that changed from the base of the PR and between f32d70d and 2d1b093.

📒 Files selected for processing (4)
  • tools/launcher/common/hf_ptq/hf_ptq.sh
  • tools/launcher/core.py
  • tools/launcher/examples/Qwen/Qwen3-8B/hf_ptq_local.yaml
  • tools/launcher/slurm_config.py
✅ Files skipped from review due to trivial changes (2)
  • tools/launcher/core.py
  • tools/launcher/examples/Qwen/Qwen3-8B/hf_ptq_local.yaml
🚧 Files skipped from review as they are similar to previous changes (2)
  • tools/launcher/slurm_config.py
  • tools/launcher/common/hf_ptq/hf_ptq.sh

📝 Walkthrough

Walkthrough

Adds a new HuggingFace PTQ Bash launcher and example job YAML, mounts model-optimizer recipes into Slurm/Docker executors, and updates the default Slurm container image tag.

Changes

Cohort / File(s) — Summary

  • HF PTQ Launcher — tools/launcher/common/hf_ptq/hf_ptq.sh, tools/launcher/examples/Qwen/Qwen3-8B/hf_ptq_local.yaml
    New executable Bash launcher that sources utilities, registers error/exit handlers, sets HF PTQ defaults (HF_MODEL, QFORMAT, CALIB_SIZE, EXPORT_PATH), and runs hf_ptq.py, forwarding arguments. Adds a local job YAML that invokes the script for Qwen3-8B (single-node, single-GPU).
  • Container Mounts (Slurm & Docker) — tools/launcher/core.py
    Executor builders now compute a modelopt_recipes_dst from slurm_config.modelopt_install_path, derive the corresponding source path in the per-job code tree, and append a src:dst bind mount/volume to the container mounts for model-optimizer recipes.
  • Default Container Image — tools/launcher/slurm_config.py
    Updated the slurm_factory default container image from nvcr.io/nvidia/tensorrt-llm/release:1.2.0 to nvcr.io/nvidia/tensorrt-llm/release:1.3.0rc8.
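The mount construction described above can be sketched as follows. This is a minimal reconstruction, not the launcher's actual code: the function name and argument names are assumptions, and only the derivation of the src:dst pair is taken from the walkthrough (including the normpath-before-dirname fix suggested later in the review).

```python
import os

def add_modelopt_recipes_mount(mounts, modelopt_install_path, modelopt_src_path):
    # Hypothetical sketch of the mount logic summarized above.
    # normpath first, so a trailing slash on the install path does not
    # make dirname() return the path itself instead of its parent.
    install_parent = os.path.dirname(os.path.normpath(modelopt_install_path))
    modelopt_recipes_dst = os.path.join(install_parent, "modelopt_recipes")
    # The source directory sits next to the per-job modelopt checkout.
    modelopt_recipes_src = os.path.join(
        os.path.dirname(os.path.normpath(modelopt_src_path)), "modelopt_recipes"
    )
    mounts.append(f"{modelopt_recipes_src}:{modelopt_recipes_dst}")
    return mounts
```

With an install path of `/opt/modelopt/` and a job checkout at `/jobs/exp1/modelopt`, this yields the bind mount `/jobs/exp1/modelopt_recipes:/opt/modelopt_recipes`, which is what makes `modelopt.recipe` imports resolve inside the container.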

Sequence Diagram(s)

sequenceDiagram
    participant Scheduler as Slurm/Docker Scheduler
    participant Job as Job Task (hf_ptq_local)
    participant Launcher as hf_ptq.sh
    participant Python as hf_ptq.py (in container)
    participant FS as Filesystem (HF model, modelopt_recipes, export)

    Scheduler->>Job: start task
    Job->>Launcher: execute `common/hf_ptq/hf_ptq.sh`
    Launcher->>Launcher: source utils, set env defaults, register traps
    Launcher->>FS: rely on modelopt_recipes bind-mounted into container
    Launcher->>Python: run `PYTHONPATH=${HF_PTQ_DIR} python hf_ptq.py` with args
    Python->>FS: read HF model, load recipes, run calibration
    Python->>FS: write exported model to EXPORT_PATH
    Python-->>Launcher: exit status
    Launcher->>Scheduler: report_result "PASS/FAIL"

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'add: HF PTQ support and modelopt_recipes mount in launcher' accurately summarizes the main changes: introducing HF PTQ support and adding modelopt_recipes mount configuration to the launcher.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Security Anti-Patterns ✅ Passed The pull request introduces no security anti-patterns in Python changes. Safe path construction and version updates are used without critical security issues.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@ChenhanYu ChenhanYu requested a review from cjluo-nv March 20, 2026 23:36
@ChenhanYu ChenhanYu self-assigned this Mar 20, 2026
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🧹 Nitpick comments (1)
tools/launcher/core.py (1)

302-311: Consider adding existence check for modelopt_recipes directory.

The Docker executor derives modelopt_recipes_src_path from the parent of modelopt_src_path, but doesn't verify the directory exists. If modelopt_recipes is missing locally, the container will fail to start with a confusing mount error.

🛡️ Proposed defensive check
     modelopt_recipes_src_path = os.path.join(os.path.dirname(modelopt_src_path), "modelopt_recipes")
+    if not os.path.isdir(modelopt_recipes_src_path):
+        raise FileNotFoundError(
+            f"modelopt_recipes directory not found at {modelopt_recipes_src_path}. "
+            "Ensure the Model-Optimizer submodule is fully initialized."
+        )
     exp_title_src = os.path.join(job_dir, experiment_title)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/launcher/core.py` around lines 302 - 311, The code builds
modelopt_recipes_src_path and unconditionally adds a mount for
modelopt_recipes_dst which will break container startup if the directory is
missing; update the logic around modelopt_recipes_src_path/modelopt_recipes_dst
to check os.path.exists(modelopt_recipes_src_path) (or create it with
os.makedirs(..., exist_ok=True) if desired) and only append the
f"{modelopt_recipes_src_path}:{modelopt_recipes_dst}" entry to container_mounts
when the directory exists (or after creating it), leaving other mounts
unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tools/launcher/common/hf_ptq/hf_ptq.sh`:
- Line 19: The source command uses an unquoted variable expansion (SCRIPT_DIR)
which can break when the path contains spaces; update the invocation in
hf_ptq.sh to quote the expansion (use " ${SCRIPT_DIR}/../service_utils.sh " with
proper quoting around the whole path) so word splitting is prevented and the
script reliably sources service_utils.sh even if SCRIPT_DIR contains spaces.
- Around line 32-37: The command invoking hf_ptq.py in hf_ptq.sh uses unquoted
variable expansions (${HF_MODEL}, ${QFORMAT}, ${CALIB_SIZE}, ${EXPORT_PATH})
which can break when paths or values contain spaces; update the python
invocation to quote each expansion (e.g., "${HF_MODEL}", "${QFORMAT}",
"${CALIB_SIZE}", "${EXPORT_PATH}") while keeping the already-quoted "$@" so all
arguments are passed intact; modify the hf_ptq.sh call that runs hf_ptq.py to
use these quoted variables.
- Line 25: The script hf_ptq.sh currently runs `cd
modules/Model-Optimizer/examples/llm_ptq` without checking failure; update the
script to detect a failed cd and abort before running hf_ptq.py by testing the
exit status and printing a clear error message (e.g., "Failed to change
directory to modules/Model-Optimizer/examples/llm_ptq") and exiting non‑zero;
ensure this check occurs immediately after the cd so subsequent commands (like
invoking hf_ptq.py) never run in the wrong directory.
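The word-splitting hazard behind these quoting comments is easy to demonstrate. This is illustrative only: `count_args` and the sample path are made up, not part of hf_ptq.sh.

```shell
#!/bin/sh
# Shows why unquoted expansions break when a value contains spaces.
count_args() { echo "$#"; }

HF_MODEL="/models/Qwen3 8B"   # hypothetical model path containing a space

count_args ${HF_MODEL}        # unquoted: word-splits into 2 arguments
count_args "${HF_MODEL}"      # quoted: passed through as 1 argument
```

Unquoted, `hf_ptq.py` would receive `/models/Qwen3` and `8B` as two separate arguments; quoting each expansion (while keeping the already-quoted `"$@"`) preserves every value intact.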

In `@tools/launcher/slurm_config.py`:
- Line 56: The default container variable container currently points to the RC
image "nvcr.io/nvidia/tensorrt-llm/release:1.3.0rc2"; either update this to the
stable image (e.g., "nvcr.io/nvidia/tensorrt-llm/release:1.0.0") or add a
concise inline comment next to the container assignment explaining exactly why
the 1.3.0rc2 release candidate is required (mention feature/bugfix and
compatibility constraints), and ensure the comment references the container
string and any related runtime dependency so reviewers can verify the rationale.

---

Nitpick comments:
In `@tools/launcher/core.py`:
- Around line 302-311: The code builds modelopt_recipes_src_path and
unconditionally adds a mount for modelopt_recipes_dst which will break container
startup if the directory is missing; update the logic around
modelopt_recipes_src_path/modelopt_recipes_dst to check
os.path.exists(modelopt_recipes_src_path) (or create it with os.makedirs(...,
exist_ok=True) if desired) and only append the
f"{modelopt_recipes_src_path}:{modelopt_recipes_dst}" entry to container_mounts
when the directory exists (or after creating it), leaving other mounts
unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d1574a92-50f8-4a7f-b078-e06f67ccf17d

📥 Commits

Reviewing files that changed from the base of the PR and between c3a3975 and 6df6d3d38425beb841406c96eea0abe1d261029d.

📒 Files selected for processing (4)
  • tools/launcher/common/hf_ptq/hf_ptq.sh
  • tools/launcher/core.py
  • tools/launcher/examples/Qwen/Qwen3-8B/hf_ptq_local.yaml
  • tools/launcher/slurm_config.py

trap 'exit_handler' EXIT
###################################################################################################

cd modules/Model-Optimizer/examples/llm_ptq

⚠️ Potential issue | 🟡 Minor

Handle cd failure to avoid running commands in the wrong directory.

If the cd fails (e.g., directory doesn't exist), the script continues execution in the current directory, which could lead to running hf_ptq.py from an unexpected location or failing with a confusing error.

🔧 Proposed fix
-cd modules/Model-Optimizer/examples/llm_ptq
+cd modules/Model-Optimizer/examples/llm_ptq || exit 1
🧰 Tools
🪛 Shellcheck (0.11.0)

[warning] 25-25: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.

(SC2164)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/launcher/common/hf_ptq/hf_ptq.sh` at line 25, The script hf_ptq.sh
currently runs `cd modules/Model-Optimizer/examples/llm_ptq` without checking
failure; update the script to detect a failed cd and abort before running
hf_ptq.py by testing the exit status and printing a clear error message (e.g.,
"Failed to change directory to modules/Model-Optimizer/examples/llm_ptq") and
exiting non‑zero; ensure this check occurs immediately after the cd so
subsequent commands (like invoking hf_ptq.py) never run in the wrong directory.

@codecov

codecov bot commented Mar 20, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 70.21%. Comparing base (d698864) to head (2d1b093).
⚠️ Report is 4 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1089      +/-   ##
==========================================
+ Coverage   70.18%   70.21%   +0.02%     
==========================================
  Files         228      228              
  Lines       25952    25952              
==========================================
+ Hits        18215    18221       +6     
+ Misses       7737     7731       -6     

☔ View full report in Codecov by Sentry.

@ChenhanYu ChenhanYu force-pushed the chenhany/hf-ptq-launcher branch from 6df6d3d to e6f066f on March 21, 2026 00:24
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tools/launcher/core.py`:
- Around line 238-240: The path for modelopt_recipes_dst is computed using
os.path.dirname(slurm_config.modelopt_install_path) which can return an
incorrect parent when modelopt_install_path ends with a trailing slash;
normalize the path first by calling an appropriate normalization function (e.g.,
os.path.normpath or similar) on slurm_config.modelopt_install_path before
passing it to os.path.dirname so modelopt_recipes_dst is always the intended
parent directory (apply the same normalization wherever modelopt_install_path is
used to construct mount destinations such as the other occurrence around
modelopt_recipes).
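The trailing-slash behavior this comment warns about can be verified with the standard library alone (no launcher code involved; the sample path is made up):

```python
import os

p = "/workspace/modelopt/"   # note the trailing slash

# Without normalization, dirname() strips only the empty basename,
# returning the path itself rather than its parent.
print(os.path.dirname(p))                     # /workspace/modelopt

# Normalizing first removes the trailing slash, so dirname() yields
# the intended parent directory.
print(os.path.dirname(os.path.normpath(p)))   # /workspace
```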

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 16976c0a-e4ed-48a9-87f8-7436754c49ac

📥 Commits

Reviewing files that changed from the base of the PR and between 6df6d3d38425beb841406c96eea0abe1d261029d and e6f066f.

📒 Files selected for processing (4)
  • tools/launcher/common/hf_ptq/hf_ptq.sh
  • tools/launcher/core.py
  • tools/launcher/examples/Qwen/Qwen3-8B/hf_ptq_local.yaml
  • tools/launcher/slurm_config.py
✅ Files skipped from review due to trivial changes (1)
  • tools/launcher/examples/Qwen/Qwen3-8B/hf_ptq_local.yaml
🚧 Files skipped from review as they are similar to previous changes (2)
  • tools/launcher/slurm_config.py
  • tools/launcher/common/hf_ptq/hf_ptq.sh

Collaborator

@kevalmorabia97 kevalmorabia97 left a comment


Left 2 minor comments. Otherwise LGTM

@ChenhanYu
Collaborator Author

/ok to test f32d70d

@ChenhanYu ChenhanYu force-pushed the chenhany/hf-ptq-launcher branch from e6f066f to f32d70d on March 24, 2026 22:47
@copy-pr-bot

copy-pr-bot bot commented Mar 24, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions
Contributor

github-actions bot commented Mar 24, 2026

PR Preview Action v1.8.1
Preview removed because the pull request was closed.
2026-03-25 03:14 UTC

ChenhanYu and others added 3 commits March 24, 2026 16:05
- Add common/hf_ptq/hf_ptq.sh script for running hf_ptq.py directly
- Add Qwen3-8B hf_ptq_local.yaml example config
- Mount modelopt_recipes alongside modelopt in both Slurm and Docker
  executors so modelopt.recipe imports work with the overlay
- Update default container to tensorrt-llm/release:1.3.0rc2

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Chenhan Yu <chenhany@nvidia.com>
Signed-off-by: Chenhan Yu <chenhany@nvidia.com>
- Quote variable expansion in hf_ptq.sh source command
- Use os.path.normpath() before os.path.dirname() to handle trailing slashes in modelopt_install_path
- Upgrade default container from 1.3.0rc2 to 1.3.0rc8

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Chenhan Yu <chenhany@nvidia.com>
@ChenhanYu ChenhanYu force-pushed the chenhany/hf-ptq-launcher branch from f32d70d to 2d1b093 on March 24, 2026 23:07
@ChenhanYu ChenhanYu merged commit ae965a9 into main Mar 25, 2026
38 checks passed
@ChenhanYu ChenhanYu deleted the chenhany/hf-ptq-launcher branch March 25, 2026 03:13
RinZ27 pushed a commit to RinZ27/Model-Optimizer that referenced this pull request Mar 27, 2026