CI: Move to self-hosted Windows GPU runners #958
Conversation
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here.
/ok to test
Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here.
All green! Merge?
Btw, we should backport this, but #955 needs to land first. After that, I hope the backport bot will just work (not sure).
Migrate the Windows testing to use the new NV GHA runners. Cherry-pick NVIDIA#958.
* bump all CI jobs to CUDA 12.9.1
* CI: Consolidate test matrix configurations into ci/test-matrix.json with hard-coded values, optimized checkout, and prepared Windows self-hosted runner migration (#889)
  * Initial plan
  * Consolidate test matrices from workflows into ci/test-matrix.json
    Co-authored-by: leofang <[email protected]>
  * Hard-code all GPU and ARCH values in test-matrix.json with 6 fields per entry
    Co-authored-by: leofang <[email protected]>
  * Update Windows test matrix with a100 GPU and latest-1 driver, configure self-hosted runners
    Co-authored-by: leofang <[email protected]>
  * fix
  * Revert eed0b71 and change Windows DRIVER from latest-1 to latest
    Co-authored-by: leofang <[email protected]>
  * Add proxy cache setup to Windows workflow for self-hosted runners
    Co-authored-by: leofang <[email protected]>
  * Remove Git for Windows and gh CLI installation steps, add T4 GPU support to Windows matrix
    Co-authored-by: leofang <[email protected]>
  * Set fetch-depth: 1 for checkout steps and favor L4/T4 over A100 GPUs for Windows testing
    Co-authored-by: leofang <[email protected]>
  * Revert Windows workflow to GitHub-hosted runners with TODO comments for future self-hosted migration
    Co-authored-by: leofang <[email protected]>
  * [pre-commit.ci] auto code formatting
  * Revert Win runner name change for now
  ---------
  Co-authored-by: copilot-swe-agent[bot] <[email protected]>
  Co-authored-by: leofang <[email protected]>
  Co-authored-by: Leo Fang <[email protected]>
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* forgot to add windows
* rerun codegen with 12.9.1 and update result/error explanations
* First stab at the filter for CUDA < 13 in CI
* Get data from the top-level array
* Use the map function on select output
* CI: Move to self-hosted Windows GPU runners
  Migrate the Windows testing to use the new NV GHA runners. Cherry-pick #958.
---------
Co-authored-by: Copilot <[email protected]>
Co-authored-by: leofang <[email protected]>
Co-authored-by: Leo Fang <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Marcus D. Hanwell <[email protected]>
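For readers following the "filter for CUDA < 13" commits in the message above, here is a minimal Python sketch of the general idea: read the entries from a top-level array and keep only those below a given CUDA major version. It is not the actual CI implementation, which is not shown in this conversation. The key names (`ARCH`, `CUDA_VER`, `GPU`, `DRIVER`), the sample entries, and the helper names `cuda_major` and `filter_below_cuda` are all hypothetical; the real schema of ci/test-matrix.json may differ.

```python
import json

# Hypothetical sample data: the real ci/test-matrix.json schema is not shown
# in this conversation, so every key and value below is an assumption.
SAMPLE_MATRIX = json.loads("""
[
  {"ARCH": "amd64", "CUDA_VER": "12.9.1", "GPU": "l4", "DRIVER": "latest"},
  {"ARCH": "amd64", "CUDA_VER": "13.0.0", "GPU": "t4", "DRIVER": "latest"},
  {"ARCH": "arm64", "CUDA_VER": "12.9.1", "GPU": "a100", "DRIVER": "latest"}
]
""")


def cuda_major(version: str) -> int:
    """Return the major component of a dotted version string such as '12.9.1'."""
    return int(version.split(".")[0])


def filter_below_cuda(entries, max_major=13):
    """Keep only entries whose (hypothetical) CUDA_VER major is below max_major."""
    return [e for e in entries if cuda_major(e["CUDA_VER"]) < max_major]


if __name__ == "__main__":
    # Print the selected entries as JSON, e.g. for use as a job matrix.
    print(json.dumps(filter_below_cuda(SAMPLE_MATRIX), indent=2))
```

Comparing the parsed major version as an integer avoids string-comparison pitfalls such as "9" sorting after "13".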
Description
Migrate the Windows testing to use the new NV GHA runners.
Checklist