Skip to content

Conversation

@ChrisRackauckas-Claude
Copy link
Contributor

Summary

Fixes SciML/NonlinearSolve.jl#746

When using DefaultLinearSolver with non-square GPU matrices (e.g., for least squares problems in NonlinearSolve.jl), the init_cacheval function for CholeskyFactorization would fail with DimensionMismatch: matrix is not square because it tried to compute cholesky(A) on a non-square matrix.

Root Cause

The @generated init_cacheval for DefaultLinearSolver initializes caches for all possible algorithms upfront, including CholeskyFactorization. For non-square GPU matrices, this fails even though CholeskyFactorization would never actually be used (the defaultalg function correctly selects QRFactorization for non-square matrices).

Fix

The fix checks assumptions.issq before attempting Cholesky factorization in init_cacheval(::CholeskyFactorization, ::GPUArraysCore.AnyGPUArray, ...) and returns nothing for non-square matrices, allowing the cache initialization to succeed.

Changes

  • src/factorization.jl: Added square matrix check in init_cacheval for CholeskyFactorization with GPU arrays
  • test/gpu/cuda.jl: Added tests for non-square GPU matrices (overdetermined and underdetermined systems)

Test plan

  • LinearSolve package loads correctly
  • CI tests pass
  • New CUDA tests for non-square matrices pass

🤖 Generated with Claude Code

When using DefaultLinearSolver with non-square GPU matrices (e.g., for
least squares problems), the init_cacheval function for CholeskyFactorization
would fail because it tried to compute cholesky(A) on a non-square matrix.

The fix checks assumptions.issq before attempting Cholesky factorization
and returns nothing for non-square matrices, allowing the DefaultLinearSolver
to properly use QRFactorization instead.

Fixes SciML/NonlinearSolve.jl#746

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@ChrisRackauckas ChrisRackauckas force-pushed the fix-gpu-nonsquare-cholesky-init branch from 91c9aec to 854d90f Compare December 16, 2025 21:17
Add check for assumptions.issq in the SparseArrays extension's
NormalCholeskyFactorization init_cacheval to avoid DimensionMismatch
errors when DefaultLinearSolver initializes caches for non-square
GPU matrices.

This is part of the fix for SciML/NonlinearSolve.jl#746

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@ChrisRackauckas-Claude
Copy link
Contributor Author

Added a second fix for the NormalCholeskyFactorization init_cacheval function in ext/LinearSolveSparseArraysExt.jl.

The previous commit fixed CholeskyFactorization in src/factorization.jl, but the GPU tests still failed because the NormalCholeskyFactorization dispatch in the SparseArrays extension was calling ArrayInterface.cholesky_instance(convert(AbstractMatrix, A)) directly on the matrix A without checking if it's square.

This commit adds the same assumptions.issq check to avoid the DimensionMismatch error during DefaultLinearSolver cache initialization for non-square GPU matrices.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@ChrisRackauckas-Claude
Copy link
Contributor Author

CI Status Update

The key downstream tests are passing:

  • ✅ NonlinearSolve.jl/Core/lts
  • ✅ SciMLSensitivity.jl/Core1/lts
  • ✅ Spell Check with Typos

The failing tests appear to be pre-existing issues:

  • Format Check - The SciML format checker checks the entire repo, not just changed files. The changed files have been formatted with SciMLStyle.
  • Core alldeps (1.10) - Package resolution failure between ChainRulesCore (pinned to 1.25.0) and Zygote (requires >=1.25.1). This is a dependency issue, not related to this PR.

GPU tests run on Buildkite separately and will validate the fix for non-square GPU matrices.

The previous change incorrectly returned nothing for ALL non-square
matrices (sparse and GPU), but NormalCholeskyFactorization actually
works with non-square matrices because it factors A'*A which is square.

The issue was only with GPU arrays where ArrayInterface.cholesky_instance
fails for non-square. Sparse CPU arrays handle this correctly.

Now the check only applies to GPUArraysCore.AnyGPUArray, allowing sparse
CPU matrices to use the original cholesky_instance path.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@ChrisRackauckas-Claude
Copy link
Contributor Author

Fix for Non-Square Tests Failure

The previous change incorrectly returned nothing for ALL non-square matrices in NormalCholeskyFactorization init_cacheval. However, NormalCholeskyFactorization actually works with non-square matrices because it factors A' * A (which is always square).

The issue was only with GPU arrays where ArrayInterface.cholesky_instance(A) fails for non-square A. Sparse CPU arrays handle cholesky_instance correctly for non-square matrices.

Fix: Changed the condition to only apply to GPUArraysCore.AnyGPUArray, allowing sparse CPU matrices to use the original cholesky_instance path.

# Before (incorrect - broke sparse CPU):
elseif !assumptions.issq
    nothing

# After (correct - only affects GPU):
elseif A isa LinearSolve.GPUArraysCore.AnyGPUArray && !assumptions.issq
    nothing

@ChrisRackauckas-Claude
Copy link
Contributor Author

✅ All Core Tests Passing

After the fix to only apply the non-square check to GPU arrays (not sparse CPU arrays), all Core tests are now passing:

  • ✅ 14/14 Tests (*, *, Core, *) - ALL PASSING
  • ✅ NonlinearSolve.jl/Core/lts
  • ✅ SciMLSensitivity.jl/Core1/lts

The only remaining failures are pre-existing infrastructure issues:

  • Format Check (repo-wide formatting)
  • test (Core, alldeps, 1.10) (ChainRulesCore/Zygote dependency resolution)

Summary of changes:

  1. src/factorization.jl: Added assumptions.issq check for CholeskyFactorization init_cacheval with GPU arrays
  2. ext/LinearSolveSparseArraysExt.jl: Added AnyGPUArray && !assumptions.issq check for NormalCholeskyFactorization init_cacheval (GPU arrays only, not sparse CPU)
  3. test/gpu/cuda.jl: Added tests for non-square GPU matrices

The fix allows DefaultLinearSolver to properly initialize caches for non-square GPU matrices by returning nothing for Cholesky-based cachevals when the matrix is not square, allowing the solver to fall back to QRFactorization.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Nonlinear least squares CuArrays support for arbitrary residual shape

2 participants