Fix `(l/r)mul!` with `Diagonal`/`Bidiagonal` #55052
This is going back to where we came from, so it would be good to double check what the original concern was to make it call
The problem is that One option might be to write out the loop here instead of broadcasting. My idea was that array types that specialise broadcasting would be able to access optimised methods.
I found it: #42321.
This goes over the entire matrix and misses all the optimizations that we have in
Yes, we would need to specialize
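For reference, the "write out the loop" option raised in the discussion could be sketched roughly as below. This is only an illustration under stated assumptions: `rmul_diag_loop!` is a hypothetical name, not a function in `LinearAlgebra`, and the sketch assumes 1-based indexing. It also visits every entry of `A`, which is the drawback noted above for structured matrix types.

```julia
using LinearAlgebra

# Hypothetical helper (not part of LinearAlgebra): scale the columns of A
# in place by the diagonal entries of D using an explicit loop.
# Assumes A uses 1-based indexing, like D.diag.
function rmul_diag_loop!(A::AbstractMatrix, D::Diagonal)
    size(A, 2) == length(D.diag) ||
        throw(DimensionMismatch("A has $(size(A, 2)) columns, but D has size $(length(D.diag))"))
    for j in axes(A, 2)
        dj = D.diag[j]
        for i in axes(A, 1)
            # Each entry is read once and written once, so there is no
            # separate destination that could be zeroed before it is read.
            A[i, j] *= dj
        end
    end
    return A
end

A_loop = [1.0 2.0; 3.0 4.0]
rmul_diag_loop!(A_loop, Diagonal([10.0, 100.0]))
```

A loop like this does not allocate, but it cannot exploit the band structure of types such as `Bidiagonal` unless it is specialized for them.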
Currently, `rmul!(A::AbstractMatrix, D::Diagonal)` calls `mul!(A, A, D)`, but this isn't a valid call, as `mul!` assumes no aliasing between the destination and the matrices to be multiplied. As a consequence,

```julia
julia> B = Bidiagonal(rand(4), rand(3), :L)
4×4 Bidiagonal{Float64, Vector{Float64}}:
 0.476892   ⋅         ⋅         ⋅
 0.353756  0.139188   ⋅         ⋅
  ⋅        0.685839  0.309336   ⋅
  ⋅         ⋅        0.369038  0.304273

julia> D = Diagonal(rand(size(B,2)));

julia> rmul!(B, D)
4×4 Bidiagonal{Float64, Vector{Float64}}:
 0.0   ⋅    ⋅    ⋅
 0.0  0.0   ⋅    ⋅
  ⋅   0.0  0.0   ⋅
  ⋅    ⋅   0.0  0.0

julia> B
4×4 Bidiagonal{Float64, Vector{Float64}}:
 0.0   ⋅    ⋅    ⋅
 0.0  0.0   ⋅    ⋅
  ⋅   0.0  0.0   ⋅
  ⋅    ⋅   0.0  0.0
```

This is clearly nonsense, and happens because the internal `_mul!` function assumes that it can safely overwrite the destination with zeros before carrying out the multiplication. This is fixed in this PR by using broadcasting instead. The current implementation is generally equally performant, albeit occasionally with a minor allocation arising from `reshape`ing an `Array`.

A similar problem also exists in `l/rmul!` with `Bidiagonal`, but that's a little harder to fix while remaining equally performant.

(cherry picked from commit 262b40a)
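To illustrate why broadcasting avoids the aliasing problem, here is a minimal sketch for a dense matrix. It is not the code from this PR: `rmul_diag_broadcast!` is a hypothetical name, and using `permutedims` to turn the diagonal into a row is an assumption made for the example.

```julia
using LinearAlgebra

# Hypothetical helper (not the PR's implementation): right-multiply A by D
# in place via broadcasting.
function rmul_diag_broadcast!(A::AbstractMatrix, D::Diagonal)
    size(A, 2) == length(D.diag) ||
        throw(DimensionMismatch("A has $(size(A, 2)) columns, but D has size $(length(D.diag))"))
    # permutedims turns the diagonal into a 1×n row, so column j of A is
    # scaled by D.diag[j]. The update is elementwise: each entry is read
    # and then overwritten at the same index, so A may safely appear on
    # both sides of the assignment.
    A .= A .* permutedims(D.diag)
    return A
end

A_bc = [1.0 2.0; 3.0 4.0]
D_bc = Diagonal([10.0, 100.0])
expected_bc = A_bc * D_bc          # out-of-place reference result
rmul_diag_broadcast!(A_bc, D_bc)   # A_bc now holds the same values
```

The `permutedims` call allocates a small temporary row, which is in the same spirit as the minor allocation mentioned above. The sketch targets dense matrices only; structured types such as `Bidiagonal` have their own broadcasting rules.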
This seems to fail on CI on 1.11:
Backported PRs: - [x] #54201 <!-- Fix generic triangular solves with empty matrices --> - [x] #54358 <!-- Create `jl_clear_coverage_data` to dynamically reset coverage --> - [x] #54908 <!-- LazyString in interpolated error messages in threadingconstructs --> - [x] #54952 <!-- LAPACK: Avoid repr call in `chkvalidparam` --> - [x] #54898 <!-- fix concurrent module loading return value --> - [x] #55082 <!-- Add fast method for copyto!(::Memory, ::Memory) --> - [x] #55084 <!-- Use triple quotes in TOML.print when string contains newline --> - [x] #55141 <!-- Update the aarch64 devdocs to reflect the current state of its support --> - [x] #54955 <!-- don't throw EOFError from sleep --> - [x] #54871 <!-- Make warn missed transformations pass optional --> - [x] #55178 <!-- Compat for `Base.@nospecializeinfer` --> - [x] #55197 <!-- compat notice for a[begin] indexing --> - [x] #54917 <!-- Fix potential underrun with annotation merging --> - [x] #55209 <!-- correction to compat notice for a[begin] --> - [x] #55203 <!-- document mutable struct const fields --> - [x] #54791 <!-- Bump libblastrampoline to v5.10.1 --> - [x] #54950 <!-- SuiteSparse: Bump version --> - [x] #54956 <!-- Fix accidental early evaluation of imported `using` binding --> - [x] #54996 <!-- inference: add missing `MustAlias` widening in `_getfield_tfunc` --> - [x] #55070 <!-- LinearAlgebra: LazyString in error messages for Diagonal/Bidiagonal --> - [x] #54574 <!-- Make ScopedValues public --> - [x] #54739 <!-- finish implementation of upgradable stdlibs --> - [x] #54965 <!-- RFC: Make `include_dependency(path; track_content=true)` the default --> - [x] #53286 <!-- Raise an error when using `include_dependency` with non-existent file or directory --> - [x] #55066 <!-- fix loading of repeated/concurrent modules --> - [x] #52694 <!-- Reinstate similar for AbstractQ for backward compatibility --> - [x] #55218 <!-- Artifacts: use a different way of getting the UUID of a module --> - [x] #54891 <!-- 
#54739-related fixes for loading stdlibs --> - [x] #55072 <!-- trace-compile: don't generate `precompile` statements for OpaqueClosure methods --> - [x] #55188 <!-- Make Core.TypeofUnion use the type method table --> Need manual backport: - [ ] #55052 <!-- Fix `(l/r)mul!` with `Diagonal`/`Bidiagonal` --> Contains multiple commits, manual intervention needed: Non-merged PRs with backport label: - [ ] #55169 <!-- `propertynames` for SVD respects private argument --> - [ ] #55148 <!-- Random: Mark unexported public symbols as public --> - [ ] #55017 <!-- TOML: Make `Dates` a type parameter --> - [ ] #55013 <!-- [docs] change docstring to match code --> - [ ] #54919 <!-- Fix annotated join with non-concrete eltype iters --> - [ ] #54457 <!-- Make `String(::Memory)` copy --> - [ ] #53957 <!-- tweak how filtering is done for what packages should be precompiled --> - [ ] #51479 <!-- prevent code loading from lookin in the versioned environment when building Julia --> - [ ] #50813 <!-- More doctests for Sockets and capitalization fix --> - [ ] #50157 <!-- improve docs for `@inbounds` and `Base.@propagate_inbounds` --> - [ ] #41244 <!-- Fix shell `cd` error when working dir has been deleted -->
Backported PRs: - [x] #51351 <!-- Remove boxing in pinv --> - [x] #52678 <!-- Profile: Improve module docstring --> - [x] #54201 <!-- Fix generic triangular solves with empty matrices --> - [x] #54605 <!-- Allow libquadmath to also fail as it is not available on all systems --> - [x] #54634 <!-- Fix trampoline assembly for build on clang 18 on apple silicon --> - [x] #54635 <!-- Aggressive constprop in trevc! to stabilize triangular eigvec --> - [x] #54645 <!-- ensure we set the right value to gc_first_tid --> - [x] #54671 <!-- Add boundscheck in bindingkey_eq to avoid OOB access due to data race --> - [x] #54672 <!-- make: Fix `sed` command for LLVM libraries with no symbol versioning --> - [x] #54704 <!-- LazyString in reinterpretarray error messages --> - [x] #54713 <!-- make: use `readelf` for LLVM symbol version detection --> - [x] #54781 <!-- [LinearAlgebra] Improve resilience to unknown libblastrampoline flags --> - [x] #54837 <!-- Do not add type tag size to the `alloc_typed` lowering for GC allocations --> - [x] #54815 <!-- add sticky task warning to `@task` and `schedule` --> - [x] #55141 <!-- Update the aarch64 devdocs to reflect the current state of its support --> - [x] #55178 <!-- Compat for `Base.@nospecializeinfer` --> - [x] #55197 <!-- compat notice for a[begin] indexing --> - [x] #55209 <!-- correction to compat notice for a[begin] --> - [x] #55203 <!-- document mutable struct const fields --> - [x] #54769 <!-- add missing compat entry to edit --> - [x] #54791 <!-- Bump libblastrampoline to v5.10.1 --> - [x] #55070 <!-- LinearAlgebra: LazyString in error messages for Diagonal/Bidiagonal --> - [x] #54624 <!-- more precise aliasing checks for SubArray --> - [x] #54690 <!-- Fix assertion/crash when optimizing function with dead basic block --> - [x] #55084 <!-- Use triple quotes in TOML.print when string contains newline --> Need manual backport: - [ ] #52505 <!-- fix alignment of emit_unbox_store copy --> - [ ] #53373 <!-- fix 
sysimage-native-code=no option with pkgimages --> - [ ] #53984 <!-- Profile: fix heap snapshot is valid char check --> - [ ] #54276 <!-- Fix solve for complex `Hermitian` with non-vanishing imaginary part on diagonal --> - [ ] #54669 <!-- Improve error message in inplace transpose --> - [ ] #54871 <!-- Make warn missed transformations pass optional --> Contains multiple commits, manual intervention needed: - [ ] #52854 <!-- Change to streaming out the heap snapshot data --> - [ ] #53218 <!-- Fix interpreter_exec.jl test --> - [ ] #53833 <!-- Profile: make heap snapshots viewable in vscode viewer --> - [ ] #54303 <!-- LinearAlgebra: improve type-inference in Symmetric/Hermitian matmul --> - [ ] #52694 <!-- Reinstate similar for AbstractQ for backward compatibility --> - [ ] #54737 <!-- LazyString in interpolated error messages involving types --> - [ ] #54738 <!-- serialization: fix relocatability bug --> - [ ] #55052 <!-- Fix `(l/r)mul!` with `Diagonal`/`Bidiagonal` --> Non-merged PRs with backport label: - [ ] #55220 <!-- `isfile_casesensitive` fixes on Windows --> - [ ] #55169 <!-- `propertynames` for SVD respects private argument --> - [ ] #55013 <!-- [docs] change docstring to match code --> - [ ] #51479 <!-- prevent code loading from lookin in the versioned environment when building Julia --> - [ ] #50813 <!-- More doctests for Sockets and capitalization fix --> - [ ] #50157 <!-- improve docs for `@inbounds` and `Base.@propagate_inbounds` --> - [ ] #41244 <!-- Fix shell `cd` error when working dir has been deleted -->
…5359) This should hopefully fix the failing tests. Co-authored-by: Kristoffer Carlsson <[email protected]>
Backported PRs: - [x] #54962 <!-- Add timing to precompile trace compile --> - [x] #55180 <!-- compress jit debuginfo for easy memory savings --> - [x] #54919 <!-- Fix annotated join with non-concrete eltype iters --> - [x] #55013 <!-- [docs] change docstring to match code --> - [x] #55017 <!-- TOML: Make `Dates` a type parameter --> - [x] #54033 <!-- Fix a bug in `stack`'s DimensionMismatch error message --> - [x] #55242 <!-- fix at-main docstring to not code quote a compat box --> - [x] #55261 <!-- Make `jl_*affinity` tests more portable --> - [x] #54736 <!-- specificity: ensure fast-path in `sub/eq_msp` handle missing `UnionAll` wrapper correctly. --> - [x] #55299 <!-- typeintersect: fix bounds merging during inner `intersect_all`. --> - [x] #55302 <!-- Add `lbt_forwarded_funcs()` to debug LBT forwarding issues --> - [x] #55148 <!-- Random: Mark unexported public symbols as public --> - [x] #55303 <!-- avoid overflowing show for OffsetArrays around typemax --> - [x] #55317 <!-- Restrict argument to `isleapyear(::Integer)` --> - [x] #55327 <!-- Profile: Fix stdlib paths --> - [x] #55330 <!-- [libblastrampoline] Bump to v5.11.0 --> - [x] #55310 <!-- Preserve structure in scaling triangular matrices by NaN --> - [x] #55329 <!-- mapreduce: don't inbounds unknown functions --> - [x] #55356 <!-- Profile: close files when assembling heap snapshot --> - [x] #55371 <!-- Fix tr for block SymTridiagonal --> - [x] #55307 <!-- Make REPL.TerminalMenus public --> - [x] #55362 <!-- inference: fix missing LimitedAccuracy markers --> - [x] #55306 <!-- AllocOpt: Fix stack lowering where alloca continas boxed and unboxed data --> - [x] #55395 <!-- fix #55389: type-unstable `join` --> - [x] #55226 <!-- re-add `unsafe_convert` for Reinterpret and Reshaped array --> - [x] #55405 <!-- handle unbound vars in NTuple fields --> - [x] #55365 <!-- ml-matches: ensure all methods are included --> - [x] #55428 <!-- codegen: move undef freeze before promotion point --> - [x] #55419 <!-- 
`stale_cachefile`: handle if the expected cache file is missing --> - [x] #55470 <!-- Add push! implementation for AbstractArray depending only on resize! --> - [x] #55483 <!-- fix hierarchy level of "API reference" in `Dates` documentation --> - [x] #55268 <!-- simplify complex atanh and remove singularity perturbation --> - [x] #55441 <!-- fix Event to use normal Condition variable --> - [x] #55413 <!-- subtyping: fast path for lhs union and rhs typevar --> - [x] #55492 <!-- build: add missing dependencies for expmap --> - [x] #55507 <!-- Fix fast getptls ccall lowering. --> - [x] #55424 <!-- add missing clamp function for IOBuffer --> - [x] #55504 <!-- Update symmetric docstring to reflect the type of uplo --> - [x] #55107 <!-- Make the memory GEP an inbounds GEP since the bounds check has happened somewhere else --> - [x] #55411 <!-- Vendor the terminfo database for use with base/terminfo.jl --> - [x] #55452 <!-- Do not load `ScopedValues` with `using` --> - [x] #55407 <!-- Remove deprecated non string API for LLVM pass pipeline and parse all options --> - [x] #55461 <!-- 🤖 [master] Bump the StyledStrings stdlib from d7496d2 to f6035eb --> - [x] #55433 <!-- Backport #55407 to 1.11 --> - [x] #55225 <!-- [1.11 backport] trace-compile: don't generate `precompile` statements for OpaqueClosure methods (#55072) --> - [x] #55212 <!-- Make `Base.depwarn()` public --> - [x] #55052 <!-- Fix `(l/r)mul!` with `Diagonal`/`Bidiagonal` --> - [x] #55251 <!-- Restrict binary ops for Diagonal and Symmetric to Number eltypes --> - [x] #55295 <!-- LAPACK: Aggressive constprop to concretely infer syev!/syevd! --> - [x] #55522 <!-- Fix tr for Symmetric/Hermitian block matrices --> Need manual backport: - [x] #55342 <!-- Ensure bidiagonal setindex! 
does not read indices in error message --> Contains multiple commits, manual intervention needed: - [ ] #55336 <!-- codegen: take gc roots (and alloca alignment) more seriously --> Non-merged PRs with backport label: - [ ] #55506 <!-- Fix indexing in _mapreducedim for OffsetArrays --> - [ ] #55500 <!-- make jl_thread_suspend_and_get_state safe --> - [ ] #55499 <!-- propagate the terminal's `displaysize` to the `IOContext` used by the REPL --> - [ ] #55458 <!-- Allow for generically extracting unannotated string --> - [ ] #55457 <!-- Make AnnotateChar equality consider annotations --> - [ ] #55453 <!-- Privatise the annotations API, for StyledStrings --> - [ ] #55443 <!-- Add test for upper/lower/titlecase and fix call --> - [ ] #55355 <!-- relocation: account for trailing path separator in depot paths --> - [ ] #55220 <!-- `isfile_casesensitive` fixes on Windows --> - [ ] #55169 <!-- `propertynames` for SVD respects private argument --> - [ ] #54457 <!-- Make `String(::Memory)` copy --> - [ ] #53957 <!-- tweak how filtering is done for what packages should be precompiled --> - [ ] #51479 <!-- prevent code loading from lookin in the versioned environment when building Julia --> - [ ] #50813 <!-- More doctests for Sockets and capitalization fix --> - [ ] #50157 <!-- improve docs for `@inbounds` and `Base.@propagate_inbounds` --> - [ ] #41244 <!-- Fix shell `cd` error when working dir has been deleted -->
Backported PRs: - [x] #50832 <!-- Subtype: bug fix for bounds with deeper covariant var --> - [x] #51782 <!-- Fix remove-addrspaces pass in the presence of globals with addrspaces --> - [x] #55720 <!-- Fix `pkgdir` for extensions --> - [x] #55773 <!-- Add compat entry for `Base.donotdelete` --> - [x] #55886 <!-- irrationals: restrict assume effects annotations to known types --> - [x] #55867 <!-- update `hash` doc string: `widen` not required any more --> - [x] #56148 <!-- Make loading work when stdlib deps are missing in the manifest --> - [x] #55870 <!-- fix infinite recursion in `promote_type` for `Irrational` --> - [x] #56252 <!-- REPL: fix brace detection when ' is used for transpose --> - [x] #56264 <!-- inference: fix inference error from constructing invalid `TypeVar` --> - [x] #56276 <!-- move time_imports and trace_* macros to Base but remain owned by InteractiveUtils --> - [x] #56254 <!-- REPL: don't complete str and cmd macros when the input matches the internal name like `r_` to `r"` --> - [x] #56280 <!-- Fix trampoline warning on x86 as well --> - [x] #56304 <!-- typeintersect: more fastpath to skip intersect under circular env --> - [x] #56306 <!-- InteractiveUtils.jl: fixes issue where subtypes resolves bindings and causes deprecation warnings --> - [x] #42080 <!-- recommend explicit `using Foo: Foo, ...` in package code (was: "using considered harmful") --> - [x] #56441 <!-- Profile: mention `kill -s SIGUSR1 julia_pid` for Linux --> - [x] #56511 <!-- The `info` in LAPACK calls should be a Ref instead of a Ptr --> - [x] #55052 <!-- Fix `(l/r)mul!` with `Diagonal`/`Bidiagonal` --> - [x] #52694 <!-- Reinstate similar for AbstractQ for backward compatibility -->