Thank you for reporting this. All tensor-contraction-like operations go through the same interface (module.py:_contract). Inside deferred.py:contract we then look for cases that can use the "fast" paths for vector-vector, matrix-vector and matrix-matrix multiplication. The eager implementation of contract does not perform this optimization and always falls back to np.einsum, which apparently runs much slower than np.matmul, even when the einsum expression corresponds to a matrix-matrix multiplication.
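A quick way to check the einsum-versus-matmul gap on a given machine is sketched below. It uses plain NumPy only, and the array size (256) and repeat count are arbitrary choices for illustration. Timings depend on the BLAS build and hardware, so none are quoted here.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((256, 256))
b = rng.standard_normal((256, 256))

# Both calls compute the same matrix-matrix product.
via_einsum = np.einsum("ij,jk->ik", a, b)
via_matmul = np.matmul(a, b)
same_result = np.allclose(via_einsum, via_matmul)

# Time each formulation; results vary by machine, so just print them.
t_einsum = timeit.timeit(lambda: np.einsum("ij,jk->ik", a, b), number=5)
t_matmul = timeit.timeit(lambda: np.matmul(a, b), number=5)

print(f"same result: {same_result}")
print(f"einsum: {t_einsum:.4f}s  matmul: {t_matmul:.4f}s")
```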
We should handle "fast path" operations like dot, matmul etc. directly in module.py and array.py. We still want to keep the "check for fast paths" in deferred.py:contract, so that we still recognize cases where an einsum contraction description can be executed efficiently. But when we already know in module.py or array.py that we're in a fast path, we should just emit the corresponding task directly.
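The idea can be sketched in plain NumPy as follows. This is not the cuNumeric implementation: `canonical`, `FAST_PATHS`, `contract` and the front-end `matmul` wrapper are hypothetical names made up for illustration, and real code would emit tasks rather than call NumPy. The sketch only recognizes explicit two-operand expressions (those containing `->`) and sends everything else to `np.einsum`.

```python
import string

import numpy as np


def canonical(expr):
    """Rename index labels in order of first appearance (hypothetical helper).

    For example, "ab,bc->ac" becomes "ij,jk->ik".
    """
    names = "ijklmnopqrstuvwxyz" + "abcdefgh" + string.ascii_uppercase
    mapping = {}
    out = []
    for ch in expr.replace(" ", ""):
        if ch.isalpha():
            if ch not in mapping:
                mapping[ch] = names[len(mapping)]
            out.append(mapping[ch])
        else:
            out.append(ch)
    return "".join(out)


# Hypothetical table of contractions that have a dedicated fast routine.
FAST_PATHS = {
    "i,i->": np.dot,          # vector-vector
    "ij,j->i": np.matmul,     # matrix-vector
    "ij,jk->ik": np.matmul,   # matrix-matrix
}


def contract(expr, a, b):
    """Generic entry point: use a fast path if one matches, else einsum."""
    fast = FAST_PATHS.get(canonical(expr))
    if fast is not None:
        return fast(a, b)
    return np.einsum(expr, a, b)


def matmul(a, b):
    """Front-end entry point that already knows it is a fast path,
    so it skips the einsum-expression check entirely."""
    return np.matmul(a, b)
```

With this layout, a caller that writes `contract("ab,bc->ac", x, y)` still reaches the matrix-matrix routine, while `matmul(x, y)` goes there directly without building or inspecting an einsum expression.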
manopapad changed the title from "Eager arrays 10x slower than Numpy" to "Special-case mm, mv and vv contractions in eager path" on Feb 14, 2024
Software versions
Python : 3.10.13 | packaged by conda-forge | (main, Dec 23 2023, 15:36:39) [GCC 12.3.0]
Platform : Linux-5.4.0-169-generic-x86_64-with-glibc2.31
Legion : v23.11.00.dev-54-g40f6061
Legate : 23.11.00.dev+54.g40f6061
Cunumeric : 23.11.00.dev+36.gb2912ed7
Numpy : 1.26.4
Scipy : 1.12.0
Numba : 0.59.0
CTK package : cuda-version-11.7-h67201e3_2 (conda-forge)
GPU driver : 535.54.03
GPU devices :
GPU 0: Tesla P100-SXM2-16GB
GPU 1: Tesla P100-SXM2-16GB
GPU 2: Tesla P100-SXM2-16GB
GPU 3: Tesla P100-SXM2-16GB
Jupyter notebook / Jupyter Lab version
No response
Expected behavior
Expected Eager arrays to be closer to NumPy performance.
Observed behavior
Example code or instructions
Stack traceback or browser console output
No response