Merge pull request #172 from sterrettm2/mmx_fix
Fix for MMX instructions being generated without emms
r-devulap authored Nov 12, 2024
2 parents 9ab7d47 + 5c63eec commit d6e0d49
Showing 2 changed files with 10 additions and 10 deletions.
10 changes: 0 additions & 10 deletions README.md
@@ -80,16 +80,6 @@ benchmark](https://github.com/google/benchmark) frameworks respectively. You
can configure meson to build them both by using `-Dbuild_tests=true` and
`-Dbuild_benchmarks=true`.

### Note about building with avx512 by g++ v9 and v10

There is a risk when compile with avx512 by g++ v9 and v10,
as some `MMX Technology` instructions is used by g++ v9/v10
without clearing fpu state.
Check [issue 154](https://github.com/intel/x86-simd-sort/issues/154)
for more details.

Adding `g++` option `-mno-mmx`, which disables `MMX Technology` instructions, is a possible workaround.

## Example usage

#### Sort an array of floats
10 changes: 10 additions & 0 deletions src/xss-common-argsort.h
@@ -575,6 +575,11 @@ X86_SIMD_SORT_INLINE void xss_argsort(T *arr,

if (descending) { std::reverse(arg, arg + arrsize); }
}

#ifdef __MMX__
// Workaround for compiler bug generating MMX instructions without emms
_mm_empty();
#endif
}

template <typename T>
@@ -632,6 +637,11 @@ X86_SIMD_SORT_INLINE void xss_argselect(T *arr,
argselect_<vectype, argtype>(
arr, arg, k, 0, arrsize - 1, 2 * (arrsize_t)log2(arrsize));
}

#ifdef __MMX__
// Workaround for compiler bug generating MMX instructions without emms
_mm_empty();
#endif
}

template <typename T>
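
For readers unfamiliar with the pattern in the two hunks above, here is a minimal, self-contained sketch of the same idea: call `_mm_empty()` (the intrinsic for the `emms` instruction) at the end of a routine, guarded by `__MMX__`, so that any MMX state left behind by compiler-generated code is cleared before later x87 floating-point code runs. The names `clear_mmx_state` and `argsort_then_clear` are hypothetical and are not part of x86-simd-sort; the scalar `std::sort` merely stands in for the vectorized body of `xss_argsort`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

#ifdef __MMX__
#include <mmintrin.h> // declares _mm_empty()
#endif

// Hypothetical helper (not in the library): emits emms when MMX is
// enabled for this translation unit, and compiles to nothing otherwise
// (for example with -mno-mmx or on a non-x86 target).
inline void clear_mmx_state()
{
#ifdef __MMX__
    _mm_empty();
#endif
}

// Hypothetical stand-in for xss_argsort: returns the indices that would
// sort `arr` in ascending order, then clears MMX state on the way out,
// mirroring where the commit places its workaround.
inline std::vector<std::size_t> argsort_then_clear(const std::vector<double> &arr)
{
    std::vector<std::size_t> arg(arr.size());
    std::iota(arg.begin(), arg.end(), std::size_t{0});
    std::sort(arg.begin(), arg.end(), [&arr](std::size_t a, std::size_t b) {
        return arr[a] < arr[b];
    });

    // Debug-only check that the index array really orders the keys.
    assert(std::is_sorted(arg.begin(), arg.end(), [&arr](std::size_t a, std::size_t b) {
        return arr[a] < arr[b];
    }));

    clear_mmx_state();
    return arg;
}
```

Placing the call at the end of the public entry point, as the commit does, means callers need no special compiler flags; the cost is one extra instruction per call when MMX is enabled.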