Feature/view to simd #1190
Conversation
force-pushed from 44d82c9 to 66b1684
Codecov Report

@@            Coverage Diff             @@
##           master    #1190      +/-   ##
==========================================
- Coverage   96.86%   96.79%   -0.08%
==========================================
  Files         212      213       +1
  Lines        8274     8403     +129
==========================================
+ Hits         8015     8134     +119
- Misses        259      269      +10

Continue to review full report at Codecov.
force-pushed from 66b1684 to c6de8be
@marehr ping
Review until now
include/seqan3/core/simd/concept.hpp
Outdated
@@ -29,36 +29,36 @@ namespace seqan3::detail
 // error: invalid use of incomplete type ‘struct incomplete::template_type<int>’
 // requires std::Same<decltype(a - b), simd_t>;
 template <typename simd_t>
-SEQAN3_CONCEPT Simd = requires (simd_t a, simd_t b)
+SEQAN3_CONCEPT Simd = requires (std::remove_reference_t<simd_t> a, std::remove_reference_t<simd_t> b)
a and b should be declared without std::remove_reference_t<>, because the operations (like a == b, a != b) should still be valid for reference types. And from a user's perspective, they should be able to expect that an expression a == b works for the given type.
- SEQAN3_CONCEPT Simd = requires (std::remove_reference_t<simd_t> a, std::remove_reference_t<simd_t> b)
+ SEQAN3_CONCEPT Simd = requires (simd_t a, simd_t b)
The rest of the std::remove_reference_t<> in this concept are okay.
On a side note what happens with const simds? Should we introduce a Writable/MutableSimd concept?
I thought about it and, honestly, I think we need to distinguish between them. Also, for many operations we seldom update a vector but always get a new one returned. That's at least my feeling of how we use it in the algorithms.
 * \see seqan3::detail::is_native_builtin_simd_v
 */
template <typename builtin_simd_t>
constexpr bool is_native_builtin_simd_v = is_native_builtin_simd<builtin_simd_t>::value;
why not evaluate it here as a lambda function? that would be more readable.
We can also just define the bool constants if we don't need it as a type anyway. Otherwise it follows the STL way of providing unary type traits.
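The "STL way" mentioned here can be sketched with an illustrative trait (the names below are made up, not SeqAn3's actual trait): the trait exists as a type, and the _v variable template is a shortcut evaluating it; if the type form is never needed, a plain variable template would suffice.

```cpp
#include <cstdint>
#include <type_traits>

// STL-style unary type trait: defined as a type (usable in tag
// dispatch and metaprogramming), with a _v shortcut as in the PR.
template <typename t>
struct is_four_bytes : std::bool_constant<sizeof(t) == 4>
{};

template <typename t>
constexpr bool is_four_bytes_v = is_four_bytes<t>::value;

// Alternative raised in the review: if the trait is never needed as a
// type, a plain variable template is enough.
template <typename t>
constexpr bool is_four_bytes_alt_v = sizeof(t) == 4;

static_assert(is_four_bytes_v<std::int32_t>);
static_assert(!is_four_bytes_alt_v<char>);
```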
        return this_view->padding_value;
    }
    else
    { // only increment if not at end.
Suggested change (whitespace only):
    { // only increment if not at end.
    // Thus, for the 8 sequences we need to load two times 16 consecutive bytes to fill the matrix.
    // This quadratic byte matrix can be transposed efficiently with simd instructions.
    constexpr int8_t max_size = simd_traits<max_simd_t>::length;
    constexpr int8_t num_chunks = max_size / chunk_size;
You called this chunks_per_load elsewhere:

- constexpr int8_t num_chunks = max_size / chunk_size;
+ constexpr int8_t num_chunks = chunks_per_load;
    // To fill the 16x16 matrix we need four 8x8 matrices.
    // Thus, for the 8 sequences we need to load two times 16 consecutive bytes to fill the matrix.
    // This quadratic byte matrix can be transposed efficiently with simd instructions.
    constexpr int8_t max_size = simd_traits<max_simd_t>::length;
- constexpr int8_t max_size = simd_traits<max_simd_t>::length;
+ constexpr int8_t max_size = simd_traits<simd_t>::max_length;
force-pushed from a1eee15 to 0328b07
@marehr ok, I think I addressed all your issues so far. Ready for the next ones 😏
Phew, I think the high-level design seems fine, but under the hood it is pretty messy.
{
    detail::transpose_matrix_sse4(matrix);
}
else // Element wise transpose matrix which is possibly auto vectorised.
I imagine you didn't test that
I tested everything! For SSE4, AVX2 and AVX512, and with no extension at all.
I meant that it auto vectorises.
Yes, I tested with the auto vectorisation, and in fact the intrinsics version was roughly 20% faster. So I decided to add it, but kept the auto vectorisation available for the larger instruction sets for now.
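The auto-vectorisable fallback being discussed can be sketched as a plain element-wise loop (a simplified stand-in, not the PR's actual code): the compiler may vectorise this on its own, while the intrinsics path (e.g. transpose_matrix_sse4) replaces it on matching targets.

```cpp
#include <array>
#include <cstdint>
#include <utility>

// Element-wise in-place transpose of a square byte matrix. A plain
// double loop like this is a candidate for compiler auto-vectorisation;
// the dedicated SIMD intrinsics version is hand-tuned instead.
template <std::size_t dim>
constexpr void transpose(std::array<std::array<std::int8_t, dim>, dim> & matrix)
{
    for (std::size_t i = 0; i < dim; ++i)
        for (std::size_t j = 0; j < i; ++j)
            std::swap(matrix[i][j], matrix[j][i]);
}
```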
template <Simd target_simd_t, Simd source_simd_t>
constexpr target_simd_t upcast_signed(source_simd_t const & src)
{
    if constexpr (simd_traits<source_simd_t>::max_length == 16) // SSE4
This works for now, but I'm not really a fan of the current design. It does not check whether the current architecture really supports SSE4, AVX2 and AVX512.
It will fail ungracefully if you create a simd vector that has AVX512 size, but the architecture does not include AVX512.
Well, I hope you don't mind that I really don't care about corner cases right now. We don't even have a proper testing system for this yet. Not sure if you plan to add one sometime soon, but it was already quite a bit of work to test everything properly by hand. In general, the whole design can/should be adapted to the SIMD proposal, but this is not yet relevant. We can make it safe once we have the algorithms.
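The dispatch-on-vector-size pattern being criticised can be sketched as follows (simplified; the length thresholds mirror the review comment, everything else is illustrative). As noted, it selects purely on vector width at compile time and does not verify that the running CPU actually provides the instruction set.

```cpp
#include <string_view>

// Compile-time backend selection keyed on the SIMD vector's maximum
// byte length, as in the if-constexpr chain above. No runtime CPU
// feature detection happens here.
template <int max_length>
constexpr std::string_view chosen_backend()
{
    if constexpr (max_length == 16)      // 128-bit vectors -> SSE4
        return "sse4";
    else if constexpr (max_length == 32) // 256-bit vectors -> AVX2
        return "avx2";
    else if constexpr (max_length == 64) // 512-bit vectors -> AVX512
        return "avx512";
    else
        return "scalar fallback";
}

static_assert(chosen_backend<32>() == "avx2");
```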
        debug_stream << "\n\n";
    }
    return 0;
}
no output? it would be helpful to provide output.
You mean a file containing the output? Or a comment with the output?
Either would be fine
{
    auto & it = cached_iter[i];
    max_simd_type & tmp = matrix[pos];
    tmp = simd::fill<max_simd_type>(~0);
Why not fill it here with the padding value and omit the ~0 semantics?
Because the padding value is based on the scalar type of the target vector size, which might be bigger than one byte.
force-pushed from 0328b07 to e1ef5f5
@marehr I either added all your requests or answered your comments.
Thank you, I'll have a (second) look :)
lgtm
force-pushed from e1ef5f5 to 86a55af
@marehr I know you already agreed upon everything, but I applied 99% of your suggestions. Maybe you want to still have a look?
Implements the to_simd view which does AoS to SoA transformation:
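The AoS-to-SoA idea behind the view can be sketched in scalar form (a hypothetical helper for illustration, not the view's actual code): several sequences laid out one after another are regrouped into columns, where column i holds the i-th character of every sequence, and exhausted sequences are filled with a padding value.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>
#include <vector>

// Scalar sketch of the AoS -> SoA transform: `lanes` input sequences
// become a vector of columns; real code performs this with SIMD loads
// and an efficient in-register transpose instead of scalar copies.
template <std::size_t lanes>
std::vector<std::array<std::int8_t, lanes>>
aos_to_soa(std::array<std::vector<std::int8_t>, lanes> const & seqs, std::int8_t padding)
{
    std::size_t max_len = 0;
    for (auto const & seq : seqs)
        max_len = std::max(max_len, seq.size());

    std::vector<std::array<std::int8_t, lanes>> columns(max_len);
    for (std::size_t i = 0; i < max_len; ++i)
        for (std::size_t lane = 0; lane < lanes; ++lane)
            columns[i][lane] = i < seqs[lane].size() ? seqs[lane][i] : padding;
    return columns;
}
```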