Considering migrate to portable-simd #580

sundy-li · 2021-11-06T10:24:27Z

It seems packed_simd_2 is no longer being developed.

https://github.com/rust-lang/portable-simd/

jorgecarleitao · 2021-11-10T16:58:02Z

I asked for guidance on the Zulip channel: https://rust-lang.zulipchat.com/#narrow/stream/257879-project-portable-simd/topic/is.20it.20mature.20to.20switch.20to.20portable-simd.3F/near/261018409

jorgecarleitao · 2021-11-10T17:27:27Z

IMO this feels like a super cool issue. If anyone would like to pick this one up, go ahead.

jorgecarleitao · 2021-11-10T20:40:00Z

FWIW, a quick perf benchmark shows benefits for the sum of non-null values:

core_simd_sum 2^20 f32     time:   [174.18 us 174.37 us 174.59 us]
packed_simd_sum 2^20 f32   time:   [183.91 us 184.10 us 184.33 us]
nonsimd_sum 2^20 f32       time:   [193.12 us 194.76 us 197.03 us]
naive_sum null 2^20 f32    time:   [1.6468 ms 1.6513 ms 1.6555 ms]

with

#![feature(portable_simd)]

use criterion::{criterion_group, criterion_main, Criterion};

use std::convert::TryInto;

use core_simd::f32x16;
use packed_simd::f32x16 as p_f32x16;

const LANES: usize = 16;

/// Sums `values` with packed_simd: 16-lane vector accumulation, then a horizontal sum.
pub fn packed_simd_sum(values: &[f32]) -> f32 {
    let chunks = values.chunks_exact(LANES);
    let remainder = chunks.remainder();

    let sum = chunks.fold(p_f32x16::default(), |acc, chunk| {
        let chunk: [f32; 16] = chunk.try_into().unwrap();
        let chunk: p_f32x16 = p_f32x16::from_slice_unaligned(&chunk);

        acc + chunk
    });

    let remainder: f32 = remainder.iter().copied().sum();

    sum.sum() + remainder
}

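/// Sums `values` with core_simd (portable-simd): 16-lane vector accumulation,
/// then a manual lane-by-lane reduction of the accumulator.
pub fn core_simd_sum(values: &[f32]) -> f32 {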
    let chunks = values.chunks_exact(LANES);
    let remainder = chunks.remainder();

    let sum = chunks.fold(f32x16::default(), |acc, chunk| {
        let chunk: [f32; 16] = chunk.try_into().unwrap();
        let chunk: f32x16 = f32x16::from_array(chunk);

        acc + chunk
    });

    let remainder: f32 = remainder.iter().copied().sum();

    let mut reduced = 0.0f32;
    for i in 0..LANES {
        reduced += sum[i];
    }
    reduced + remainder
}

/// Same chunked algorithm with a plain `[f32; LANES]` accumulator and no SIMD types.
pub fn nonsimd_sum(values: &[f32]) -> f32 {
    let chunks = values.chunks_exact(LANES);
    let remainder = chunks.remainder();

    let sum = chunks.fold([0.0f32; LANES], |mut acc, chunk| {
        let chunk: [f32; LANES] = chunk.try_into().unwrap();
        for i in 0..LANES {
            acc[i] += chunk[i];
        }
        acc
    });

    let remainder: f32 = remainder.iter().copied().sum();

    let mut reduced = 0.0f32;
    (0..LANES).for_each(|i| {
        reduced += sum[i];
    });
    reduced + remainder
}

/// Baseline: a strict left-to-right sum over the iterator.
pub fn naive_sum(values: &[f32]) -> f32 {
    values.iter().sum()
}

fn add_benchmark(c: &mut Criterion) {
    (10..=20).step_by(2).for_each(|log2_size| {
        let size = 2usize.pow(log2_size);
        let array = (0..size)
            .map(|x| std::f32::consts::PI * x as f32 * x as f32 - std::f32::consts::PI * x as f32)
            .collect::<Vec<_>>();

        c.bench_function(&format!("core_simd_sum 2^{} f32", log2_size), |b| {
            b.iter(|| core_simd_sum(&array))
        });
        c.bench_function(&format!("packed_simd_sum 2^{} f32", log2_size), |b| {
            b.iter(|| packed_simd_sum(&array))
        });
        c.bench_function(&format!("nonsimd_sum 2^{} f32", log2_size), |b| {
            b.iter(|| nonsimd_sum(&array))
        });
        c.bench_function(&format!("naive_sum null 2^{} f32", log2_size), |b| {
            b.iter(|| naive_sum(&array))
        });
    });
}

criterion_group!(benches, add_benchmark);
criterion_main!(benches);

and

[package]
name = "test"
version = "0.1.0"
edition = "2018"

[dependencies]
core_simd = { git = "https://github.com/rust-lang/portable-simd" }
packed_simd = { version = "0.3", package = "packed_simd_2" }

[dev-dependencies]
criterion = "0.3"

[[bench]]
name = "sum"
harness = false
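
A likely reason for the gap between naive_sum and the chunked variants is summation order. Floating-point addition is not associative, so a strict left-to-right sum forms one long dependency chain that the compiler generally cannot reorder, while the chunked versions keep 16 independent partial sums. The sketch below shows the two orderings in stable Rust without any SIMD crate; `sequential_sum` and `lane_sum` are illustrative names, not part of the benchmark above.

```rust
/// Number of independent accumulators; 16 mirrors the LANES constant in the benchmark.
const LANES: usize = 16;

/// Strict left-to-right sum: every addition depends on the previous result.
pub fn sequential_sum(values: &[f32]) -> f32 {
    values.iter().fold(0.0f32, |acc, v| acc + v)
}

/// Lane-wise sum: LANES independent partial sums, combined once at the end.
pub fn lane_sum(values: &[f32]) -> f32 {
    let chunks = values.chunks_exact(LANES);
    let remainder = chunks.remainder();

    let mut acc = [0.0f32; LANES];
    for chunk in chunks {
        // Each lane only ever adds into its own accumulator slot.
        for (a, v) in acc.iter_mut().zip(chunk) {
            *a += *v;
        }
    }

    // Elements that did not fill a whole chunk are summed separately.
    let tail: f32 = remainder.iter().sum();
    acc.iter().sum::<f32>() + tail
}
```

For inputs whose partial sums are exactly representable (for example small integers) both functions return the same value. For general f32 data the two orderings can round differently, so results may differ in the last bits.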

jorgecarleitao · 2021-11-11T05:52:06Z

See also https://github.com/DataEngineeringLabs/simd-benches, where I am benchmarking the algorithms.

Dandandan · 2021-11-11T12:02:33Z

A cool thing is support for gather operations, which could speed up take.
Reference:
https://rust-lang.github.io/portable-simd/core_simd/simd/struct.Simd.html#method.gather_or
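
To make the idea concrete, here is a scalar model of a gather with a fallback value and a `take` built on top of it. This is only a sketch of the semantics in stable Rust: `gather_or_scalar` and `take` are hypothetical helpers, not the portable-simd API, and using 0.0 for out-of-bounds indices is an assumption made for illustration (arrow2's take may treat such indices differently).

```rust
use std::convert::TryInto;

/// Scalar model of a gather: lane `i` reads `values[indices[i]]`,
/// or keeps the fallback `or[i]` when the index is out of bounds.
pub fn gather_or_scalar<const N: usize>(
    values: &[f32],
    indices: [usize; N],
    or: [f32; N],
) -> [f32; N] {
    let mut out = or;
    for (slot, &idx) in out.iter_mut().zip(indices.iter()) {
        if let Some(v) = values.get(idx) {
            *slot = *v;
        }
    }
    out
}

/// `take` expressed as repeated gathers over chunks of N indices.
/// Out-of-bounds indices yield 0.0 (an assumption of this sketch).
pub fn take(values: &[f32], indices: &[usize]) -> Vec<f32> {
    const N: usize = 4;
    let mut out = Vec::with_capacity(indices.len());

    let chunks = indices.chunks_exact(N);
    let remainder = chunks.remainder();
    for chunk in chunks {
        let idx: [usize; N] = chunk.try_into().unwrap();
        out.extend_from_slice(&gather_or_scalar(values, idx, [0.0; N]));
    }

    // Indices that did not fill a whole chunk are handled one by one.
    out.extend(remainder.iter().map(|&i| values.get(i).copied().unwrap_or(0.0)));
    out
}
```

A hardware gather performs the inner loop of `gather_or_scalar` as a single vector operation, which is where a speed-up for take could come from.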

jorgecarleitao · 2021-11-22T05:19:59Z

Waiting for rust-lang/portable-simd#197

Igosuki · 2022-01-12T09:56:46Z

Nota bene, this prevents compiling datafusion on stable with simd enabled when using arrow2.
Edit: scratch that, as simd is only available on nightly to begin with.

ritchie46 · 2022-01-12T10:44:23Z

> Nota bene, this prevents compiling datafusion on stable with simd enabled when using arrow2.

Maybe we can have two simd implementations separated by feature flags?
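
One way this could look is two implementations of the same function selected at compile time by a Cargo feature. The sketch below assumes a hypothetical feature named `simd` (declared as `simd = []` under `[features]` in Cargo.toml); the feature name and the `sum` function are illustrative and not taken from arrow2. The body under the feature is a stable stand-in where a packed_simd or portable-simd kernel would go.

```rust
/// Compiled only when the (hypothetical) `simd` feature is enabled.
/// A real build would call a packed_simd or portable-simd kernel here;
/// this stand-in uses 8 independent accumulators on stable Rust.
#[cfg(feature = "simd")]
pub fn sum(values: &[f32]) -> f32 {
    const LANES: usize = 8;
    let chunks = values.chunks_exact(LANES);
    let remainder = chunks.remainder();

    let mut acc = [0.0f32; LANES];
    for chunk in chunks {
        for (a, v) in acc.iter_mut().zip(chunk) {
            *a += *v;
        }
    }
    acc.iter().sum::<f32>() + remainder.iter().sum::<f32>()
}

/// Fallback compiled when the `simd` feature is disabled.
#[cfg(not(feature = "simd"))]
pub fn sum(values: &[f32]) -> f32 {
    values.iter().sum()
}
```

Callers use `sum` the same way in both configurations; only one of the two definitions exists in any given build.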

Igosuki · 2022-01-12T12:13:11Z

Features tied to a specific rustc version make things very clunky... I think we'll have to limit datafusion on arrow2 to Rust nightly for now.

jorgecarleitao · 2022-01-12T15:28:55Z

Does datafusion compile on stable with simd enabled? I think it depends on arrow, which depends on packed_simd, which requires nightly (but it has been a while).

My understanding is that simd in our whole stack (arrow, arrow2, datafusion, polars, databend, etc.) is currently only available on nightly. As far as I understand, this is one of the issues the simd working group is addressing with std::simd: making simd available on stable.

Igosuki · 2022-01-12T15:47:25Z

My bad, it is in fact only available on nightly.

jorgecarleitao added the investigation and help wanted labels on Nov 10, 2021

This was referenced Jan 9, 2022

Migrated to portable simd #747

Merged

Discussion: Switch DataFusion to using arrow2? apache/datafusion#1532

Closed

jorgecarleitao closed this as completed in #747 Mar 5, 2022

jorgecarleitao added the no-changelog label and removed the help wanted label on Mar 6, 2022
