Soundness: Creating MapArray from ArrayData does not perform bound checks on reading offsets #806

Closed
jorgecarleitao opened this issue Sep 26, 2021 · 2 comments

jorgecarleitao (Member) commented Sep 26, 2021

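Equivalent to #773, but for MapArray: the offsets buffer below holds only 2 values although the declared length of 10 requires 11, so reading the offsets for entry 1 goes out of bounds.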

use std::convert::TryFrom;
use std::sync::Arc;

use arrow::array::*;
use arrow::buffer::*;
use arrow::datatypes::*;

fn main() {
    let data = ArrayData::new(
        DataType::Map(
            Box::new(Field::new(
                "entries",
                DataType::Struct(vec![
                    Field::new("keys", DataType::Utf8, false),
                    Field::new("values", DataType::Utf8, false),
                ]),
                false,
            )),
            false,
        ),
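        10, // len: the array claims 10 map entries
        None, // null_count
        None, // null_bit_buffer
        0, // offset
        vec![Buffer::from_slice_ref(&[0i32, 10])], // only 2 offsets; 10 entries need 11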
        vec![StructArray::try_from(vec![
            (
                "keys",
                Arc::new(StringArray::from(vec!["hello", "", "parquet"])) as ArrayRef,
            ),
            (
                "values",
                Arc::new(StringArray::from(vec!["hello", "", "parquet"])) as ArrayRef,
            ),
        ])
        .unwrap()
        .data_ref()
        .clone()],
    );
    let a = MapArray::from(data);
    let b = a.value(1); // reads offsets[1] and offsets[2]; offsets[2] is past the end of the buffer
    let b = b.as_any().downcast_ref::<StructArray>().unwrap();
    println!("{:?}", b);
}

Running this under Miri reports:

error: Undefined Behavior: using uninitialized data, but this operation requires initialized memory
  --> /home/azureuser/projects/arrow-rs/arrow/src/array/array_map.rs:73:19
   |
73 |         let end = self.value_offsets()[i + 1] as usize;
   |                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ using uninitialized data, but this operation requires initialized memory
   |
   = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
   = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
           
   = note: inside `arrow::array::MapArray::value` at /home/azureuser/projects/arrow-rs/arrow/src/array/array_map.rs:73:19
note: inside `main` at arrow/examples/unsafe.rs:41:13
  --> arrow/examples/unsafe.rs:41:13
   |
41 |     let b = a.value(1);
   |             ^^^^^^^^^^
   = note: inside `<fn() as std::ops::FnOnce<()>>::call_once - shim(fn())` at /home/azureuser/.rustup/toolchains/nightly-2021-07-09-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:227:5
   = note: inside `std::sys_common::backtrace::__rust_begin_short_backtrace::<fn(), ()>` at /home/azureuser/.rustup/toolchains/nightly-2021-07-09-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys_common/backtrace.rs:125:18
   = note: inside closure at /home/azureuser/.rustup/toolchains/nightly-2021-07-09-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:63:18
   = note: inside `std::ops::function::impls::<impl std::ops::FnOnce<()> for &dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe>::call_once` at /home/azureuser/.rustup/toolchains/nightly-2021-07-09-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:259:13
   = note: inside `std::panicking::r#try::do_call::<&dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe, i32>` at /home/azureuser/.rustup/toolchains/nightly-2021-07-09-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:401:40
   = note: inside `std::panicking::r#try::<i32, &dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe>` at /home/azureuser/.rustup/toolchains/nightly-2021-07-09-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:365:19
   = note: inside `std::panic::catch_unwind::<&dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe, i32>` at /home/azureuser/.rustup/toolchains/nightly-2021-07-09-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panic.rs:434:14
   = note: inside closure at /home/azureuser/.rustup/toolchains/nightly-2021-07-09-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:45:48
   = note: inside `std::panicking::r#try::do_call::<[closure@std::rt::lang_start_internal::{closure#2}], isize>` at /home/azureuser/.rustup/toolchains/nightly-2021-07-09-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:401:40
   = note: inside `std::panicking::r#try::<isize, [closure@std::rt::lang_start_internal::{closure#2}]>` at /home/azureuser/.rustup/toolchains/nightly-2021-07-09-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:365:19
   = note: inside `std::panic::catch_unwind::<[closure@std::rt::lang_start_internal::{closure#2}], isize>` at /home/azureuser/.rustup/toolchains/nightly-2021-07-09-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panic.rs:434:14
   = note: inside `std::rt::lang_start_internal` at /home/azureuser/.rustup/toolchains/nightly-2021-07-09-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:45:20
   = note: inside `std::rt::lang_start::<()>` at /home/azureuser/.rustup/toolchains/nightly-2021-07-09-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:62:5

error: aborting due to previous error
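
The out-of-bounds read above happens at `self.value_offsets()[i + 1]`. As a minimal sketch of what a bounds-checked lookup could look like, the helper below uses `slice::get` so that a missing offset yields `None` instead of undefined behavior. The function name `checked_value_range` is hypothetical and not part of arrow-rs, and the sketch assumes the offsets are exposed as a slice whose length matches the buffer that was actually allocated; it does not describe how the library resolved the issue.

```rust
/// Hypothetical helper (not part of arrow-rs): returns the `[start, end)` child
/// range of entry `i`, or `None` if either offset lies outside `offsets` or
/// the offsets are negative or decreasing.
pub fn checked_value_range(offsets: &[i32], i: usize) -> Option<(usize, usize)> {
    // `get` returns `None` instead of reading past the end of the slice.
    let start = *offsets.get(i)?;
    let end = *offsets.get(i.checked_add(1)?)?;
    // Reject negative or decreasing offsets before converting to `usize`.
    if start < 0 || end < start {
        return None;
    }
    Some((start as usize, end as usize))
}
```

With the offsets from the reproduction, `checked_value_range(&[0, 10], 1)` returns `None`, because the lookup for entry 1 needs a third offset that the buffer does not contain.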

@alamb changed the title from "Soundness: MapArray does not perform bound checks on reading offsets" to "Soundness: Creating MapArray from ArrayData does not perform bound checks on reading offsets" on Sep 29, 2021

alamb (Contributor) commented Sep 29, 2021

Updating the title of the ticket to make it clear that this concerns misuse of the lower-level APIs, as described in https://github.com/apache/arrow-rs/tree/master/arrow#safety
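
One way a lower-level constructor could guard against this kind of misuse is to validate the offsets once, up front, before any accessor trusts them. The sketch below is illustrative only: `OffsetError` and `validate_offsets` are hypothetical names, not arrow-rs APIs, and it assumes 32-bit offsets and a single child array of known length.

```rust
/// Hypothetical error type for the sketch below (not part of arrow-rs).
#[derive(Debug, PartialEq)]
pub enum OffsetError {
    LengthOverflow,
    TooFewOffsets { required: usize, actual: usize },
    Negative,
    NotMonotonic { index: usize },
    ExceedsChild { last: usize, child_len: usize },
}

/// Hypothetical validation: checks that `offsets` can describe `len` entries
/// starting at `offset`, all pointing inside a child array of `child_len` items.
pub fn validate_offsets(
    offsets: &[i32],
    len: usize,
    offset: usize,
    child_len: usize,
) -> Result<(), OffsetError> {
    // `len` entries starting at `offset` need `offset + len + 1` offsets.
    let required = offset
        .checked_add(len)
        .and_then(|n| n.checked_add(1))
        .ok_or(OffsetError::LengthOverflow)?;
    if offsets.len() < required {
        return Err(OffsetError::TooFewOffsets { required, actual: offsets.len() });
    }
    // The window always holds at least one value, because `required > offset`.
    let window = &offsets[offset..required];
    if window[0] < 0 {
        return Err(OffsetError::Negative);
    }
    // Offsets must never decrease from one entry to the next.
    for (i, pair) in window.windows(2).enumerate() {
        if pair[1] < pair[0] {
            return Err(OffsetError::NotMonotonic { index: offset + i + 1 });
        }
    }
    // Non-negative start plus monotonicity means the last offset is non-negative.
    let last = window[window.len() - 1] as usize;
    if last > child_len {
        return Err(OffsetError::ExceedsChild { last, child_len });
    }
    Ok(())
}
```

Applied to the reproduction (offsets `[0, 10]`, length 10, offset 0, child length 3), this check fails with `TooFewOffsets { required: 11, actual: 2 }` before any entry is read.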

alamb (Contributor) commented Oct 29, 2021

This is a specific case of the general issue described in #817, so closing this one as a duplicate in favor of the more general ticket.

@alamb closed this as completed on Oct 29, 2021