Internal Parquet panic when using a Map type. #1619

Closed
cmackenzie1 opened this issue Sep 8, 2023 · 0 comments · Fixed by #1620
Labels
bug Something isn't working

cmackenzie1 commented Sep 8, 2023

Environment

Delta-rs version: 0.14

Binding: rust

Environment:

  • Cloud provider:
  • OS:
  • Other:

Bug

What happened:

Writing JSON records with a Map(String, String) column panics during the parquet flush and close.

thread 'local::test_issue_parquet_bit_util' panicked at 'assertion failed: `(left == right)`
  left: `1`,
 right: `0`', /Users/cole/.cargo/registry/src/index.crates.io-6f17d22bba15001f/parquet-45.0.0/src/util/bit_util.rs:272:9

What you expected to happen:

No panic; an error should be returned instead, if applicable.

How to reproduce it:

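// Imports assumed for deltalake 0.14; module paths may differ in other versions.
use std::collections::HashMap;
use deltalake::writer::{DeltaWriter, JsonWriter};
use deltalake::{DeltaOps, SchemaDataType, SchemaField, SchemaTypeMap};

// Schema with a single nullable Map(String, String) column whose values may be null.
let fields: Vec<SchemaField> = vec![SchemaField::new(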
    "metadata".to_string(),
    SchemaDataType::map(SchemaTypeMap::new(
        Box::new(SchemaDataType::primitive("string".to_string())),
        Box::new(SchemaDataType::primitive("string".to_string())),
        true,
    )),
    true,
    HashMap::new(),
)];
let schema = deltalake::Schema::new(fields);
let table = deltalake::DeltaTableBuilder::from_uri("./tests/data/map-null").build()?;
let _ = DeltaOps::from(table)
    .create()
    .with_columns(schema.get_fields().to_owned())
    .await?;

let mut table = deltalake::open_table("./tests/data/map-null").await?;

let mut writer = JsonWriter::for_table(&table).unwrap();
let _ = writer
    .write(vec![
        serde_json::json!({"metadata": {"hello": "world", "something": null}}),
    ])
    .await
    .unwrap();
writer.flush_and_commit(&mut table).await.unwrap();

More details:

This value is `true` at https://github.com/delta-io/delta-rs/blob/main/rust/src/delta_arrow.rs#L170C21-L170C25, whereas arrow defines it as `false` at https://github.com/apache/arrow-rs/blob/master/arrow-schema/src/field.rs#L230. The mismatch is also described in apache/arrow-rs#1697.
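To make the mismatch concrete, here is a minimal std-only sketch. It assumes the flag in question is the nullability of the map's inner entries struct field, which is how the linked lines read. The `Field` and `DataType` types, the `map_field` and `check_map_entries` helpers, and the `key`/`value` child names are hypothetical stand-ins for illustration, not the arrow-rs or delta-rs API.

```rust
// Hypothetical, simplified stand-ins for Arrow schema types (NOT the arrow-rs API).
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Utf8,
    Struct(Vec<Field>),
    // Map(entries field, keys_sorted)
    Map(Box<Field>, bool),
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
}

impl Field {
    fn new(name: &str, data_type: DataType, nullable: bool) -> Self {
        Field { name: name.to_string(), data_type, nullable }
    }
}

/// Builds a map field whose inner "entries" struct is non-nullable.
/// The map itself and its values may still be nullable; keys may not.
fn map_field(name: &str, key: DataType, value: DataType, value_nullable: bool, nullable: bool) -> Field {
    let entries = Field::new(
        "entries",
        DataType::Struct(vec![
            Field::new("key", key, false),
            Field::new("value", value, value_nullable),
        ]),
        false, // assumed to be the flag this issue is about
    );
    Field::new(name, DataType::Map(Box::new(entries), false), nullable)
}

/// Reports a nullable entries field as an error instead of panicking later.
fn check_map_entries(field: &Field) -> Result<(), String> {
    match &field.data_type {
        DataType::Map(entries, _) if entries.nullable => Err(format!(
            "map field '{}' has a nullable entries field '{}'",
            field.name, entries.name
        )),
        _ => Ok(()),
    }
}

fn main() {
    // Nullable map with nullable values, as in the reproduction above.
    let good = map_field("metadata", DataType::Utf8, DataType::Utf8, true, true);
    assert!(check_map_entries(&good).is_ok());

    // The shape described in the issue: entries marked nullable.
    let mut bad = good.clone();
    if let DataType::Map(entries, _) = &mut bad.data_type {
        entries.nullable = true;
    }
    assert!(check_map_entries(&bad).is_err());
}
```

A check of this kind would surface the schema mismatch as a `Result` error at conversion time, which matches the expectation stated above that the writer should not panic.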

cmackenzie1 added the bug label Sep 8, 2023
wjones127 added a commit that referenced this issue Sep 11, 2023
# Description

This value was `true`, whereas arrow defines it as always `false`:
https://github.com/apache/arrow-rs/blob/master/arrow-schema/src/field.rs#L230.

This is also described in apache/arrow-rs#1697.

This also replaces `key_value` as the struct name with `entries` to
remain consistent with
https://github.com/apache/arrow-rs/blob/878217b9e330b4f1ed13e798a214ea11fbeb2bbb/arrow-schema/src/datatype.rs#L247-L250

# Related Issue(s)

- closes #1619 

---------

Co-authored-by: Will Jones <[email protected]>