module data_type is private in Rust Parquet 8.0.0 #1302

Closed
ScottSyms opened this issue Feb 11, 2022 · 3 comments
Labels
bug parquet Changes to the parquet crate

Comments

@ScottSyms ScottSyms added the bug label Feb 11, 2022
ScottSyms commented Feb 11, 2022

I'm running some code from Stack Overflow that works with versions of the Rust parquet crate earlier than 8.0.0. With 8.0.0, the error generated by the use parquet::data_type::ByteArray; import suggests the data_type module is no longer public. I'm not sure whether this is an intentional change in the release or a bug.

use std::{fs, path::Path, sync::Arc};
use parquet::{column::writer::ColumnWriter, data_type::ByteArray, file::{
    properties::WriterProperties,
    writer::{FileWriter, SerializedFileWriter},
}, schema::parser::parse_message_type};

fn main() {
    let path = Path::new("./sample.parquet");

    let message_type = "
        message schema {
            REQUIRED INT32 b;
            REQUIRED BINARY msg (UTF8);
        }
    ";
    let schema = Arc::new(parse_message_type(message_type).unwrap());
    let props = Arc::new(WriterProperties::builder().build());
    let file = fs::File::create(&path).unwrap();

    let mut rows: i64 = 0;
    let data = vec![
        (10, "A"),
        (20, "B"),
        (30, "C"),
        (40, "D"),
    ];

    let mut writer = SerializedFileWriter::new(file, schema, props).unwrap();
    for (key, value) in data {
        let mut row_group_writer = writer.next_row_group().unwrap();
        let id_writer = row_group_writer.next_column().unwrap();
        if let Some(mut writer) = id_writer {
            match writer {
                ColumnWriter::Int32ColumnWriter(ref mut typed) => {
                    let values = vec![key];
                    rows +=
                        typed.write_batch(&values[..], None, None).unwrap() as i64;
                },
                _ => {
                    unimplemented!();
                }
            }
            row_group_writer.close_column(writer).unwrap();
        }
        let data_writer = row_group_writer.next_column().unwrap();
        if let Some(mut writer) = data_writer {
            match writer {
                ColumnWriter::ByteArrayColumnWriter(ref mut typed) => {
                    let values = ByteArray::from(value);
                    rows += typed.write_batch(&[values], None, None).unwrap() as i64;
                }
                _ => {
                    unimplemented!();
                }
            }
            row_group_writer.close_column(writer).unwrap();
        }
        writer.close_row_group(row_group_writer).unwrap();
    }
    writer.close().unwrap();

    println!("Wrote {}", rows);

}

alamb commented Feb 12, 2022

Hi @ScottSyms -- thanks for the report, and sorry for the issue you are encountering.

I think this is fixed in arrow 9.0.0 (in #1244 from @tustvold), which is due to be released later today or tomorrow.

If you would like to stay on parquet 8.0.0, there is a workaround (add the experimental feature) described here: #1032 (comment).
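
For readers staying on parquet 8.0.0, the workaround above amounts to enabling a Cargo feature on the dependency. A minimal sketch of the Cargo.toml entry, assuming the feature is literally named experimental as the comment suggests (the linked #1032 comment is the authoritative source for the exact name):

```toml
[dependencies]
# Assumption: the feature name "experimental" is taken from the comment above.
# Enabling it is expected to make parquet::data_type importable again on 8.0.0.
parquet = { version = "8.0.0", features = ["experimental"] }
```

If the fix in #1244 ships in 9.0.0 as expected, upgrading the dependency to that release should make the feature flag unnecessary.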

@alamb alamb added the parquet Changes to the parquet crate label Feb 12, 2022
@ScottSyms
Author

Awesome, thanks for the quick response!

@alamb alamb changed the title module 'data_type' is private in Rust Parquet 8.0.0 module data_type is private in Rust Parquet 8.0.0 Feb 16, 2022