Add basic AVRO files (translated copies of the parquet testing files to avro) #62

Merged: 4 commits merged into apache:master on Sep 9, 2021

Conversation

Igosuki
Contributor

@Igosuki Igosuki commented Aug 26, 2021

N.B.: I used Spark for the translation, so there is some additional metadata in the files, but it can be removed.

@alamb alamb changed the title These are translated copies of the parquet testing files to avro. Add basic AVRO files (translated copies of the parquet testing files to avro) Aug 29, 2021
Contributor

@alamb alamb left a comment

@pitrou / @kiszk do you have any concerns or suggestions about adding several smaller AVRO files to the testing repository? They are used for apache/datafusion#910, and we may consider adding avro support to the main arrow-rs repo as well.

@pitrou
Member

pitrou commented Aug 29, 2021

This seems fine to me, but can you add a README explaining what these files are and how they were obtained?

@kiszk
Member

kiszk commented Aug 29, 2021

Looks good to me

@alamb
Contributor

alamb commented Aug 30, 2021

@Igosuki -- I added a basic README in 8d306ef -- can you provide the command you used to create these files from the original parquet?

Thanks!

@Igosuki
Contributor Author

Igosuki commented Aug 31, 2021

@Igosuki
Contributor Author

Igosuki commented Aug 31, 2021

It would be possible to use arrow-python (pyarrow) and fastavro to achieve the same; I just have a lot of Spark experience and I prefer typed code, so I went that way.
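
For illustration only, the pyarrow + fastavro route mentioned above might look roughly like the sketch below. This is not the Spark job used to produce the files in this PR. It assumes flat tables with primitive columns; the names `ARROW_TO_AVRO`, `avro_schema`, and `parquet_to_avro`, and the file paths in the usage line, are hypothetical.

```python
# Sketch only: a pyarrow + fastavro conversion, not the Spark job used for this PR.
# Assumes flat tables whose columns are all simple primitive types.

# Arrow type name (as given by str(field.type)) -> Avro primitive type.
ARROW_TO_AVRO = {
    "bool": "boolean",
    "int32": "int",
    "int64": "long",
    "float": "float",
    "double": "double",
    "string": "string",
    "binary": "bytes",
}


def avro_schema(name, columns):
    """Build an Avro record schema from (column_name, arrow_type_name) pairs."""
    fields = []
    for col, arrow_type in columns:
        if arrow_type not in ARROW_TO_AVRO:
            raise ValueError(f"unsupported Arrow type for column {col!r}: {arrow_type}")
        # Nullable union so missing values survive the conversion.
        fields.append(
            {"name": col, "type": ["null", ARROW_TO_AVRO[arrow_type]], "default": None}
        )
    return {"type": "record", "name": name, "fields": fields}


def parquet_to_avro(parquet_path, avro_path, record_name="record"):
    """Read a Parquet file with pyarrow and write its rows as Avro with fastavro."""
    # Imported lazily so the schema helper above works without these packages.
    import pyarrow.parquet as pq
    from fastavro import parse_schema, writer

    table = pq.read_table(parquet_path)
    columns = [(field.name, str(field.type)) for field in table.schema]
    schema = parse_schema(avro_schema(record_name, columns))
    with open(avro_path, "wb") as out:
        # to_pylist() yields one dict per row, which is what fastavro expects.
        writer(out, schema, table.to_pylist())
```

A call would then look like `parquet_to_avro("input.parquet", "output.avro")` (hypothetical paths). Columns with types outside the small mapping, such as timestamps or nested types, raise an error here and would need explicit handling.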

@alamb
Contributor

alamb commented Sep 9, 2021

Thanks @Igosuki ! I am sorry for the delayed response -- I am catching up from being on vacation and hope to help push your contributions over the line real soon now

data/avro/README.md (review thread: outdated, resolved)
@alamb alamb merged commit 1ec12d1 into apache:master Sep 9, 2021
@Igosuki
Contributor Author

Igosuki commented Sep 10, 2021

All good 👍
