Added read_schema #196

Closed
wants to merge 2 commits into from

jorgecarleitao

This PR moves the functionality to read the header of a block to outside of the Block struct, so that users can read the schema without having to initialize a Reader.

Users will need to re-seek the file if they want to then pass it to Reader, but the primary goal here is to offer users the ability to read the schema, e.g. to build logical plans based on the file.
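To make the idea concrete, here is a minimal, standard-library-only sketch of reading the schema from a container file header and then rewinding the stream. It is not the code in this PR and not the avro-rs API: the names `read_schema_json`, `read_schema_then_rewind`, and `sample_header` are hypothetical, and the sketch assumes the header layout from the Avro specification (the magic bytes `Obj\x01`, a metadata map that holds `avro.schema`, then a 16-byte sync marker).

```rust
use std::io::{self, Read, Seek, SeekFrom};

fn invalid(msg: &str) -> io::Error {
    io::Error::new(io::ErrorKind::InvalidData, msg)
}

/// Decode one Avro `long` (variable-length, zig-zag encoded).
fn read_long<R: Read>(reader: &mut R) -> io::Result<i64> {
    let mut value: u64 = 0;
    let mut shift = 0u32;
    loop {
        let mut byte = [0u8; 1];
        reader.read_exact(&mut byte)?;
        value |= u64::from(byte[0] & 0x7f) << shift;
        if byte[0] & 0x80 == 0 {
            break;
        }
        shift += 7;
        if shift > 63 {
            return Err(invalid("varint too long"));
        }
    }
    Ok(((value >> 1) as i64) ^ -((value & 1) as i64))
}

/// Read a length-prefixed byte string (Avro `bytes` and `string` share this layout).
fn read_bytes<R: Read>(reader: &mut R) -> io::Result<Vec<u8>> {
    let len = read_long(reader)?;
    if len < 0 {
        return Err(invalid("negative length"));
    }
    let mut buf = Vec::new();
    reader.by_ref().take(len as u64).read_to_end(&mut buf)?;
    if buf.len() as u64 != len as u64 {
        return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "truncated header"));
    }
    Ok(buf)
}

/// Hypothetical helper: return the schema JSON stored in the file header.
pub fn read_schema_json<R: Read>(reader: &mut R) -> io::Result<String> {
    let mut magic = [0u8; 4];
    reader.read_exact(&mut magic)?;
    if &magic != b"Obj\x01" {
        return Err(invalid("not an Avro container file"));
    }
    let mut schema: Option<Vec<u8>> = None;
    loop {
        // The metadata map is a series of blocks; a zero count ends it.
        let mut count = read_long(reader)?;
        if count == 0 {
            break;
        }
        if count < 0 {
            // A negative count is followed by the block size in bytes.
            let _block_size = read_long(reader)?;
            count = count.checked_neg().ok_or_else(|| invalid("bad block count"))?;
        }
        for _ in 0..count {
            let key = read_bytes(reader)?;
            let value = read_bytes(reader)?;
            if key.as_slice() == &b"avro.schema"[..] {
                schema = Some(value);
            }
        }
    }
    let mut sync_marker = [0u8; 16];
    reader.read_exact(&mut sync_marker)?;
    let schema = schema.ok_or_else(|| invalid("missing avro.schema"))?;
    String::from_utf8(schema).map_err(|_| invalid("schema is not UTF-8"))
}

/// Read the schema, then seek back to the start so the same stream can be
/// handed to a full reader afterwards.
pub fn read_schema_then_rewind<R: Read + Seek>(reader: &mut R) -> io::Result<String> {
    let schema = read_schema_json(reader)?;
    reader.seek(SeekFrom::Start(0))?;
    Ok(schema)
}

fn write_long(out: &mut Vec<u8>, n: i64) {
    let mut z = ((n << 1) ^ (n >> 63)) as u64;
    while z >= 0x80 {
        out.push((z as u8 & 0x7f) | 0x80);
        z >>= 7;
    }
    out.push(z as u8);
}

/// Build a tiny in-memory header (illustration only) carrying the given schema.
pub fn sample_header(schema_json: &str) -> Vec<u8> {
    let mut out = b"Obj\x01".to_vec();
    write_long(&mut out, 1); // one metadata entry in this block
    for field in [&b"avro.schema"[..], schema_json.as_bytes()] {
        write_long(&mut out, field.len() as i64);
        out.extend_from_slice(field);
    }
    write_long(&mut out, 0); // end of the metadata map
    out.extend_from_slice(&[0u8; 16]); // sync marker
    out
}
```

With a real file the same pattern would apply: open it with `std::fs::File::open`, call `read_schema_then_rewind(&mut file)`, and only afterwards pass `file` on to the reader.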

Built on top of #194

@Dandandan

@iemejia @jorgecarleitao

Has the official Rust implementation moved to this repo / space after the donation to Apache?

https://github.com/apache/avro/tree/master/lang/rust

If so, I think it makes sense to update some pointers in this repository.

@iemejia

iemejia commented Sep 17, 2021

For some extra context, the idea was to have a consolidated upstream Apache Avro Rust implementation, so we contacted @flavray and the Yelp authors to have the code donated, and fortunately this happened successfully. The migration of the codebase to Apache Avro has already happened. However, we have not yet done the first release, so these changes can probably still get in. Some things did not happen as planned (due to multiple reasons and personal changes), and we are lacking maintainers, so any help is welcome. We expect a new Avro release that will include the Rust implementation 'soon', so if you make the changes there we can get them in.

Slightly unrelated, but good for awareness: we expect the first release to be fully consistent with the avro-rs APIs, but this might change in the future. The Materialize team has a fork of this repo with many incompatible API changes but also some niceties, like better Avro format support and faster encoding/decoding, that they were willing to donate at some point. Sadly, because of other priorities, nobody has worked on moving that code to the Apache side. It is the same issue: we need more maintainers, and it is not only about putting the code in.

CC @RyanSkraba since we are syncing about the next release

@Dandandan

Thanks @iemejia for the full context and status of the Rust implementation!
Getting the Materialize version in seems great.

Also let me know when I should give access to avro-tools on crates.io.

@Igosuki you recently contributed the Avro table provider in DataFusion - maybe you're interested in helping out on the Apache Avro side as well :D?

@Igosuki

Igosuki commented Sep 18, 2021

I didn't know this was being migrated to Apache! I have a fork where I added protocol support for schemas generated from IDL, here: Igosuki@9f51ffa. Let me know if there's anything I can help with.

@jorgecarleitao
Author

I can help with the maintenance. After jorgecarleitao/arrow2#406 I am familiar with the format and how each type is encoded (and the code in avro-rs is very easy to follow, I must say =)

  • Is there an option to continue using github issues (instead of JIRA)?
  • could you not release under the same cadence of other implementations and/or format? Being on 0.X helps

@iemejia

iemejia commented Oct 12, 2021

I somehow missed the rest of the conversation. @Igosuki @jorgecarleitao would you be interested in taking on the maintenance on the Apache side? It would be fantastic to have more hands helping; sadly, things have not gone as expected since the move to Apache.

  • Is there an option to continue using github issues (instead of JIRA)?
    Moving away from JIRA is going to be hard because of consistency with the rest of Avro.

  • could you not release under the same cadence of other implementations and/or format? Being on 0.X helps
    The release cadence is up to the maintainers' needs. At this point we plan to do the first release with 1.11.0, but after that we could maybe do Rust-specific releases if required. The issue is the usual one: the Apache 72h vote.

Also, two things: Avro does NOT follow semver, and for the particular case of the Rust implementation we do not want to offer strong stability guarantees, so that new contributors can evolve the implementation.

I had a short discussion with @flavray last week and he seems to be interested in taking back some review tasks on the Apache branch, so with a little bit of help we could get things rolling again.

If you are up for helping, please write here or ping me on the ASF Slack and we can discuss more.

@Igosuki

Igosuki commented Oct 13, 2021

I am interested in helping maintain the Apache side. I don't know what you have planned for future releases, but I already have quite a few things in mind, aiming to reach feature parity with Java.

@jorgecarleitao closed this by deleting the head repository Dec 5, 2024