Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(python, rust): add feature operation #2712

Merged
merged 2 commits into from
Sep 18, 2024

Conversation

ion-elgreco
Copy link
Collaborator

@ion-elgreco ion-elgreco commented Jul 28, 2024

Description

Adds a new TableFeatures enum, that is a collection of all table features irrespective of reader/writer variant. Users simply select a feature they want, and we handle whether it needs to be added in readerFeatures or writerFeatures.

TableFeatures enum is exposed to python as well.

Related Issue(s)

Documentation

@github-actions github-actions bot added binding/python Issues for the Python package binding/rust Issues for the Rust crate labels Jul 28, 2024
@ion-elgreco ion-elgreco added this to the python v0.19 milestone Aug 3, 2024
@rtyler rtyler force-pushed the feat/add_feature_operation branch from c743970 to eeb903f Compare August 13, 2024 20:31
@rtyler rtyler modified the milestones: python v0.19, Rust v1.0.0 Aug 15, 2024
Copy link

codecov bot commented Sep 14, 2024

Codecov Report

Attention: Patch coverage is 62.12121% with 100 lines in your changes missing coverage. Please review.

Project coverage is 72.42%. Comparing base (83043b6) to head (e5fd2bc).
Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
crates/core/src/kernel/models/actions.rs 43.10% 30 Missing and 3 partials ⚠️
python/src/lib.rs 0.00% 26 Missing ⚠️
crates/core/src/operations/add_feature.rs 85.93% 15 Missing and 3 partials ⚠️
python/src/features.rs 0.00% 17 Missing ⚠️
crates/core/src/operations/convert_to_delta.rs 0.00% 1 Missing ⚠️
crates/core/src/operations/delete.rs 66.66% 0 Missing and 1 partial ⚠️
crates/core/src/operations/merge/mod.rs 66.66% 0 Missing and 1 partial ⚠️
crates/core/src/operations/update.rs 66.66% 0 Missing and 1 partial ⚠️
crates/core/src/operations/write.rs 75.00% 0 Missing and 1 partial ⚠️
crates/core/src/table/config.rs 87.50% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2712      +/-   ##
==========================================
+ Coverage   72.40%   72.42%   +0.02%     
==========================================
  Files         131      133       +2     
  Lines       40011    40238     +227     
  Branches    40011    40238     +227     
==========================================
+ Hits        28970    29143     +173     
- Misses       9165     9242      +77     
+ Partials     1876     1853      -23     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

/// High level table features
#[derive(Serialize, Deserialize, Debug, Clone, Eq, PartialEq, Hash)]
#[serde(rename_all = "camelCase")]
pub enum TableFeatures {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What should the relation between this enum and the DeltaConfigKey in src/table/config.rs be? 🤔 😕

}

/// Specify the features to be added
pub fn with_features<S: Into<TableFeatures>>(mut self, name: Vec<S>) -> Self {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just generally curious (this is by no means wrong), but why is the generic S ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember haha

Copy link
Collaborator

@hntd187 hntd187 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generally like what this does, but I have some questions about it.

  1. If I understand correctly, this only allows you to add features, not remove them right?
  2. This doesn't appear to have any mapping of the feature to their specific reader/writer versions correct? Like if you enable CDF, that's 4,1 would this just make the protocol versions 7,3?

commit_properties: CommitProperties,
}

impl super::Operation<()> for AddTableFeatureBuilder {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this do?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing at the moment, but @rtyler wanted to use this at some point

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's mostly placebo at the moment, but this ensures that the AddTableFeatureBuilder adheres to the Operation trait. There's more refactoring that's due to that Operation trait to clean things up 😄

.await
.unwrap();

let current_protocol = &result.protocol().cloned().unwrap();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what happens if you read a table with an unknown reader or writer property? Like a databricks table that has some new cool internal table feature in it. Does this fail?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It won't fail I think, since reader/writerFeature enum has a other, so those probably will end up in there, and just get tagged alon

@ion-elgreco
Copy link
Collaborator Author

I generally like what this does, but I have some questions about it.

  1. If I understand correctly, this only allows you to add features, not remove them right?
  2. This doesn't appear to have any mapping of the feature to their specific reader/writer versions correct? Like if you enable CDF, that's 4,1 would this just make the protocol versions 7,3?
  1. Yes, only adding for now. Removal should be also doable, but required more checks per feature
  2. Yup this is only tied to v3,7 system, I have never verified though what other engines such as spark do here

ion-elgreco and others added 2 commits September 17, 2024 23:38
@rtyler rtyler force-pushed the feat/add_feature_operation branch from d713ccd to e5fd2bc Compare September 17, 2024 23:38
@rtyler rtyler added this pull request to the merge queue Sep 18, 2024
Merged via the queue into delta-io:main with commit b3b23be Sep 18, 2024
24 checks passed
@rtyler rtyler modified the milestones: Rust v1.0.0, v0.23 Jan 1, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
binding/python Issues for the Python package binding/rust Issues for the Rust crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants