fix(ingest/s3): Not sorting schema fields to keep original order #9349

Merged
merged 11 commits into from
Jan 29, 2024

Conversation

treff7es
Contributor

Not sorting schema fields, so that their original order is kept
Adding a way to enable `**` in path_spec include
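
As a minimal illustration of why skipping the sort matters (the field names below are hypothetical and this is not the code changed in the PR), sorting a list of inferred field names replaces the order found in the file with alphabetical order:

```python
# Hypothetical column names, in the order they appear in a file header.
inferred_fields = ["user_id", "event_time", "amount", "currency"]

# Sorting makes the order alphabetical and discards the file's own order.
sorted_fields = sorted(inferred_fields)

# Leaving the list untouched keeps the order the file was written in.
preserved_fields = list(inferred_fields)

print(sorted_fields)
print(preserved_fields)
```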

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.
  • For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub

Adding way to enable ** in path_spec include
@github-actions github-actions bot added the ingestion PR or Issue related to the ingestion of metadata label Nov 30, 2023
@@ -63,6 +63,11 @@ class Config:
description="Not listing all the files but only taking a handful amount of sample file to infer the schema. File count and file size calculation will be disabled. This can affect performance significantly if enabled",
)

allow_double_stars: bool = Field(
Collaborator

Is double star supported on all sources that use path spec? Do we have tests on the data lake source that cover the `**` behavior for includes?

I know it always works in the exclude paths, but not in include, IIRC.

Contributor Author

Double star is a valid glob pattern, so any source that supports glob should support double star as well.
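
For reference, here is a rough sketch of how `**` differs from `*` in Python's standard `glob` module. It only illustrates general glob semantics. It is not DataHub's path_spec matching, which may behave differently, and the helper names and file layout are made up for the example.

```python
import glob
import os
import tempfile


def build_demo_tree(root):
    """Create empty CSV files at three different directory depths (hypothetical layout)."""
    for rel in ("data/a.csv", "data/2023/b.csv", "data/2023/11/c.csv"):
        path = os.path.join(root, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()


def match_relative(root, pattern):
    """Return sorted matches for pattern under root, relative to root, using '/' separators."""
    # glob.escape protects any special characters in the temp directory path itself.
    hits = glob.glob(os.path.join(glob.escape(root), pattern), recursive=True)
    return sorted(os.path.relpath(h, root).replace(os.sep, "/") for h in hits)


with tempfile.TemporaryDirectory() as root:
    build_demo_tree(root)
    # A single star matches exactly one path segment.
    single = match_relative(root, "data/*/*.csv")
    # With recursive=True, a double star matches zero or more directories.
    double = match_relative(root, "data/**/*.csv")

print(single)
print(double)
```

In the standard library, `**` only has this recursive meaning when `recursive=True` is passed, so support for it can vary between glob implementations.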

@treff7es treff7es merged commit 90c8808 into datahub-project:master Jan 29, 2024
51 of 52 checks passed
Labels
ingestion PR or Issue related to the ingestion of metadata