Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(ingestion/tableau): hidden asset handling #11559

Conversation

haeniya
Copy link
Contributor

@haeniya haeniya commented Oct 8, 2024

Description

This PR enables the configuration of how to handle hidden assets in Tableau. Views and Dashboards can be hidden within Workbooks. Hidden assets can be identified by a missing/blank luid, as stated in the documentation: https://help.tableau.com/current/api/metadata_api/en-us/reference/view.doc.html

image

This PR introduces two new config properties tags_for_hidden_assets and ingest_hidden_assets to better control what to do with hidden assets. They can either be skipped completely or tagged.

Configuration in the recipe could look like this:

source:
  type: tableau
  config:
    # Coordinates
    connect_uri: https://tableau.example.com

    # Handling of hidden assets
    tags_for_hidden_assets: ['private', 'hidden']
    # or completely skip hidden assets with
    # ingest_hidden_assets: False   

    # Credentials
    username: "${TABLEAU_USER}"
    password: "${TABLEAU_PASSWORD}"

sink:
  type: "datahub-rest"
  config:
    server: "http://localhost:8080"

Closes #6421

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.
  • For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub

@github-actions github-actions bot added ingestion PR or Issue related to the ingestion of metadata community-contribution PR or Issue raised by member(s) of DataHub Community labels Oct 8, 2024
# Add hidden tags if sheet is hidden (blank luid)
tags.extend(self.config.tags_for_hidden_assets)

chart_snapshot.aspects.append(
Copy link
Contributor Author

@haeniya haeniya Oct 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seemed to be a general issue with the tag ingestion. If all tags are removed in Tableau they wouldn't get removed in Datahub because tags are only saved if there are some tags present. Now, we basically always save the GlobalTags aspect even if there are no tags. But this allows us to properly reset the tags if they were removed from Tableau. I refactored this for all the assets with tags.

This resulted in quite a big changeset in the golden test files. Let me know if you see an issue with this. Thanks.

@haeniya
Copy link
Contributor Author

haeniya commented Oct 24, 2024

Hi @hsheth2, would be great to get your feedback on this. Many thanks.

Copy link
Collaborator

@mayurinehate mayurinehate left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly good, requested minor change to avoid breaking change to default behaviour.

@hsheth2 hsheth2 added the pending-submitter-response Issue/request has been reviewed but requires a response from the submitter label Nov 20, 2024
@datahub-cyborg datahub-cyborg bot added needs-review Label for PRs that need review from a maintainer. and removed pending-submitter-response Issue/request has been reviewed but requires a response from the submitter labels Nov 29, 2024
@mayurinehate mayurinehate added pending-submitter-response Issue/request has been reviewed but requires a response from the submitter and removed needs-review Label for PRs that need review from a maintainer. labels Dec 2, 2024
@datahub-cyborg datahub-cyborg bot added needs-review Label for PRs that need review from a maintainer. and removed pending-submitter-response Issue/request has been reviewed but requires a response from the submitter labels Dec 3, 2024
@mayurinehate mayurinehate self-requested a review December 3, 2024 13:45
@datahub-cyborg datahub-cyborg bot added merge-pending-ci A PR that has passed review and should be merged once CI is green. and removed needs-review Label for PRs that need review from a maintainer. labels Dec 3, 2024
@haeniya
Copy link
Contributor Author

haeniya commented Dec 9, 2024

@mayurinehate and @hsheth2 anything missing here?

@mayurinehate
Copy link
Collaborator

Merging the PR now.

@mayurinehate mayurinehate merged commit 4811de1 into datahub-project:master Dec 9, 2024
74 checks passed
@YuriyGavrilov
Copy link
Contributor

Thank you a lot 🙏 @haeniya @mayurinehate

@mayurinehate
Copy link
Collaborator

@haeniya I would like to revisit and hear your thoughts on the need for config tags_for_hidden_assets as opposed to tagging hidden resources with a hard-coded "hidden" tag by default. In general - we are trying to minimize custom configs that do not yield much value in order to simplify the source configuration and future changes in code.

@haeniya
Copy link
Contributor Author

haeniya commented Dec 12, 2024

@haeniya I would like to revisit and hear your thoughts on the need for config tags_for_hidden_assets as opposed to tagging hidden resources with a hard-coded "hidden" tag by default. In general - we are trying to minimize custom configs that do not yield much value in order to simplify the source configuration and future changes in code.

Hi @mayurinehate, so in our case we want two tags, "private" and "hidden", to be added to hidden assets. We have a custom view that filters out "private" assets from the "public" catalog. We could also change the definition of the custom view to also exclude "hidden" assets but I think it makes sense to have this property anyway. I can imagine, there might be users that want this for example in a different language so I probably wouldn't hardcode it to just "hidden".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
community-contribution PR or Issue raised by member(s) of DataHub Community ingestion PR or Issue related to the ingestion of metadata merge-pending-ci A PR that has passed review and should be merged once CI is green.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Tableau ingestion didn't avoid unpublished sheets.
5 participants