fix(ingestion): ensure source/sink reports are always logged #4592

Merged
rslanka merged 10 commits into datahub-project:master from add-exception-handling on Apr 12, 2022

anshbansal
Collaborator

Sometimes the report is not printed, which makes problems hard to debug. This change should help with that.

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.

@github-actions

github-actions bot commented Apr 6, 2022

Unit Test Results (build & test)

  96 files    96 suites   17m 8s ⏱️
689 tests 630 ✔️ 59 💤 0

Results for commit 0ef35db.

♻️ This comment has been updated with latest results.

@github-actions

github-actions bot commented Apr 6, 2022

Unit Test Results (metadata ingestion)

       5 files  ±  0         5 suites  ±0   56m 8s ⏱️ - 10m 30s
   396 tests +  2     396 ✔️ +  2    0 💤 ±  0  0 ±0 
1 911 runs  +97  1 847 ✔️ +64  64 💤 +33  0 ±0 

Results for commit 0ef35db. ± Comparison against base commit df1d8ad.

♻️ This comment has been updated with latest results.

minor updates to language
        pipeline.run()
        logger.info("Finished metadata pipeline")
    except Exception as e:
        logger.error(f"Caught exception while running metadata ingestion: {e}")
Collaborator

Let's log the error and rethrow it so that the process doesn't exit with code 0.
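
For readers following the thread, the log-and-rethrow pattern suggested here could look roughly like the sketch below. It is not the code from this PR: `run_and_rethrow` and `FailingPipeline` are hypothetical names used only for illustration, and the only thing assumed about the real pipeline object is the `run()` method shown in the diff.

```python
import logging

logger = logging.getLogger(__name__)


class FailingPipeline:
    """Hypothetical stand-in for a pipeline whose run() fails."""

    def run(self):
        raise ValueError("bad config")


def run_and_rethrow(pipeline):
    """Run the pipeline; log any failure, then re-raise it."""
    try:
        pipeline.run()
        logger.info("Finished metadata pipeline")
    except Exception as e:
        logger.error(f"Caught exception while running metadata ingestion: {e}")
        # A bare raise re-raises the same exception with its original
        # traceback, so an uncaught failure still ends the process with
        # a non-zero exit status.
        raise
```

Calling `run_and_rethrow(FailingPipeline())` logs the error and then propagates the `ValueError` to the caller.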

Contributor

Also, we don't want to capture misconfigurations by the user in the telemetry. So, re-throwing is the right thing to do here.

Collaborator Author

@anshbansal anshbansal Apr 8, 2022

The main reason I did not re-throw the exception is that I did not want to lose the output of pipeline.pretty_print_summary. I pushed a change that returns a non-zero exit code instead. Let me know if that takes care of the concern.
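
The approach described in this reply (keep the summary, but still signal failure through the exit code) can be sketched as follows. This is a minimal illustration under stated assumptions, not the merged change: `DemoPipeline` and `run_pipeline` are hypothetical, and the stand-in only mimics the `run()` and `pretty_print_summary()` names mentioned in the thread.

```python
import logging

logger = logging.getLogger(__name__)


class DemoPipeline:
    """Hypothetical stand-in for the real pipeline object."""

    def __init__(self, fail=False):
        self.fail = fail
        self.summary_printed = False

    def run(self):
        if self.fail:
            raise RuntimeError("source connection refused")

    def pretty_print_summary(self):
        # The real method would print the source/sink reports.
        self.summary_printed = True
        print("Source/sink report (demo)")


def run_pipeline(pipeline) -> int:
    """Run the pipeline, always print the summary, return an exit code."""
    exit_code = 0
    try:
        pipeline.run()
        logger.info("Finished metadata pipeline")
    except Exception as e:
        logger.error(f"Caught exception while running metadata ingestion: {e}")
        exit_code = 1  # failure is reported via the exit code, not a re-raise
    finally:
        # Runs on both the success and the failure path.
        pipeline.pretty_print_summary()
    return exit_code
```

A command-line entry point could pass the result to `sys.exit(run_pipeline(pipeline))`. An alternative that also satisfies the rethrow request is to keep a bare `raise` in the `except` branch and leave the summary call in `finally`; the summary is then printed before the exception propagates.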

@anshbansal changed the title from "fix(ingestion): add error handling on whole pipeline" to "fix(ingestion): ensure source/sink reports are always logged" on Apr 8, 2022
@rslanka merged commit 23ece3b into datahub-project:master on Apr 12, 2022
@anshbansal deleted the add-exception-handling branch on April 12, 2022, 12:03
4 participants