feat(upgrades) Create new DataHubUpgrade + Restore Glossary Entities Bootstrap step #5099

Merged

chriscollins3456
Collaborator

Creates a new entity called DataHubUpgrade, which is used to keep track of who has run which upgrade steps.

I'm creating a specific instance for glossary-upgrade, which restores the indices of the Glossary Node Info and Glossary Term Info aspects. When a user boots up GMS and has not yet run this bootstrap step, the step fetches all of their terms and nodes and restores their indices. On every subsequent boot the step is skipped, as long as a glossary-upgrade DataHubUpgrade entity already exists in MySQL.
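
As a rough illustration of the run-once behavior described above (not the code in this PR), the check could be sketched like this. The class `RunOnceStep`, its `execute` method, and the in-memory set are all hypothetical stand-ins; the real step looks up a DataHubUpgrade entity in MySQL, and the point at which the marker gets written is an assumption here.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: an in-memory set stands in for DataHubUpgrade entities persisted in MySQL.
final class RunOnceStep {
    private final Set<String> completedUpgrades = new HashSet<>();

    // Runs restoreWork only if no marker exists for upgradeId.
    // Returns true if the work ran, false if it was skipped.
    boolean execute(String upgradeId, Runnable restoreWork) {
        if (completedUpgrades.contains(upgradeId)) {
            return false; // marker found: this upgrade already ran on an earlier boot
        }
        restoreWork.run();
        completedUpgrades.add(upgradeId); // assumption: marker is recorded after the work succeeds
        return true;
    }
}
```

Calling `execute("glossary-upgrade", ...)` twice on the same instance runs the work once and skips it the second time.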

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.
  • For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub


private int getAndRestoreTermAspectIndices(int start, AuditStamp auditStamp, AspectSpec termAspectSpec) throws Exception {
SearchResult termsResult = _entitySearchService.search(Constants.GLOSSARY_TERM_ENTITY_NAME, "", null, null, start, BATCH_SIZE);
List<Urn> termUrns = termsResult.getEntities().stream().map(SearchEntity::getEntity).collect(Collectors.toList());
Contributor

Let's return early if termUrns is empty. Same for the function below.
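
The early exit suggested here might look roughly like the sketch below. It is a simplified stand-in rather than the PR's method: `BatchRestore` and `restoreBatch` are hypothetical names, URNs are plain strings, and the `restoreIndices` callback replaces the real fetch-and-reindex calls.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for the batch restore method; URNs are plain strings here.
final class BatchRestore {
    // Returns the number of URNs handled; skips all downstream work when the batch is empty.
    static int restoreBatch(List<String> urns, Consumer<List<String>> restoreIndices) {
        if (urns.isEmpty()) {
            return 0; // nothing in this page of search results: end early
        }
        restoreIndices.accept(urns);
        return urns.size();
    }
}
```

With this guard in place, an empty search page never reaches the entity fetch or the index writes.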

log.info("Successfully restored glossary index");
} catch (Exception e) {
log.error("Error when running the RestoreGlossaryIndices Bootstrap Step", e);
_entityService.deleteUrn(GLOSSARY_UPGRADE_URN);
Collaborator

@dexter-mh-lee @chriscollins3456 This is probably where a retry-policy for upgrades should come into the framework! :p

For later!
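
To make the retry idea concrete, here is one possible shape for such a helper. Nothing like it exists in this PR: `UpgradeRetry` and `withRetries` are hypothetical names, and a real policy would probably add backoff between attempts and limit which exceptions are retried.

```java
import java.util.concurrent.Callable;

// Hypothetical retry helper; not part of DataHub's upgrade framework.
final class UpgradeRetry {
    // Calls the step up to maxAttempts times and rethrows the last failure if every attempt fails.
    static <T> T withRetries(int maxAttempts, Callable<T> step) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return step.call();
            } catch (Exception e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }
}
```

In a design like this, the catch block above would only delete the upgrade URN after the final attempt has failed.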

for (Urn nodeUrn: nodeUrns) {
EntityResponse nodeEntityResponse = nodeInfoResponses.get(nodeUrn);
if (nodeEntityResponse == null) {
log.info("Node not in set of entity responses {}", nodeUrn);
Collaborator

minor: consider logging this at WARN level

}
GlossaryNodeInfo nodeInfo = mapNodeInfo(nodeEntityResponse);
if (nodeInfo == null) {
log.info("Received null nodeInfo for urn {}", nodeUrn);
Collaborator

Same here: consider WARN, a higher log level.

Collaborator

jjoyce0510 left a comment

Overall looks great!

Meta-comment: I would like to see us extract some of the boilerplate upgrade logic out into a separate place so that other steps can reuse that in the future.

@chriscollins3456
Collaborator Author

> Meta-comment: I would like to see us extract some of the boilerplate upgrade logic out into a separate place so that other steps can reuse that in the future.

Totally agreed. I think once we have another one of these changes, it'll be much easier to create shared logic. I tried thinking that through this time, but with a lot of this stuff being new, I was really just focusing on getting things to work well for this step.
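
For what it's worth, one way the shared logic could eventually be factored out is a small template-method base class. This is only a sketch under assumptions: `UpgradeStepSketch`, `Result`, `upgradeId`, and `upgrade` are hypothetical names, and the set of markers stands in for persisted DataHubUpgrade entities.

```java
import java.util.Set;

// Hypothetical base class: marker bookkeeping lives here, subclasses supply only the upgrade work.
abstract class UpgradeStepSketch {
    enum Result { SKIPPED, SUCCEEDED, FAILED }

    private final Set<String> markers; // stand-in for persisted DataHubUpgrade entities

    UpgradeStepSketch(Set<String> markers) {
        this.markers = markers;
    }

    abstract String upgradeId();

    abstract void upgrade() throws Exception;

    // Shared skeleton: skip if already done, otherwise run and clean up the marker on failure.
    final Result run() {
        String id = upgradeId();
        if (markers.contains(id)) {
            return Result.SKIPPED;
        }
        markers.add(id);
        try {
            upgrade();
            return Result.SUCCEEDED;
        } catch (Exception e) {
            markers.remove(id); // remove the marker so the next boot tries again
            return Result.FAILED;
        }
    }
}
```

A glossary step would then only implement `upgradeId()` and `upgrade()`, and later steps could reuse `run()` unchanged.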

@jjoyce0510
Collaborator

Once we pass CI we are good to merge! Thanks Chris

@github-actions

github-actions bot commented Jun 6, 2022

Unit Test Results (build & test)

339 tests (+5): 339 passed (+5), 0 skipped (±0), 0 failed (±0)
82 suites (+4), 82 files (+4)
Duration: 3m 33s (+15s)

Results for commit e8cb4ad. ± Comparison against base commit 928db39.

@github-actions

github-actions bot commented Jun 6, 2022

Unit Test Results (metadata ingestion)

5 files, 5 suites
555 tests: 552 passed, 3 skipped, 0 failed
2 490 runs: 2 382 passed, 108 skipped, 0 failed
Duration: 1h 27m 59s

Results for commit e8cb4ad.

@jjoyce0510 jjoyce0510 merged commit d22180e into datahub-project:master Jun 6, 2022
maggiehays pushed a commit to maggiehays/datahub that referenced this pull request Aug 1, 2022