fix(policy): Use search to fetch all policies #4713
@@ -244,6 +237,9 @@ export const PoliciesPage = () => {
content: `Are you sure you want to remove policy?`,
onOk() {
deletePolicy({ variables: { urn: policy?.urn as string } }); // There must be a focus policy urn.
setTimeout(function () {
policiesRefetch();
nice!
@@ -282,6 +278,9 @@ export const PoliciesPage = () => {
createPolicy({ variables: { input: toPolicyInput(savePolicy) } });
}
message.success('Successfully saved policy.');
setTimeout(function () {
nice!
@Searchable = {
"fieldType": "DATETIME"
}
lastUpdatedTimestamp: optional long
Thoughts on lastModifiedMs? Only because "timestamp" can mean a few formats, and also to align with startTimeMs in the ExecutionRequestResult model that we use for sorting ingestion runs. Not a huge deal, but let me know what you think.
Oh I just followed the one added to Operation
Found it https://github.com/datahub-project/datahub/blob/master/metadata-models/src/main/pegasus/com/linkedin/common/Operation.pdl#L18 This is where I got it from. I think we've been using "last updated" in the UI so going to keep it like this
}
}

private void addPolicyToCache(final Map<String, List<DataHubPolicyInfo>> cache, final EntityResponse entityResponse) {
Nice! I love the idea of simplifying this.
_entityClient.batchGetV2(POLICY_ENTITY_NAME, new HashSet<>(policyUrns), null, authentication);
return new PolicyFetchResult(policyUrns.stream()
.map(policyEntities::get)
.filter(Objects::nonNull)
qq - do we need the double non-null filter?
what if we just filtered for where the aspect map contains the key
(minor)
so just one filter that checks whether urns exist in the map and then entity response has the aspect? feel like efficiency wise this should be exactly the same
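A minimal sketch of the single-filter variant suggested in this thread. It uses plain maps as a stand-in for the real EntityResponse type, and the class name PolicyFilterSketch, the method urnsWithPolicyInfo and the aspect name constant are assumptions made for illustration only, not code from this PR.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PolicyFilterSketch {
    // Assumed aspect name, used only for this illustration.
    static final String POLICY_INFO_ASPECT = "dataHubPolicyInfo";

    // Keeps the URNs that are present in the response map AND carry the policy info aspect.
    // `responses` maps URN -> (aspect name -> aspect value); it stands in for EntityResponse.
    static List<String> urnsWithPolicyInfo(List<String> urns,
                                           Map<String, Map<String, String>> responses) {
        return urns.stream()
            .filter(urn -> {
                Map<String, String> aspects = responses.get(urn);
                // One predicate covers both checks: missing entity and missing aspect.
                return aspects != null && aspects.containsKey(POLICY_INFO_ASPECT);
            })
            .collect(Collectors.toList());
    }
}
```

Both versions walk the URN list once, so the difference is readability rather than cost.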
* Send MCLs for each policy to refill the policy search index
*/
private void sendMCL() throws URISyntaxException {
log.info("Pushing MCLs for all policies");
what policies? bootstrap policies? do they already exist inside the document store?
Leaving this for the record: the current policy index is empty because there are no searchable fields. This step reingests the existing policies once so that they get into the index.
* @param start start offset for search results
* @param count max number of search results requested
* @return Snapshot key
* @throws RemoteInvocationException
*/
@Nonnull
public SearchResult search(@Nonnull String entity, @Nonnull String input, @Nullable Filter filter, int start,
int count, @Nonnull Authentication authentication) throws RemoteInvocationException;
public SearchResult search(@Nonnull String entity, @Nonnull String input, @Nullable Filter filter,
nice! We also have a filter method which i've used for the same purpose.
overall LGTM - let's test on demo extensively before we cut a release :p
This reverts commit 8185ba4.
…b-project#4713)" (datahub-project#4725)" This reverts commit 3584d64.
* fix(policy): Use search to fetch all policies
* Add updated timestamp
* Change refetching logic and add timeout
* Increase wait on smoke test
…t#4713)" (datahub-project#4725) This reverts commit 8185ba4.
Currently, we use the listUrns function to fetch all policies. We realized that this function does not scale well once a large number of entities have been ingested.
Instead, we will start using search to fetch all policies. One caveat: there are no Searchable annotations on any policy fields, so the policy search index is currently empty. This PR modifies ingestPoliciesStep to send MCLs for the existing policies when the search index is empty, so that the index is filled before any fetching happens.
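The backfill condition described above can be sketched roughly as follows. This is not the ingestPoliciesStep code from this PR: the PolicyIndex and MclEmitter interfaces, the class name and the method backfillIfEmpty are hypothetical stand-ins that only illustrate the "re-emit only when the index is empty" idea.

```java
import java.util.List;

class PolicyIndexBackfillSketch {
    // Hypothetical stand-in for the policy search index.
    interface PolicyIndex {
        int documentCount();
    }

    // Hypothetical stand-in for whatever emits an MCL for a single policy.
    interface MclEmitter {
        void emit(String policyUrn);
    }

    // Re-emits one MCL per existing policy, but only when the index is empty.
    // Returns the number of MCLs sent.
    static int backfillIfEmpty(PolicyIndex index, List<String> existingPolicyUrns,
                               MclEmitter emitter) {
        if (index.documentCount() > 0) {
            return 0; // Index already populated, nothing to do.
        }
        for (String urn : existingPolicyUrns) {
            emitter.emit(urn);
        }
        return existingPolicyUrns.size();
    }
}
```

Guarding on the empty index means the step can run on every startup without sending the MCLs again once the index has been populated.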
As an aside, this adds a new searchable field called lastUpdatedTimestamp, which is set on every update. List policies sorts on this field so that newly upserted policies rank first.
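To illustrate the intended ordering, here is a small in-memory sketch. In the PR the sort is applied by the search query, not in application code; the PolicyDoc class and the newestFirst method are hypothetical names used only to show "descending by lastUpdatedTimestamp".

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class PolicySortSketch {
    // Hypothetical, simplified view of an indexed policy document.
    static final class PolicyDoc {
        final String urn;
        final long lastUpdatedTimestamp;

        PolicyDoc(String urn, long lastUpdatedTimestamp) {
            this.urn = urn;
            this.lastUpdatedTimestamp = lastUpdatedTimestamp;
        }
    }

    // Returns a new list ordered by lastUpdatedTimestamp, most recent first.
    static List<PolicyDoc> newestFirst(List<PolicyDoc> docs) {
        List<PolicyDoc> sorted = new ArrayList<>(docs);
        sorted.sort(Comparator.comparingLong((PolicyDoc d) -> d.lastUpdatedTimestamp).reversed());
        return sorted;
    }
}
```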