feat(policies): Allow policies to be applied to resources based on tags #9684
Merged
...c/main/java/com/datahub/authorization/fieldresolverprovider/TagFieldResolverProvider.java (65 additions, 0 deletions)
@@ -0,0 +1,65 @@
package com.datahub.authorization.fieldresolverprovider;

import com.datahub.authentication.Authentication;
import com.datahub.authorization.EntityFieldType;
import com.datahub.authorization.EntitySpec;
import com.datahub.authorization.FieldResolver;
import com.linkedin.common.GlobalTags;
import com.linkedin.common.urn.Urn;
import com.linkedin.common.urn.UrnUtils;
import com.linkedin.entity.EntityResponse;
import com.linkedin.entity.EnvelopedAspect;
import com.linkedin.entity.client.EntityClient;
import com.linkedin.metadata.Constants;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;

/** Provides field resolver for tags given entitySpec */
@Slf4j
@RequiredArgsConstructor
public class TagFieldResolverProvider implements EntityFieldResolverProvider {

  private final EntityClient _entityClient;
  private final Authentication _systemAuthentication;

  @Override
  public List<EntityFieldType> getFieldTypes() {
    return Collections.singletonList(EntityFieldType.TAG);
  }

  @Override
  public FieldResolver getFieldResolver(EntitySpec entitySpec) {
    return FieldResolver.getResolverFromFunction(entitySpec, this::getTags);
  }

  private FieldResolver.FieldValue getTags(EntitySpec entitySpec) {
    Urn entityUrn = UrnUtils.getUrn(entitySpec.getEntity());
    EnvelopedAspect globalTagsAspect;
    try {
      // Fetch only the globalTags aspect for the entity being authorized.
      EntityResponse response =
          _entityClient.getV2(
              entityUrn.getEntityType(),
              entityUrn,
              Collections.singleton(Constants.GLOBAL_TAGS_ASPECT_NAME),
              _systemAuthentication);
      if (response == null
          || !response.getAspects().containsKey(Constants.GLOBAL_TAGS_ASPECT_NAME)) {
        return FieldResolver.emptyFieldValue();
      }
      globalTagsAspect = response.getAspects().get(Constants.GLOBAL_TAGS_ASPECT_NAME);
    } catch (Exception e) {
      log.error("Error while retrieving tags aspect for urn {}", entityUrn, e);
      return FieldResolver.emptyFieldValue();
    }
    // Resolve the field to the set of tag URNs attached to the entity.
    GlobalTags globalTags = new GlobalTags(globalTagsAspect.getValue().data());
    return FieldResolver.FieldValue.builder()
        .values(
            globalTags.getTags().stream()
                .map(tag -> tag.getTag().toString())
                .collect(Collectors.toSet()))
        .build();
  }
}
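For orientation, a hedged usage sketch of the provider above: the EntitySpec constructor arguments, the dataset URN, and the getFieldValuesFuture()/getValues() accessors are assumptions inferred from the diff rather than verified API.

import com.datahub.authentication.Authentication;
import com.datahub.authorization.EntitySpec;
import com.datahub.authorization.FieldResolver;
import com.linkedin.entity.client.EntityClient;
import java.util.Set;

public class TagResolverUsageSketch {

  // Hypothetical wiring: in practice the provider is registered alongside the
  // other EntityFieldResolverProviders used to resolve an EntitySpec.
  public static Set<String> resolveTagUrns(
      EntityClient entityClient, Authentication systemAuthentication) {
    TagFieldResolverProvider provider =
        new TagFieldResolverProvider(entityClient, systemAuthentication);

    // Made-up dataset URN purely for illustration.
    EntitySpec spec =
        new EntitySpec("dataset", "urn:li:dataset:(urn:li:dataPlatform:hive,SampleTable,PROD)");

    // Resolves to the set of tag URNs on the entity, e.g. "urn:li:tag:PII",
    // which a policy's tag condition can then be matched against.
    FieldResolver resolver = provider.getFieldResolver(spec);
    return resolver.getFieldValuesFuture().join().getValues();
  }
}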
This is an inefficient design. The URNs and their aspect reads should be batched rather than fetching each aspect one by one for each URN. I know this isn't in scope for this PR, but this auth path runs on practically every API call, so it has a high impact. Even if we don't batch URNs, we should definitely batch the aspects: there should be a batched resolver for domain, owner, data platform, group membership, and tags. All of these aspects could be fetched in one batch using the entity client's same getV2 method, passing a collection of aspect names rather than a Collections.singleton(...). This is a perfect opportunity to address the tech debt here by adding the extra aspects and improving performance.
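A rough sketch of that batching idea, assuming the same EntityClient.getV2 call used in this PR simply receives a set of aspect names instead of a singleton; the BatchedAspectFetcher class name and the choice of extra aspects are illustrative, not part of this change.

import com.datahub.authentication.Authentication;
import com.linkedin.common.GlobalTags;
import com.linkedin.common.urn.Urn;
import com.linkedin.entity.EntityResponse;
import com.linkedin.entity.EnvelopedAspect;
import com.linkedin.entity.client.EntityClient;
import com.linkedin.metadata.Constants;
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Illustrative sketch only: fetch every aspect the policy filters need in one
// getV2 call instead of one call per field resolver.
public class BatchedAspectFetcher {

  private final EntityClient _entityClient;
  private final Authentication _systemAuthentication;

  public BatchedAspectFetcher(EntityClient entityClient, Authentication systemAuthentication) {
    _entityClient = entityClient;
    _systemAuthentication = systemAuthentication;
  }

  /** One round trip for all aspects the active policy filters reference. */
  public Map<String, EnvelopedAspect> fetchAspects(Urn entityUrn, Set<String> aspectNames)
      throws Exception {
    EntityResponse response =
        _entityClient.getV2(
            entityUrn.getEntityType(),
            entityUrn,
            aspectNames, // e.g. globalTags + ownership + domains in a single call
            _systemAuthentication);
    return response == null ? Collections.emptyMap() : response.getAspects();
  }

  /** Example: pull tag URNs out of a batched response. */
  public Set<String> extractTags(Map<String, EnvelopedAspect> aspects) {
    EnvelopedAspect tagsAspect = aspects.get(Constants.GLOBAL_TAGS_ASPECT_NAME);
    if (tagsAspect == null) {
      return Collections.emptySet();
    }
    return new GlobalTags(tagsAspect.getValue().data()).getTags().stream()
        .map(tag -> tag.getTag().toString())
        .collect(Collectors.toSet());
  }
}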
After reading the adjacent code, I don't think we are sequentially performing each of these resolvers for each request. Rather, in getFieldResolvers (in this file) we are creating a dictionary of which query to run based on the EntityFieldType that an incoming request has in its filter. See datahub/metadata-auth/auth-api/src/main/java/com/datahub/authorization/ResolvedEntitySpec.java, line 25 (commit 943bb57).
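A rough sketch of the dictionary-of-resolvers shape being described; the class and the getFieldValuesFuture()/getValues() accessors are inferred from how FieldResolver.FieldValue is built in this PR and from the snippets quoted later in this thread, so treat it as an illustration rather than the exact ResolvedEntitySpec API.

import com.datahub.authorization.EntityFieldType;
import com.datahub.authorization.FieldResolver;
import java.util.Collections;
import java.util.Map;
import java.util.Set;

// Illustrative only: each provider contributes one entry per field type it can
// resolve; nothing is fetched until a filter asks for that field.
public class ResolvedEntitySpecSketch {

  private final Map<EntityFieldType, FieldResolver> fieldResolvers;

  public ResolvedEntitySpecSketch(Map<EntityFieldType, FieldResolver> fieldResolvers) {
    this.fieldResolvers = fieldResolvers;
  }

  /** Resolve a single field (e.g. TAG) on demand; empty if no resolver is registered. */
  public Set<String> getFieldValues(EntityFieldType fieldType) {
    FieldResolver resolver = fieldResolvers.get(fieldType);
    if (resolver == null) {
      return Collections.emptySet();
    }
    // The entity-client call behind this future only happens for field types
    // that some policy filter actually uses.
    return resolver.getFieldValuesFuture().join().getValues();
  }
}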
That line executes a request per field, meaning separate requests for each aspect. It is using futures, but not batching.
fieldResolvers.get(entityFieldType) is only getting one resolver and generating one future, which depends on the filter being analysed. I think this is the line responsible for multiple possible requests: datahub/metadata-service/auth-impl/src/main/java/com/datahub/authorization/PolicyEngine.java, line 230 (commit 943bb57). I think any performance improvement here would come from policies that have multiple filter conditions, such as "contains tag X AND is part of Domain Y AND is OwnedBy Z": there, three calls would happen sequentially (one of them, domain, is out of scope). The remaining field resolvers are either not used or make no call to the entity service.
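A sketch of the sequential pattern being described, reusing the ResolvedEntitySpecSketch above; the DOMAIN and OWNER field types and the URN literals are assumptions for illustration, not PolicyEngine's actual code.

import com.datahub.authorization.EntityFieldType;
import java.util.Set;

// Illustrative only: with "tag X AND domain Y AND ownedBy Z", each criterion
// triggers its own resolver, so three entity-client reads run back to back
// instead of one batched getV2 call.
public class SequentialFilterEvaluationSketch {

  public static boolean matches(ResolvedEntitySpecSketch resolvedSpec) {
    // Each line below blocks on its own aspect fetch (globalTags, then
    // domains, then ownership).
    Set<String> tags = resolvedSpec.getFieldValues(EntityFieldType.TAG);
    Set<String> domains = resolvedSpec.getFieldValues(EntityFieldType.DOMAIN);
    Set<String> owners = resolvedSpec.getFieldValues(EntityFieldType.OWNER);

    return tags.contains("urn:li:tag:X")
        && domains.contains("urn:li:domain:Y")
        && owners.contains("urn:li:corpuser:Z");
  }
}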
Right, if a single policy operates on three fields, it would benefit from fetching all three at the same time. If there are multiple policies which depend on three different fields, that would also benefit from fetching all three at the same time. The general idea is to fetch the required aspects in one call for the policies being evaluated rather than three subsequent calls. I am assuming the same resource is being passed to each policy and that we are not already fetching the same aspect for each policy today.
At this point in the code, only a single Policy’s filter gets evaluated.
Going down the route of evaluating all policies to check whether they apply to a given resource request is a large refactor of the policy engine. I would suggest a design document before doing anything else.
@RyanHolstien @david-leifker, do we have metrics on how much of a performance bottleneck the PolicyEngine is?
What happens in this case if no policies are authored with any tag predicates on them? Does performance stay the same as today, or does it degrade?
The code performs the same as today; the field resolver for tags is not invoked.
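A minimal sketch of why that holds, assuming the entity-client call sits behind a lazily created, memoized future in the spirit of getResolverFromFunction; this is an illustration, not DataHub's actual FieldResolver.

import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

// Illustrative only: the expensive fetch runs the first time someone asks for
// the field's values; if no policy filter references TAG, it never runs.
public class LazyFieldResolverSketch<T> {

  private final Supplier<T> fetch; // e.g. the getTags() call in this PR
  private CompletableFuture<T> future;

  public LazyFieldResolverSketch(Supplier<T> fetch) {
    this.fetch = fetch;
  }

  public synchronized CompletableFuture<T> getFieldValuesFuture() {
    if (future == null) {
      // Kicked off only on first access, so unused resolvers cost nothing.
      future = CompletableFuture.supplyAsync(fetch);
    }
    return future;
  }
}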