feat(platform): add support for via nodes #9733
1c5a94b to e7be03b
* The type of the entity to be grouped.
* e.g. schemaField
*/
rawEntityType: string
One optimization on the model:
make this optional, where an absent value means "applies to all entity types".
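The suggested semantics can be sketched in a few lines. This is only an illustration of "absent means all entity types"; the class `GroupingCriterionSketch` and its method `appliesTo` are hypothetical names and are not taken from the PR.

```java
/**
 * Hypothetical sketch of an optional rawEntityType.
 * None of these names come from the PR; they only illustrate the review suggestion.
 */
final class GroupingCriterionSketch {
  // Assumption: null stands for "not set", which is read as "applies to all entity types".
  private final String rawEntityType;

  GroupingCriterionSketch(String rawEntityType) {
    this.rawEntityType = rawEntityType;
  }

  /** Returns true when this criterion should be applied to the given entity type. */
  boolean appliesTo(String entityType) {
    return rawEntityType == null || rawEntityType.equals(entityType);
  }

  public static void main(String[] args) {
    GroupingCriterionSketch all = new GroupingCriterionSketch(null);
    GroupingCriterionSketch fieldsOnly = new GroupingCriterionSketch("schemaField");
    System.out.println(all.appliesTo("dataset"));        // prints true
    System.out.println(fieldsOnly.appliesTo("dataset")); // prints false
  }
}
```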
A grouping specification for search results
"""
input GroupingSpec {
nit: Can we copy the PDL comment here?
@@ -143,6 +143,11 @@ input SearchFlags {
Whether to request for search suggestions on the _entityName virtualized field
"""
getSuggestions: Boolean

"""
Additional grouping specifications to apply to the search results
Let's maybe add a note:
Notice: This API is experimental and subject to change.
"""
The raw entity type that needs to be grouped
"""
rawEntityType: String!
EntityType -> GraphQL entity type enum
here is an example:
Conversion:
also, maybe we can call this baseEntityType
@@ -45,7 +45,8 @@ public void testDefaultSearchFlags() throws Exception {
.setSkipAggregates(false)
.setSkipHighlighting(true) // empty/wildcard
.setMaxAggValues(20)
.setSkipCache(false));
.setSkipCache(false)
.setConvertSchemaFieldsToDatasets(true));
Is this flag intentional?
@@ -7,16 +7,23 @@

@Slf4j
public class GraphRelationshipMappingsBuilder {
  public static final String EDGE_FIELD_SOURCE = "source";
  public static final String EDGE_FIELD_DESTINATION = "destination";
ty!
final InputFields inputFields = new InputFields(aspect.data());
updateInputFieldEdgesAndRelationships(
urn, inputFields, edgesToAdd, urnToRelationshipTypesBeingAdded);
} else if (aspectSpec.getName().equals(Constants.DATA_JOB_INPUT_OUTPUT_ASPECT_NAME)) {
Special cases... we'll need to come back to all of these.
Minor naming comments but otherwise that looks good
metadata-io/src/main/java/com/linkedin/metadata/search/LineageSearchService.java
Once CI is green, I think we are good to ship.
Pending a fix for the Python smoke test failures caused by GraphQL API errors related to the changes here.
What are 'Via Nodes'?
Via nodes are entities designed to represent relationships between two distinct entities in the graph model. They typically represent processes, such as queries or data jobs, that generate column-level lineage between fields, enhancing the model's ability to represent lineage. Via nodes are convenient shortcuts for performance and metadata representation when paths between two entities pass through a common node that is also an intersection point for other paths. For example, consider fine-grained lineage produced by a job J that looks like
T1 col c11, c12 --> (job J) --> T2 col c21, c22
and
T1 col c13, c14 --> (job J) --> T2 col c23
Via nodes allow us to represent and query these paths concisely, without creating edge-specific nodes. In this case, we do NOT need to mint fake ColumnProcess nodes J-1 and J-2 to keep the paths separated, unlike other metadata systems such as Apache Atlas.
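A minimal sketch of the idea, not the implementation in this PR: each column-to-column edge carries a reference to the node it passes through, so both lineage groups can share the single node J and still stay separate. The class `ViaNodeSketch`, its methods, and the choice to store `via` as an attribute on the edge are assumptions made for illustration, as is the reading that every listed source column feeds every listed destination column.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical in-memory lineage graph; names are illustrative, not from the PR. */
final class ViaNodeSketch {

  /** A directed column-level edge; via names the entity the edge passes through. */
  static final class Edge {
    final String source;
    final String destination;
    final String via;

    Edge(String source, String destination, String via) {
      this.source = source;
      this.destination = destination;
      this.via = via;
    }
  }

  private final List<Edge> edges = new ArrayList<>();

  /** Assumption: every source column feeds every destination column in the group. */
  void addLineage(List<String> sources, List<String> destinations, String via) {
    for (String s : sources) {
      for (String d : destinations) {
        edges.add(new Edge(s, d, via));
      }
    }
  }

  /** Columns directly downstream of the given column, in insertion order. */
  Set<String> downstream(String source) {
    Set<String> result = new LinkedHashSet<>();
    for (Edge e : edges) {
      if (e.source.equals(source)) {
        result.add(e.destination);
      }
    }
    return result;
  }

  /** Via nodes recorded on direct edges from source to destination. */
  Set<String> viaNodes(String source, String destination) {
    Set<String> result = new LinkedHashSet<>();
    for (Edge e : edges) {
      if (e.source.equals(source) && e.destination.equals(destination)) {
        result.add(e.via);
      }
    }
    return result;
  }

  /** Builds the two lineage groups from the example, both passing through job J. */
  static ViaNodeSketch example() {
    ViaNodeSketch g = new ViaNodeSketch();
    g.addLineage(List.of("T1.c11", "T1.c12"), List.of("T2.c21", "T2.c22"), "J");
    g.addLineage(List.of("T1.c13", "T1.c14"), List.of("T2.c23"), "J");
    return g;
  }

  public static void main(String[] args) {
    ViaNodeSketch g = example();
    System.out.println(g.downstream("T1.c11")); // prints [T2.c21, T2.c22]
    System.out.println(g.downstream("T1.c13")); // prints [T2.c23]
    System.out.println(g.viaNodes("T1.c11", "T2.c21")); // prints [J]
  }
}
```

In this sketch the two groups never mix: `T1.c11` does not reach `T2.c23`, even though both groups reference the same node J and no J-1 or J-2 nodes exist.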
Caveat emptor: via nodes are experimental, and we'll revisit this decision if this optimization doesn't yield as many benefits as we think it will.
Summary
This pull request implements 'via nodes' within the graph model and updates the annotation language for compatibility, aiming to improve the model's data representation and querying capabilities.
Model Changes
Implementation Changes in GMS
Testing
Checklist