Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(ingest/lookml): ingest field tags #10792

Merged

Conversation

sid-acryl
Copy link
Collaborator

@sid-acryl sid-acryl commented Jun 27, 2024

Tags defined on lookml view (https://cloud.google.com/looker/docs/reference/param-field-tags) will be ingested as DataHub tag on schema field.

Summary by CodeRabbit

  • New Features

    • Introduced tag management for Looker metadata ingestion, allowing user-defined tags on fields.
  • Bug Fixes

    • Enhanced error reporting and tag handling logic for Looker integration.
  • Tests

    • Updated integration tests to reflect tag management changes.
    • Added new test cases for field tag ingestion.
  • Chores

    • Removed empty globalTags fields from various test JSON files to clean up data structure.

@github-actions github-actions bot added the ingestion PR or Issue related to the ingestion of metadata label Jun 27, 2024
Copy link
Contributor

coderabbitai bot commented Jun 28, 2024

Walkthrough

The changes primarily involve enhancing the Looker integration by adding tag processing capabilities to the ViewField class and removing empty globalTags entries from various JSON test files. A new tag softVersion has been introduced, and related test cases have been updated to reflect these modifications.

Changes

File(s) Change Summary
…/looker_common.py, …/lookml_source.py Added tags attribute to ViewField class, updated _get_tags_from_field_type method to incorporate tags, and adjusted related processing logic. Default empty list for tags in field_dict.
…/integration/looker/golden_*.json, …/integration/lookml/lookml_*.json, …/lookml/refinements_ingestion_golden.json Removed globalTags field from multiple JSON objects, streamlining the data structure.
…/integration/lookml/duplicate_field_ingestion_golden.json, …/integration/lookml/lkml_samples_duplicate_field/… Added "softVersion" tag to version dimension in LookML files, created a new tag entity.
…/integration/lookml/field_tag_ingestion_golden.json Introduced comprehensive metadata including containers, datasets, snapshots, and tags for a LookML project.
…/integration/lookml/test_lookml.py Added test_field_tag_ingest function to validate field tag ingestion logic.

Poem

Beneath the Looker sky so bright,

Tags now flow like beams of light.

softVersion shines, a new addition,

Each field now with a clear mission.

Empty tags vanish, making way,

For structured paths in the data's play. 🌟 🐰


Tip

Early access features: enabled

We are currently testing the following features in early access:

  • OpenAI gpt-4o model for code reviews and chat: OpenAI claims that this model is better at understanding and generating code than the previous models. We seek your feedback over the next few weeks before making it generally available.

Note:

  • You can enable or disable early access features from the CodeRabbit UI or by updating the CodeRabbit configuration file.
  • Please join our Discord Community to provide feedback and report issues.
  • OSS projects are currently opted into early access features by default.

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share
Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Outside diff range and nitpick comments (3)
metadata-ingestion/src/datahub/ingestion/source/looker/looker_common.py (1)

Line range hint 1209-1209: Consider monitoring the memory usage of cached data.

The use of functools.lru_cache on get_explore could potentially lead to memory leaks if not managed correctly. Consider implementing monitoring to track the size of the cache and its impact on memory, especially if the cache size grows significantly.

metadata-ingestion/src/datahub/ingestion/source/looker/lookml_source.py (2)

Line range hint 491-498: Refactor nested if statements for clarity.

Consider combining the nested if statements into a single if statement to improve code readability and maintainability.

- if isinstance(conn_map[key], str):
-     platform = conn_map[key]
-     if "." in platform:
-         platform_db_split = conn_map[key].split(".")
-         connection = LookerConnectionDefinition(
-             platform=platform_db_split[0],
-             default_db=platform_db_split[1],
-             default_schema="",
-         )
-         conn_map[key] = connection
+ if isinstance(conn_map[key], str) and "." in conn_map[key]:
+     platform_db_split = conn_map[key].split(".")
+     conn_map[key] = LookerConnectionDefinition(
+         platform=platform_db_split[0],
+         default_db=platform_db_split[1],
+         default_schema="",
+     )

Line range hint 1531-1533: Improve exception handling for clarity and debugging.

Utilize raise ... from err to provide better context in exception handling, distinguishing original exceptions from those raised during handling.

- except SDKError:
-     raise ValueError(
-         "Failed to retrieve connections from looker client. Please check to ensure that you have manage_models permission enabled on this API key."
-     )
+ except SDKError as e:
+     raise ValueError(
+         "Failed to retrieve connections from looker client. Please check to ensure that you have manage_models permission enabled on this API key."
+     ) from e

Also applies to: 1847-1849

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 45a8cc9 and 36c688a.

Files selected for processing (23)
  • metadata-ingestion/src/datahub/ingestion/source/looker/looker_common.py (3 hunks)
  • metadata-ingestion/src/datahub/ingestion/source/looker/lookml_source.py (1 hunks)
  • metadata-ingestion/tests/integration/looker/golden_looker_mces.json (4 hunks)
  • metadata-ingestion/tests/integration/looker/golden_test_allow_ingest.json (2 hunks)
  • metadata-ingestion/tests/integration/looker/golden_test_external_project_view_mces.json (2 hunks)
  • metadata-ingestion/tests/integration/looker/golden_test_file_path_ingest.json (2 hunks)
  • metadata-ingestion/tests/integration/looker/golden_test_independent_look_ingest.json (2 hunks)
  • metadata-ingestion/tests/integration/looker/golden_test_ingest.json (2 hunks)
  • metadata-ingestion/tests/integration/looker/golden_test_ingest_joins.json (2 hunks)
  • metadata-ingestion/tests/integration/looker/golden_test_ingest_unaliased_joins.json (2 hunks)
  • metadata-ingestion/tests/integration/looker/looker_mces_golden_deleted_stateful.json (6 hunks)
  • metadata-ingestion/tests/integration/looker/looker_mces_usage_history.json (2 hunks)
  • metadata-ingestion/tests/integration/lookml/duplicate_field_ingestion_golden.json (3 hunks)
  • metadata-ingestion/tests/integration/lookml/expected_output.json (3 hunks)
  • metadata-ingestion/tests/integration/lookml/field_tag_ingestion_golden.json (1 hunks)
  • metadata-ingestion/tests/integration/lookml/lkml_samples_duplicate_field/dataset_lineages.view.lkml (1 hunks)
  • metadata-ingestion/tests/integration/lookml/lookml_mces_api_bigquery.json (3 hunks)
  • metadata-ingestion/tests/integration/lookml/lookml_mces_api_hive2.json (3 hunks)
  • metadata-ingestion/tests/integration/lookml/lookml_mces_offline.json (3 hunks)
  • metadata-ingestion/tests/integration/lookml/lookml_mces_offline_platform_instance.json (3 hunks)
  • metadata-ingestion/tests/integration/lookml/lookml_mces_with_external_urls.json (3 hunks)
  • metadata-ingestion/tests/integration/lookml/refinements_ingestion_golden.json (3 hunks)
  • metadata-ingestion/tests/integration/lookml/test_lookml.py (2 hunks)
Files skipped from review due to trivial changes (16)
  • metadata-ingestion/tests/integration/looker/golden_looker_mces.json
  • metadata-ingestion/tests/integration/looker/golden_test_allow_ingest.json
  • metadata-ingestion/tests/integration/looker/golden_test_external_project_view_mces.json
  • metadata-ingestion/tests/integration/looker/golden_test_file_path_ingest.json
  • metadata-ingestion/tests/integration/looker/golden_test_independent_look_ingest.json
  • metadata-ingestion/tests/integration/looker/golden_test_ingest.json
  • metadata-ingestion/tests/integration/looker/golden_test_ingest_joins.json
  • metadata-ingestion/tests/integration/looker/golden_test_ingest_unaliased_joins.json
  • metadata-ingestion/tests/integration/looker/looker_mces_usage_history.json
  • metadata-ingestion/tests/integration/lookml/expected_output.json
  • metadata-ingestion/tests/integration/lookml/lookml_mces_api_bigquery.json
  • metadata-ingestion/tests/integration/lookml/lookml_mces_api_hive2.json
  • metadata-ingestion/tests/integration/lookml/lookml_mces_offline.json
  • metadata-ingestion/tests/integration/lookml/lookml_mces_offline_platform_instance.json
  • metadata-ingestion/tests/integration/lookml/lookml_mces_with_external_urls.json
  • metadata-ingestion/tests/integration/lookml/refinements_ingestion_golden.json
Additional context used
Ruff
metadata-ingestion/tests/integration/lookml/test_lookml.py

715-716: Use a single if statement instead of nested if statements

(SIM102)

metadata-ingestion/src/datahub/ingestion/source/looker/looker_common.py

1209-1209: Use of functools.lru_cache or functools.cache on methods can lead to memory leaks

(B019)

metadata-ingestion/src/datahub/ingestion/source/looker/lookml_source.py

491-498: Use a single if statement instead of nested if statements

(SIM102)


1531-1533: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


1847-1849: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

Additional comments not posted (7)
metadata-ingestion/tests/integration/lookml/lkml_samples_duplicate_field/dataset_lineages.view.lkml (1)

43-43: Tag Addition Approved

The addition of the softVersion tag to the version dimension is correctly implemented according to LookML syntax.

metadata-ingestion/tests/integration/lookml/duplicate_field_ingestion_golden.json (1)

119-119: View Logic Representation Approved

The addition of the view logic in viewProperties is correctly serialized. Ensure that this JSON structure aligns with the expected schema in your integration tests.

Verification successful

View Logic Representation Approved

The addition of the view logic in viewProperties is correctly serialized. The JSON structure aligns with the expected schema in your integration tests.

  • The viewLogic contains the necessary fields and dimensions as expected.
Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify that the JSON structure aligns with the expected schema.
echo "Run integration tests to ensure the JSON structure is correct."

Length of output: 134



Script:

#!/bin/bash
# Run the integration tests and capture their output to verify the JSON structure.
pytest metadata-ingestion/tests/integration/lookml/

Length of output: 127



Script:

#!/bin/bash
# Verify that the JSON structure aligns with the expected schema using jq.
jq empty metadata-ingestion/tests/integration/lookml/duplicate_field_ingestion_golden.json

Length of output: 92



Script:

#!/bin/bash
# Search for the schema definition in the codebase.
rg 'schema' metadata-ingestion/tests/integration/lookml/

Length of output: 148875



Script:

#!/bin/bash
# Extract schema fields from the duplicate_field_ingestion_golden.json file and compare them with the expected structure.
jq '.[] | .schemaName, .schemaField' metadata-ingestion/tests/integration/lookml/duplicate_field_ingestion_golden.json

Length of output: 260



Script:

#!/bin/bash
# Dump the entire JSON structure to understand its layout and locate the schema fields.
jq '.' metadata-ingestion/tests/integration/lookml/duplicate_field_ingestion_golden.json

Length of output: 16349



Script:

#!/bin/bash
# Extract schema fields from the aspect.json.fields within the SchemaMetadata aspect.
jq '[.[] | select(.aspectName == "viewProperties") | .aspect.json.viewLogic]' metadata-ingestion/tests/integration/lookml/duplicate_field_ingestion_golden.json

Length of output: 1024

metadata-ingestion/tests/integration/lookml/field_tag_ingestion_golden.json (1)

553-559: New Tag Entity Addition Approved

The addition of the softVersion tag entity is correctly implemented. Ensure that the tag data aligns with the expected attributes and values.

Verification successful

New Tag Entity Addition Approved

The addition of the softVersion tag entity is correctly implemented, and the tag data aligns with the expected attributes and values.

  • entityType: "tag"
  • entityUrn: "urn:li:tag:softVersion"
  • changeType: "UPSERT"
  • aspectName: "tagKey"
  • aspect:
    • json:
      • name: "softVersion"
  • systemMetadata:
    • lastObserved: 1586847600000
    • runId: "lookml-test"
    • lastRunId: "no-run-id-provided"
Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify that the tag data aligns with the expected attributes and values.
echo "Run validation checks on the tag data."

Length of output: 86



Script:

#!/bin/bash
# Extract and display the tag data for `softVersion` from the `field_tag_ingestion_golden.json` file for verification.
jq '.[] | select(.entityUrn == "urn:li:tag:softVersion")' metadata-ingestion/tests/integration/lookml/field_tag_ingestion_golden.json

Length of output: 449

metadata-ingestion/tests/integration/looker/looker_mces_golden_deleted_stateful.json (1)

780-780: Deletion of Dashboard Approved

The deletion of the dashboard with urn:li:dashboard:(looker,dashboards.11) is noted. Ensure that this deletion does not affect any dependent systems or integrations.

Verification successful

Deletion of Dashboard Approved

The deletion of the dashboard with urn:li:dashboard:(looker,dashboards.11) is noted. The references found are within test files, indicating that the deletion is being tested. There is no impact on dependent systems or integrations.

  • metadata-ingestion/tests/integration/looker/looker_mces_golden_deleted_stateful.json
  • metadata-ingestion/tests/integration/looker/golden_test_allow_ingest.json
  • metadata-ingestion/tests/integration/looker/golden_looker_mces.json
  • metadata-ingestion/tests/integration/looker/test_looker.py
Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify the impact of the dashboard deletion on dependent systems.
echo "Check for any integrations or systems that depend on this dashboard."

Length of output: 146



Script:

#!/bin/bash
# Search for references to the specific dashboard URN in the codebase
rg 'urn:li:dashboard:\(looker,dashboards\.11\)'

Length of output: 1116

metadata-ingestion/tests/integration/lookml/test_lookml.py (1)

Line range hint 871-891: Well-structured test for duplicate field ingestion.

This test function is well-designed to check the ingestion process for handling duplicate fields. It follows good practices by comparing the output with a golden file to ensure consistency.

metadata-ingestion/src/datahub/ingestion/source/looker/looker_common.py (1)

247-247: Approved the addition of the tags attribute to the ViewField class.

This change effectively supports the ingestion of field-level tags from LookML into DataHub, which is aligned with the PR's objectives.

metadata-ingestion/src/datahub/ingestion/source/looker/lookml_source.py (1)

1093-1093: Ensure correct handling of tags in _get_fields method.

The implementation correctly checks for the presence of tags in field_dict and assigns an empty list if it is absent. This is crucial for avoiding KeyError when tags is not provided.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 36c688a and c28e4f2.

Files selected for processing (3)
  • metadata-ingestion/tests/integration/looker/golden_looker_mces.json (4 hunks)
  • metadata-ingestion/tests/integration/looker/golden_test_independent_look_ingest.json (2 hunks)
  • metadata-ingestion/tests/integration/looker/looker_mces_golden_deleted_stateful.json (6 hunks)
Files skipped from review as they are similar to previous changes (3)
  • metadata-ingestion/tests/integration/looker/golden_looker_mces.json
  • metadata-ingestion/tests/integration/looker/golden_test_independent_look_ingest.json
  • metadata-ingestion/tests/integration/looker/looker_mces_golden_deleted_stateful.json

Copy link
Collaborator

@hsheth2 hsheth2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mostly looks good, just two small things

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between c28e4f2 and 954e544.

Files selected for processing (2)
  • metadata-ingestion/src/datahub/ingestion/source/looker/looker_common.py (3 hunks)
  • metadata-ingestion/src/datahub/ingestion/source/looker/lookml_source.py (1 hunks)
Additional context used
Ruff
metadata-ingestion/src/datahub/ingestion/source/looker/looker_common.py

1204-1204: Use of functools.lru_cache or functools.cache on methods can lead to memory leaks

(B019)

metadata-ingestion/src/datahub/ingestion/source/looker/lookml_source.py

491-498: Use a single if statement instead of nested if statements

(SIM102)


1531-1533: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


1847-1849: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

Additional comments not posted (2)
metadata-ingestion/src/datahub/ingestion/source/looker/looker_common.py (1)

247-247: LGTM! The addition of the tags attribute to ViewField is correct.

The default value of an empty list ensures backward compatibility.

metadata-ingestion/src/datahub/ingestion/source/looker/lookml_source.py (1)

1093-1093: LGTM! But verify the function usage in the codebase.

The code changes are approved.

However, ensure that all function calls to _get_fields are correctly handled with the new tags attribute.

Verification successful

LGTM! The function calls to _get_fields handle the new tags attribute correctly.

The code changes are verified to be correctly integrated, ensuring the tags attribute is always present in the field_dict.

  • lookml_source.py:
    • Lines where _get_fields is called with looker_view dictionary keys.
Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify all function calls to `_get_fields` are correctly handled with the new `tags` attribute.

# Test: Search for the function usage. Expect: Only occurances of the new signature.
rg --type python $'_get_fields'

Length of output: 61



Script:

#!/bin/bash
# Description: Verify all function calls to `_get_fields` are correctly handled with the new `tags` attribute.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg '_get_fields'

Length of output: 983



Script:

#!/bin/bash
# Description: Extract lines around the calls to `_get_fields` in `lookml_source.py` to review the handling of the `tags` attribute.

# Extract 5 lines before and after the calls to `_get_fields`.
rg -C 5 '_get_fields' metadata-ingestion/src/datahub/ingestion/source/looker/lookml_source.py

Length of output: 1315

Comment on lines +565 to +588
field: ViewField, reporter: SourceReport
) -> Optional[GlobalTagsClass]:
if field_type in LookerUtil.type_to_tag_map:
return GlobalTagsClass(
tags=[
schema_field_tags: List[TagAssociationClass] = [
TagAssociationClass(tag=builder.make_tag_urn(tag_name))
for tag_name in field.tags
]

if field.field_type in LookerUtil.type_to_tag_map:
schema_field_tags.extend(
[
TagAssociationClass(tag=tag_name)
for tag_name in LookerUtil.type_to_tag_map[field_type]
for tag_name in LookerUtil.type_to_tag_map[field.field_type]
]
)
else:
reporter.report_warning(
"lookml",
f"Failed to map view field type {field_type}. Won't emit tags for it",
f"Failed to map view field type {field.field_type}. Won't emit tags for measure and dimension",
)
return None

if schema_field_tags:
return GlobalTagsClass(tags=schema_field_tags)

return None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simplify tag extraction logic.

The tag extraction logic can be simplified by combining the user-defined and predefined tags in a single list comprehension.

    @staticmethod
    def _get_tags_from_field_type(
        field: ViewField, reporter: SourceReport
    ) -> Optional[GlobalTagsClass]:
        schema_field_tags: List[TagAssociationClass] = [
            TagAssociationClass(tag=builder.make_tag_urn(tag_name))
            for tag_name in field.tags
        ]

        if field.field_type in LookerUtil.type_to_tag_map:
            schema_field_tags.extend(
                TagAssociationClass(tag=tag_name)
                for tag_name in LookerUtil.type_to_tag_map[field.field_type]
            )
        else:
            reporter.report_warning(
                "lookml",
                f"Failed to map view field type {field.field_type}. Won't emit tags for measure and dimension",
            )

        if schema_field_tags:
            return GlobalTagsClass(tags=schema_field_tags)

        return None
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
field: ViewField, reporter: SourceReport
) -> Optional[GlobalTagsClass]:
if field_type in LookerUtil.type_to_tag_map:
return GlobalTagsClass(
tags=[
schema_field_tags: List[TagAssociationClass] = [
TagAssociationClass(tag=builder.make_tag_urn(tag_name))
for tag_name in field.tags
]
if field.field_type in LookerUtil.type_to_tag_map:
schema_field_tags.extend(
[
TagAssociationClass(tag=tag_name)
for tag_name in LookerUtil.type_to_tag_map[field_type]
for tag_name in LookerUtil.type_to_tag_map[field.field_type]
]
)
else:
reporter.report_warning(
"lookml",
f"Failed to map view field type {field_type}. Won't emit tags for it",
f"Failed to map view field type {field.field_type}. Won't emit tags for measure and dimension",
)
return None
if schema_field_tags:
return GlobalTagsClass(tags=schema_field_tags)
return None
@staticmethod
def _get_tags_from_field_type(
field: ViewField, reporter: SourceReport
) -> Optional[GlobalTagsClass]:
schema_field_tags: List[TagAssociationClass] = [
TagAssociationClass(tag=builder.make_tag_urn(tag_name))
for tag_name in field.tags
]
if field.field_type in LookerUtil.type_to_tag_map:
schema_field_tags.extend(
TagAssociationClass(tag=tag_name)
for tag_name in LookerUtil.type_to_tag_map[field.field_type]
)
else:
reporter.report_warning(
"lookml",
f"Failed to map view field type {field.field_type}. Won't emit tags for measure and dimension",
)
if schema_field_tags:
return GlobalTagsClass(tags=schema_field_tags)
return None

@@ -602,7 +612,7 @@
else f"{field.field_type.value}. {field.description}"
),
globalTags=(
LookerUtil._get_tags_from_field_type(field.field_type, reporter)
LookerUtil._get_tags_from_field_type(field, reporter)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Improve readability with ternary operators.

Use ternary operators to improve readability for the description and globalTags fields.

    @staticmethod
    def view_field_to_schema_field(
        field: ViewField,
        reporter: SourceReport,
        tag_measures_and_dimensions: bool = True,
    ) -> SchemaField:
        return SchemaField(
            fieldPath=field.name,
            type=LookerUtil._get_field_type(field.type, reporter),
            nativeDataType=field.type,
            label=field.label,
            description=(
                field.description
                if tag_measures_and_dimensions
                else f"{field.field_type.value}. {field.description}"
            ),
            globalTags=(
                LookerUtil._get_tags_from_field_type(field, reporter)
                if tag_measures_and_dimensions
                else None
            ),
            isPartOfKey=field.is_primary_key,
        )
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
LookerUtil._get_tags_from_field_type(field, reporter)
LookerUtil._get_tags_from_field_type(field, reporter)
@staticmethod
def view_field_to_schema_field(
field: ViewField,
reporter: SourceReport,
tag_measures_and_dimensions: bool = True,
) -> SchemaField:
return SchemaField(
fieldPath=field.name,
type=LookerUtil._get_field_type(field.type, reporter),
nativeDataType=field.type,
label=field.label,
description=(
field.description
if tag_measures_and_dimensions
else f"{field.field_type.value}. {field.description}"
),
globalTags=(
LookerUtil._get_tags_from_field_type(field, reporter)
if tag_measures_and_dimensions
else None
),
isPartOfKey=field.is_primary_key,
)

@hsheth2 hsheth2 merged commit 3e47496 into datahub-project:master Jul 1, 2024
58 checks passed
yoonhyejin pushed a commit that referenced this pull request Jul 16, 2024
aviv-julienjehannet pushed a commit to aviv-julienjehannet/datahub that referenced this pull request Jul 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ingestion PR or Issue related to the ingestion of metadata
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants