Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding Dataplex Datascan as a resource #7911

Closed
wants to merge 94 commits into from

Conversation

shourya116
Copy link

Adding a new Dataplex resource: Datascan.

If this PR is for Terraform, I acknowledge that I have:

  • Searched through the issue tracker for an open issue that this either resolves or contributes to, commented on it to claim it, and written "fixes {url}" or "part of {url}" in this PR description. If there were no relevant open issues, I opened one and commented that I would like to work on it (not necessary for very small changes).
  • Ensured that all new fields I added that can be set by a user appear in at least one example (for generated resources) or third_party test (for handwritten resources or update tests).
  • Generated Terraform providers, and ran make test and make lint in the generated providers to ensure it passes unit and linter tests.
  • Ran relevant acceptance tests using my own Google Cloud project and credentials (If the acceptance tests do not yet pass or you are unable to run them, please let your reviewer know).
  • Read the Release Notes Guide before writing my release note below.

Release Note Template for Downstream PRs (will be copied)

`google_dataplex_datascan_*`

@modular-magician modular-magician requested a review from megan07 May 10, 2023 06:43
@modular-magician
Copy link
Collaborator

Hello! I am a robot who works on Magic Modules PRs.

I've detected that you're a community contributor. @megan07, a repository maintainer, has been assigned to assist you and help review your changes.

❓ First time contributing? Click here for more details

Your assigned reviewer will help review your code by:

  • Ensuring it's backwards compatible, covers common error cases, etc.
  • Summarizing the change into a user-facing changelog note.
  • Passes tests, either our "VCR" suite, a set of presubmit tests, or with manual test runs.

You can help make sure that review is quick by running local tests and ensuring they're passing in between each push you make to your PR's branch. Also, try to leave a comment with each push you make, as pushes generally don't generate emails.

If your reviewer doesn't get back to you within a week after your most recent change, please feel free to leave a comment on the issue asking them to take a look! In the absence of a dedicated review dashboard most maintainers manage their pending reviews through email, and those will sometimes get lost in their inbox.


@modular-magician modular-magician added the awaiting-approval Pull requests that need reviewer's approval to run presubmit tests label May 10, 2023
@shourya116 shourya116 changed the title Full resource Adding Dataplex Datascan as a resource May 10, 2023
The data source for DataScan.
properties:
- !ruby/object:Api::Type::String
name: 'entity'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we make this required as well, please?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, we plan to additional sources pretty soon.

And then the semantic will be "one-of" entity / new-property.

In this scenario, we shouldn't add required to entity right now, since dropping required later will be a backward-incompatible change, right?

Spec related to how often and when a scan should be triggered. If not specified, the default is OnDemand, which means the scan will not run until the user calls dataScans.run API.
properties:
- !ruby/object:Api::Type::NestedObject
name: 'onDemand'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does onDemand have a nested cron string as well?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

onDemand (and several RuleType fields below are currently an empty message with no properties;
The idea is to keep it as a message so that we can add fields to them later if needed.

User still needs to specify that they want onDemand instead of Scheduled though.


For empty messages in the API, should we just have NestedObject with no property in the YAML?

mmv1/products/dataplex/Datascan.yaml Show resolved Hide resolved
mmv1/products/dataplex/Datascan.yaml Show resolved Hide resolved
mmv1/products/dataplex/Datascan.yaml Show resolved Hide resolved
Whether each value needs to be strictly lesser than ('<') the maximum, or if equality is allowed.
Only relevant if a maxValue has been defined. Default = false.
- !ruby/object:Api::Type::NestedObject
name: 'nonNullExpectation'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we'll need the nested values list here too

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i believe this is an empty message, so it should be fine?

description: |
A regular expression the column value is expected to match.
- !ruby/object:Api::Type::NestedObject
name: 'uniquenessExpectation'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we'll want the nested object defined here as well

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i believe this is an empty message, so it should be fine?

Whether the rule passed or failed.
- !ruby/object:Api::Type::String
name: 'evaluatedCount'
default_values: 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this is output, we shouldn't have a default here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

description: |
The range denoted by values of an incremental field
properties:
- !ruby/object:Api::Type::string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we capitalize string here and with its siblings?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

display_name = "Test Datascan"

data = {
entity = "projects/<%= ctx[:test_env_vars]['project_name'] %>/locations/us-central1/lakes/amandeep-dev-lake/zones/rawzone/entities/austin_311"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we not hardcode these values?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just to make sure I understand, we have to create a datascan against a source

are you suggesting we create a new data source (entity) and then reference that as the entity in the datascan
(or possibly reference a public entity)?


Here, it looks like there's a hard-coded entity, which the tests wont have reference to during automated tests, and hence will fail?

Copy link
Contributor

@saurabh-net saurabh-net left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comments @megan07! I left a few questions.

mmv1/products/dataplex/Datascan.yaml Show resolved Hide resolved
mmv1/products/dataplex/Datascan.yaml Show resolved Hide resolved
mmv1/products/dataplex/Datascan.yaml Show resolved Hide resolved
name: 'threshold'
description: |
The minimum ratio of passing_rows / total_rows required to pass this rule, with a range of [0.0, 1.0]. 0 indicates default value (i.e. 1.0).
default_value: 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(will double check this and update here)

The API behavior is that specifying 0, or not specifying a value at all, both cause the system to interpret the threshold as 1.0.

The API Read behavior however should return whatever the user specified --- i.e. 0 / unspecified.

- !ruby/object:Api::Type::NestedObject
name: 'nonNullExpectation'
description: |
ColumnMap rule which evaluates whether each column value is null.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is another example of an empty message in the API; We may add future fields / properties to this message later, but it's currently empty;

Whether each value needs to be strictly lesser than ('<') the maximum, or if equality is allowed.
Only relevant if a maxValue has been defined. Default = false.
- !ruby/object:Api::Type::NestedObject
name: 'nonNullExpectation'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i believe this is an empty message, so it should be fine?

description: |
A regular expression the column value is expected to match.
- !ruby/object:Api::Type::NestedObject
name: 'uniquenessExpectation'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i believe this is an empty message, so it should be fine?

Whether the rule passed or failed.
- !ruby/object:Api::Type::String
name: 'evaluatedCount'
default_values: 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

description: |
The range denoted by values of an incremental field
properties:
- !ruby/object:Api::Type::string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

display_name = "Test Datascan"

data = {
entity = "projects/<%= ctx[:test_env_vars]['project_name'] %>/locations/us-central1/lakes/amandeep-dev-lake/zones/rawzone/entities/austin_311"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just to make sure I understand, we have to create a datascan against a source

are you suggesting we create a new data source (entity) and then reference that as the entity in the datascan
(or possibly reference a public entity)?


Here, it looks like there's a hard-coded entity, which the tests wont have reference to during automated tests, and hence will fail?

Copy link
Contributor

@saurabh-net saurabh-net left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comments @megan07! I left a few questions.

@modular-magician modular-magician requested a review from megan07 May 17, 2023 17:09
@@ -5,7 +5,7 @@ resource "google_dataplex_datascan" "<%= ctx[:primary_resource_id] %>" {
display_name = "Test Datascan"

data = {
entity = "projects/<%= ctx[:test_env_vars]['project_name'] %>/locations/us-central1/lakes/amandeep-dev-lake/zones/rawzone/entities/austin_311"
entity = "projects/dataplex-clouddq/locations/us-central1/lakes/amandeep-dev-lake/zones/rawzone/entities/austin_311"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm not sure we want to make this change, we want to keep the ctx[:test_env_vars]['project_name'] and probably do something similar with locations, lakes, zones, and entities. Does that make sense? I'm imagining

"projects/<%= ctx[:test_env_vars]['project_name'] %>/locations/ctx[:test_env_vars]['region']/lakes/ctx[:vars]['lake']/zones/ctx[:vars]['zone']/entities/ctx[:vars]['entity']"

where region is defined in the test_env_vars and lake, zone and entity are defined nearby, but in a similar vars block.

Let me know if you have questions around that! Thanks!

@megan07
Copy link
Contributor

megan07 commented May 24, 2023

@saurabh-net (edit: referenced the wrong user, oops!) sorry! i must have missed this comment yesterday in my review.

are you suggesting we create a new data source (entity) and then reference that as the entity in the datascan
(or possibly reference a public entity)?

yes, i supposed if this is a hard-coded public entity, that will suffice. i'm not super familiar with the resource to say which is better, but typically, if possible in terraform, we'd create the entity to reference and then add that here.

Here, it looks like there's a hard-coded entity, which the tests wont have reference to during automated tests, and hence will fail?

that's the concern, we want our tests to be able to access the entity referenced here

@modular-magician modular-magician requested a review from megan07 May 25, 2023 06:52
maphad and others added 5 commits May 25, 2023 13:07
* Configuration for traffic director Mesh resource.

* Add more tests for Mesh resource

* Use new provider for test

* Configuration for service binding.

* Add hand written test for service binding.

* Revert "Add hand written test for service binding."

This reverts commit 367449a.

* Update service binding yaml.

* reduce timeouts.

* Update mmv1/products/networkservices/ServiceBinding.yaml

Co-authored-by: Sam Levenick <[email protected]>

* Service binding update test.

* Fix compile error.

* Mark the Service Binding resource immutable.

* Add configuration for server tls policy.

* Remove extra files.

* Add tests.

* Revert "Remove extra files."

This reverts commit 0791d44.

* Update mmv1/third_party/terraform/tests/resource_network_security_server_tls_policy_test.go.erb

Co-authored-by: Sarah French <[email protected]>

* Address PR comments.

* Server Tls Policy

* Address PR comments.

* Address PR comments.

---------

Co-authored-by: Madhura Phadnis <[email protected]>
Co-authored-by: Sam Levenick <[email protected]>
Co-authored-by: Sarah French <[email protected]>
…oogleCloudPlatform#7858)

* Add ICEBERG as a supported external data format for bigquery tables
* updated descriptions to link api reference
NickElliot and others added 22 commits May 25, 2023 13:07
* Fix path for yaml

* Use config

* Add container to call workflow

* Apply suggestions from code review

Co-authored-by: Scott Suarez <[email protected]>

---------

Co-authored-by: Scott Suarez <[email protected]>
* Move files to tpgiamresource package

* Add docs

* Use type alias
* fixed accesscontextmanager spec requirements

fixes hashicorp/terraform-provider-google#14612

* fixed google_access_context_manager_service_perimeter as well
@megan07
Copy link
Contributor

megan07 commented May 25, 2023

Hi @shourya116 , would you please rebase your changes with main? I think that will make the review a bit easier for me. Thanks!

@saurabh-net
Copy link
Contributor

Hey @megan07 @shuyama1 @shourya116 ,
I'll be taking over this PR at #8010

We can move this PR to draft mode for now.

@rileykarson
Copy link
Member

Superseded by #8010

@rileykarson rileykarson closed this Jul 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting-approval Pull requests that need reviewer's approval to run presubmit tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.