-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding Dataplex Datascan as a resource #7911
Conversation
Hello! I am a robot who works on Magic Modules PRs. I've detected that you're a community contributor. @megan07, a repository maintainer, has been assigned to assist you and help review your changes. ❓ First time contributing? Click here for more detailsYour assigned reviewer will help review your code by:
You can help make sure that review is quick by running local tests and ensuring they're passing in between each push you make to your PR's branch. Also, try to leave a comment with each push you make, as pushes generally don't generate emails. If your reviewer doesn't get back to you within a week after your most recent change, please feel free to leave a comment on the issue asking them to take a look! In the absence of a dedicated review dashboard most maintainers manage their pending reviews through email, and those will sometimes get lost in their inbox. |
The data source for DataScan. | ||
properties: | ||
- !ruby/object:Api::Type::String | ||
name: 'entity' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we make this required as well, please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, we plan to additional sources pretty soon.
And then the semantic will be "one-of" entity / new-property.
In this scenario, we shouldn't add required to entity right now, since dropping required later will be a backward-incompatible change, right?
Spec related to how often and when a scan should be triggered. If not specified, the default is OnDemand, which means the scan will not run until the user calls dataScans.run API. | ||
properties: | ||
- !ruby/object:Api::Type::NestedObject | ||
name: 'onDemand' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does onDemand have a nested cron
string as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
onDemand (and several RuleType fields below are currently an empty message with no properties;
The idea is to keep it as a message so that we can add fields to them later if needed.
User still needs to specify that they want onDemand instead of Scheduled though.
For empty messages in the API, should we just have NestedObject with no property in the YAML?
Whether each value needs to be strictly lesser than ('<') the maximum, or if equality is allowed. | ||
Only relevant if a maxValue has been defined. Default = false. | ||
- !ruby/object:Api::Type::NestedObject | ||
name: 'nonNullExpectation' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we'll need the nested values list here too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i believe this is an empty message, so it should be fine?
description: | | ||
A regular expression the column value is expected to match. | ||
- !ruby/object:Api::Type::NestedObject | ||
name: 'uniquenessExpectation' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we'll want the nested object defined here as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i believe this is an empty message, so it should be fine?
Whether the rule passed or failed. | ||
- !ruby/object:Api::Type::String | ||
name: 'evaluatedCount' | ||
default_values: 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if this is output, we shouldn't have a default here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
mmv1/products/dataplex/Datascan.yaml
Outdated
description: | | ||
The range denoted by values of an incremental field | ||
properties: | ||
- !ruby/object:Api::Type::string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we capitalize string
here and with its siblings?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
display_name = "Test Datascan" | ||
|
||
data = { | ||
entity = "projects/<%= ctx[:test_env_vars]['project_name'] %>/locations/us-central1/lakes/amandeep-dev-lake/zones/rawzone/entities/austin_311" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we not hardcode these values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just to make sure I understand, we have to create a datascan against a source
are you suggesting we create a new data source (entity) and then reference that as the entity in the datascan
(or possibly reference a public entity)?
Here, it looks like there's a hard-coded entity, which the tests wont have reference to during automated tests, and hence will fail?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the comments @megan07! I left a few questions.
mmv1/products/dataplex/Datascan.yaml
Outdated
name: 'threshold' | ||
description: | | ||
The minimum ratio of passing_rows / total_rows required to pass this rule, with a range of [0.0, 1.0]. 0 indicates default value (i.e. 1.0). | ||
default_value: 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(will double check this and update here)
The API behavior is that specifying 0, or not specifying a value at all, both cause the system to interpret the threshold as 1.0.
The API Read behavior however should return whatever the user specified --- i.e. 0 / unspecified.
- !ruby/object:Api::Type::NestedObject | ||
name: 'nonNullExpectation' | ||
description: | | ||
ColumnMap rule which evaluates whether each column value is null. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is another example of an empty message in the API; We may add future fields / properties to this message later, but it's currently empty;
Whether each value needs to be strictly lesser than ('<') the maximum, or if equality is allowed. | ||
Only relevant if a maxValue has been defined. Default = false. | ||
- !ruby/object:Api::Type::NestedObject | ||
name: 'nonNullExpectation' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i believe this is an empty message, so it should be fine?
description: | | ||
A regular expression the column value is expected to match. | ||
- !ruby/object:Api::Type::NestedObject | ||
name: 'uniquenessExpectation' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i believe this is an empty message, so it should be fine?
Whether the rule passed or failed. | ||
- !ruby/object:Api::Type::String | ||
name: 'evaluatedCount' | ||
default_values: 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
mmv1/products/dataplex/Datascan.yaml
Outdated
description: | | ||
The range denoted by values of an incremental field | ||
properties: | ||
- !ruby/object:Api::Type::string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
display_name = "Test Datascan" | ||
|
||
data = { | ||
entity = "projects/<%= ctx[:test_env_vars]['project_name'] %>/locations/us-central1/lakes/amandeep-dev-lake/zones/rawzone/entities/austin_311" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just to make sure I understand, we have to create a datascan against a source
are you suggesting we create a new data source (entity) and then reference that as the entity in the datascan
(or possibly reference a public entity)?
Here, it looks like there's a hard-coded entity, which the tests wont have reference to during automated tests, and hence will fail?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the comments @megan07! I left a few questions.
@@ -5,7 +5,7 @@ resource "google_dataplex_datascan" "<%= ctx[:primary_resource_id] %>" { | |||
display_name = "Test Datascan" | |||
|
|||
data = { | |||
entity = "projects/<%= ctx[:test_env_vars]['project_name'] %>/locations/us-central1/lakes/amandeep-dev-lake/zones/rawzone/entities/austin_311" | |||
entity = "projects/dataplex-clouddq/locations/us-central1/lakes/amandeep-dev-lake/zones/rawzone/entities/austin_311" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm not sure we want to make this change, we want to keep the ctx[:test_env_vars]['project_name']
and probably do something similar with locations, lakes, zones, and entities. Does that make sense? I'm imagining
"projects/<%= ctx[:test_env_vars]['project_name'] %>/locations/ctx[:test_env_vars]['region']/lakes/ctx[:vars]['lake']/zones/ctx[:vars]['zone']/entities/ctx[:vars]['entity']"
where region
is defined in the test_env_vars
and lake
, zone
and entity
are defined nearby, but in a similar vars
block.
Let me know if you have questions around that! Thanks!
@saurabh-net (edit: referenced the wrong user, oops!) sorry! i must have missed this comment yesterday in my review.
yes, i supposed if this is a hard-coded public entity, that will suffice. i'm not super familiar with the resource to say which is better, but typically, if possible in terraform, we'd create the entity to reference and then add that here.
that's the concern, we want our tests to be able to access the entity referenced here |
* Configuration for traffic director Mesh resource. * Add more tests for Mesh resource * Use new provider for test * Configuration for service binding. * Add hand written test for service binding. * Revert "Add hand written test for service binding." This reverts commit 367449a. * Update service binding yaml. * reduce timeouts. * Update mmv1/products/networkservices/ServiceBinding.yaml Co-authored-by: Sam Levenick <[email protected]> * Service binding update test. * Fix compile error. * Mark the Service Binding resource immutable. * Add configuration for server tls policy. * Remove extra files. * Add tests. * Revert "Remove extra files." This reverts commit 0791d44. * Update mmv1/third_party/terraform/tests/resource_network_security_server_tls_policy_test.go.erb Co-authored-by: Sarah French <[email protected]> * Address PR comments. * Server Tls Policy * Address PR comments. * Address PR comments. --------- Co-authored-by: Madhura Phadnis <[email protected]> Co-authored-by: Sam Levenick <[email protected]> Co-authored-by: Sarah French <[email protected]>
…oogleCloudPlatform#7858) * Add ICEBERG as a supported external data format for bigquery tables * updated descriptions to link api reference
…dPlatform#7895) Co-authored-by: James Edouard <[email protected]>
* Fix path for yaml * Use config * Add container to call workflow * Apply suggestions from code review Co-authored-by: Scott Suarez <[email protected]> --------- Co-authored-by: Scott Suarez <[email protected]>
* Move files to tpgiamresource package * Add docs * Use type alias
* fixed accesscontextmanager spec requirements fixes hashicorp/terraform-provider-google#14612 * fixed google_access_context_manager_service_perimeter as well
Hi @shourya116 , would you please |
Hey @megan07 @shuyama1 @shourya116 , We can move this PR to draft mode for now. |
Superseded by #8010 |
Adding a new Dataplex resource: Datascan.
If this PR is for Terraform, I acknowledge that I have:
make test
andmake lint
in the generated providers to ensure it passes unit and linter tests.Release Note Template for Downstream PRs (will be copied)