-
Notifications
You must be signed in to change notification settings - Fork 53
Add BigQuery plugin #161
Add BigQuery plugin #161
Conversation
Codecov Report
@@ Coverage Diff @@
## master #161 +/- ##
==========================================
+ Coverage 60.00% 60.36% +0.36%
==========================================
Files 131 135 +4
Lines 6966 7292 +326
==========================================
+ Hits 4180 4402 +222
- Misses 2361 2447 +86
- Partials 425 443 +18
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
650beec
to
02d23f7
Compare
b626f25
to
4b94f6f
Compare
// For standard SQL queries, this flag is ignored and large results are | ||
// always allowed. However, you must still set destinationTable when | ||
// result size exceeds the allowed maximum response size. | ||
AllowLargeResults bool `json:"allowLargeResults,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be input to the execution or task definition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ohh nevermind, i see you are using this as a json input
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a great question :), there is no way to interact between task inputs and these fields yet. One option can be to support go templates in some of the fields, for instance: custom: { query: "{{ .inputs.query }} }" }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we want to model this as a protobuf (plugins) so that we can get the protobufs built automatically?
} | ||
|
||
if createError.Code >= http.StatusInternalServerError { | ||
return core.PhaseInfoFailed(pluginsCore.PhaseRetryableFailure, systemExecutionError, taskInfo) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
func PhaseInfoSystemRetryableFailure(code, reason string, info *TaskInfo) PhaseInfo { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can change, but there is no similar method for non-retryable failures, that loses symmetry in the code and makes it less readable (in my opinion).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we do have a method for non-retryable errors. Just the word retryable
is dropeed -
func PhaseInfoSystemFailure(code, reason string, info *TaskInfo) PhaseInfo { |
options := []option.ClientOption{ | ||
option.WithScopes("https://www.googleapis.com/auth/bigquery"), | ||
// FIXME how do I access current version? | ||
option.WithUserAgent(fmt.Sprintf("%s/%s", "flytepropeller", "LATEST")), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably put this in the context...
Didn't have much time to work on it. I'm going through the comments. There is an issue with changing to using Kubernetes client from controller-runtime. It doesn't support sub-resources, and TokenRequest is a sub-resource to ServiceAccount. See kubernetes-sigs/controller-runtime#172. We have REST client that was originally used on top-level initialization code in flytepropeller, so we can potentially propagate it all the way through, or we can keep it as is. As for now, the second option seems more feasible not to bloat this pull request. |
Shall we discuss this today? |
@kanterov Would love to help push this over the boundary. How should we do this? |
5b37cd8
to
b5ebb0f
Compare
@@ -10,7 +10,7 @@ download_tooling: #download dependencies (including test deps) for the package | |||
|
|||
.PHONY: lint | |||
lint: download_tooling #lints the package for common code smells | |||
GL_DEBUG=linters_output,env golangci-lint run --deadline=5m --exclude deprecated -v | |||
GL_DEBUG=linters_output,env golangci-lint run --deadline=7m --exclude deprecated -v |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this needs an update in boilerplate and it will automatically percolate.
But do you need this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did that because linting was timing out. I think it's because of added dependencies. Let me try again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments, more like nits
Just one major thing, the config for the plugin. Do we want that to be a protobuf? Or how is the java sdk sending this object today
@kumare3 we have implemented dataclass for Python and auto-value class for Java/Scala. I can also write protobuf for it, but it isn't clear if it worth it. SDK code is already typed, and on wire it becomes proto struct. |
Add default implementation for token exchange for GCP. Signed-off-by: Gleb Kanterov <[email protected]>
Signed-off-by: Gleb Kanterov <[email protected]>
Signed-off-by: Gleb Kanterov <[email protected]>
@kumare3 for protobuf, I started to write it, it will take some time, because there are many nested objects are imported from dependencies. One thing that I noticed is that currently go structs use camelCase for naming, while protobuf messages use snake_case. I'm not sure it's possible to change how protobuf serializes into structs. Underlying Google API uses camelCase, while proto convention is snake_case. Do you think we can use camelCase in proto? |
@kanterov that's ok, task template has versioning, so that you can easily choose the implementation. I see that there is too much work. Let's go ahead with what you have. This might be a different example. I also have another idea, we can get the proto file from the go struct or use open api |
Congrats on merging your first pull request! 🎉 |
Signed-off-by: Gleb Kanterov <[email protected]>
TL;DR
Implements WebAPI plugin for BigQuery.
Type
Are all requirements met?
Complete description
Adds implementation of token exchange mechanism to get Google credentials using workload identity. Implement BigQuery plugin, for now, only to create QUERY jobs, COPY/EXTRACT/LOAD jobs are going to be parts of subsequent pull requests.
Tracking Issue
flyteorg/flyte#817
Follow-up issue
NA