Support gRPC config for agent-service plugin #368
Codecov Report
@@ Coverage Diff @@
## master #368 +/- ##
==========================================
+ Coverage 63.00% 64.36% +1.36%
==========================================
Files 154 154
Lines 13030 10605 -2425
==========================================
- Hits 8209 6826 -1383
+ Misses 4208 3163 -1045
- Partials 613 616 +3
67ef7a7 to f3662b9
@honnix can you elaborate on what is not backwards compatible?
Both |
@pingsutw we released Flyte agents as an experimental feature, right? So there are no guarantees of backwards compatibility. IMO fixing this gRPC configuration issue warrants a backwards-incompatible merge. Thoughts?
Agreed. This feature is tagged as experimental, so let's use this time to break backwards compatibility if needed (and this seems to be one of those times).
@honnix thinking about this, it won't be able to cover the scenario where there are two Flyte agent deployments and each satisfies multiple task types, right? I'm thinking something like:
The current approach would create a separate gRPC connection for each task type, right? It would probably be better to reuse a single connection for each endpoint. Thoughts?
I don't think so. The task type here is only used for routing, and there should be only one gRPC connection per endpoint. This PR shouldn't change how that part works, I think.
So the EndpointForTaskTypes map is keyed on the task type, but the connectionCache is keyed on the endpoint. So you can't have two endpoints configured differently for different task types. For example, differing timeouts will result in unintended behavior:
IIUC the above would only create a single gRPC connection in the background (with timeout of whatever agent was called first) and reuse it for calls to both task types. I don't think this is a huge issue, but I don't like the unintended behavior for future added gRPC params that might be more important.
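To make the concern concrete, here is a minimal Go sketch. It is not the plugin's actual code: the `conn` type, the `connCache` type, and the `getConn` helper are hypothetical stand-ins. The only assumption taken from the thread is that the cache is keyed on the endpoint string while the timeout is supplied per task type.

```go
package main

import (
	"fmt"
	"time"
)

// conn is a hypothetical stand-in for a gRPC client connection and its settings.
type conn struct {
	endpoint string
	timeout  time.Duration
}

// connCache is keyed on the endpoint string only (assumption from the thread).
type connCache map[string]*conn

// getConn returns the cached connection for endpoint. The timeout is only
// applied when the connection is first created, so later callers silently
// inherit the first caller's timeout.
func (c connCache) getConn(endpoint string, timeout time.Duration) *conn {
	if existing, ok := c[endpoint]; ok {
		return existing
	}
	created := &conn{endpoint: endpoint, timeout: timeout}
	c[endpoint] = created
	return created
}

func main() {
	cache := connCache{}
	// Two task types point at the same endpoint with different timeouts.
	foo := cache.getConn("localhost:8088", 1*time.Second)
	bar := cache.getConn("localhost:8088", 2*time.Second)
	fmt.Println(foo == bar, bar.timeout) // prints "true 1s"
}
```

In this sketch the second task type ends up with the 1s timeout even though it asked for 2s, which is the kind of unintended behavior described above.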
@hamersaw I see what you mean. I thought about that but I was not sure per task type endpoint configuration is a real use case or a bit of overthinking. My interpretation of this config is mostly about grouping, e.g. task type |
@hamersaw I did an experiment in d67a015 and now the config looks like:
endpointForTaskTypes:
  foo: agent
  bar: bar_agent
  foo_bar: agent
grpcEndpoints:
  agent:
    endpoint: localhost:8088
    timeouts:
      CreateTask: 1s
  bar_agent:
    endpoint: localhost:8088
    timeouts:
      CreateTask: 2s
And the cache is keyed on the whole endpoint config (as a pointer).
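As a rough illustration of what keying the cache on the endpoint config pointer buys, here is a small Go sketch. The struct and field names (`GrpcEndpoint`, `Config`, `client`, `clientCache`, `clientFor`, `exampleConfig`) are hypothetical and only mirror the YAML keys in this comment; they are not the plugin's real types.

```go
package main

import (
	"fmt"
	"time"
)

// GrpcEndpoint mirrors one entry under grpcEndpoints (illustrative names only).
type GrpcEndpoint struct {
	Endpoint string
	Timeouts map[string]time.Duration
}

// Config mirrors the two top-level keys from the comment.
type Config struct {
	GrpcEndpoints        map[string]*GrpcEndpoint
	EndpointForTaskTypes map[string]string
}

// client is a hypothetical placeholder for a gRPC client.
type client struct {
	cfg *GrpcEndpoint
}

// clientCache is keyed on the endpoint config pointer, not the address string.
type clientCache map[*GrpcEndpoint]*client

// clientFor routes a task type to its named endpoint config and returns the
// client cached for that config, creating it on first use.
func (c clientCache) clientFor(cfg *Config, taskType string) (*client, error) {
	name, ok := cfg.EndpointForTaskTypes[taskType]
	if !ok {
		return nil, fmt.Errorf("no endpoint configured for task type %q", taskType)
	}
	ep, ok := cfg.GrpcEndpoints[name]
	if !ok {
		return nil, fmt.Errorf("unknown endpoint %q for task type %q", name, taskType)
	}
	if cl, found := c[ep]; found {
		return cl, nil
	}
	cl := &client{cfg: ep}
	c[ep] = cl
	return cl, nil
}

// exampleConfig builds the same values as the YAML in the comment.
func exampleConfig() *Config {
	return &Config{
		GrpcEndpoints: map[string]*GrpcEndpoint{
			"agent":     {Endpoint: "localhost:8088", Timeouts: map[string]time.Duration{"CreateTask": time.Second}},
			"bar_agent": {Endpoint: "localhost:8088", Timeouts: map[string]time.Duration{"CreateTask": 2 * time.Second}},
		},
		EndpointForTaskTypes: map[string]string{"foo": "agent", "bar": "bar_agent", "foo_bar": "agent"},
	}
}

func main() {
	cfg := exampleConfig()
	cache := clientCache{}
	foo, _ := cache.clientFor(cfg, "foo")
	fooBar, _ := cache.clientFor(cfg, "foo_bar")
	bar, _ := cache.clientFor(cfg, "bar")
	fmt.Println(foo == fooBar, foo == bar) // prints "true false"
	fmt.Println(foo.cfg.Timeouts["CreateTask"], bar.cfg.Timeouts["CreateTask"]) // prints "1s 2s"
}
```

Under these assumptions, `foo` and `foo_bar` share one client, while `bar` gets its own client with its own timeout even though the address is identical.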
Does it make sense to update the naming here? Ex grpcEndpoints -> agentEndpoints and endpointForTaskTypes -> agentsForTaskTypes or taskTypeAgents?
Also, just want to make sure this doesn't introduce additional confusion. We still need to add all supported plugins to the supportedTaskTypes configuration right?
SGTM, but maybe it's better to double check with the original author of the config section.
Yes that is correct. This PR does not make any change to the semantics of |
Signed-off-by: Hongxin Liang <[email protected]>
@hamersaw I did some experiments in the latest commit. PTAL, thanks!
@honnix Thank you so much
Thank you all for reviewing and providing great comments! No worries about nits at all. If we are gonna ship a breaking change, it's better we do it right.
@pingsutw I tried that in flyteorg/flytesnacks#1073. Also I propose a small fix in #381 for better naming alignment. PTAL. Thank you.
* Support gRPC config for agent-service plugin
* Address comments
* No deadline if timeout is 0
* Per task type grpc endpoint config
* Rename config items according to comments

Signed-off-by: Hongxin Liang <[email protected]>
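The "No deadline if timeout is 0" commit can be pictured with the following Go sketch. This is an assumption about the intended behavior based only on the commit title, not the merged code; `withTimeout` and `hasDeadline` are hypothetical helper names. It uses the standard library `context.WithTimeout`.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// withTimeout returns ctx unchanged (with a no-op cancel) when timeout is 0,
// so the call carries no deadline. Otherwise it derives a context that
// expires after timeout.
func withTimeout(ctx context.Context, timeout time.Duration) (context.Context, context.CancelFunc) {
	if timeout == 0 {
		return ctx, func() {}
	}
	return context.WithTimeout(ctx, timeout)
}

// hasDeadline reports whether a context derived with the given timeout
// carries a deadline.
func hasDeadline(timeout time.Duration) bool {
	ctx, cancel := withTimeout(context.Background(), timeout)
	defer cancel()
	_, ok := ctx.Deadline()
	return ok
}

func main() {
	fmt.Println(hasDeadline(0))               // prints "false"
	fmt.Println(hasDeadline(2 * time.Second)) // prints "true"
}
```

Returning a no-op cancel function keeps the call site uniform: callers can always `defer cancel()` regardless of whether a timeout was configured.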
TL;DR
Support customized gRPC config for agent-service plugin.
Type
Are all requirements met?
Complete description
Support configuring gRPC endpoint with:
Details can be found in flyteorg/flyte#3823
Note that this is a breaking change! I'm not sure how mature the agent-service is and how many users are using it, but if backward compatibility is important in this case, I can go the extra mile to achieve that.
Tracking Issue
Closes flyteorg/flyte#3823
Follow-up issue
NA