[Design Proposal] TEP-0094: Specifying Resource Requirements at Runtime #560
/assign
/assign @wlynch
Like the idea! Just a few follow ups.
/assign
Thanks for the very thorough proposal @lbernick !!
My main thoughts:
- I'd like to use named objects instead of a map of strings
- I'd like to avoid introducing the `*` syntax if we can - using the named objects I think gives us a bit of flexibility here
- I prefer the option where the feature is "container overrides" vs. just "resource overrides", but only supporting resource overloading for now
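For readers following along, the first point can be sketched roughly as below. This is only an illustration of the two shapes being compared: the names `StepOverride`, `name`, and `resources` are hypothetical and not taken from the TEP.

```python
from dataclasses import dataclass, field

# Shape A (hypothetical): a map keyed by step name.
overrides_as_map = {
    "build": {"requests": {"cpu": "1", "memory": "1Gi"}},
}


# Shape B (hypothetical): a list of named objects. New fields could later be
# added to the object without changing the outer list shape.
@dataclass
class StepOverride:
    name: str
    resources: dict = field(default_factory=dict)


overrides_as_list = [
    StepOverride(name="build", resources={"requests": {"cpu": "1", "memory": "1Gi"}}),
]


def to_named_objects(mapping):
    """Convert the map shape into the list-of-named-objects shape."""
    return [StepOverride(name=key, resources=value) for key, value in mapping.items()]
```

Both shapes carry the same information here; the difference is only in how much room each leaves for extra per-step fields.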
[Knative Serving conformance](https://github.com/knative/specs/blob/main/specs/serving/knative-api-specification-1.0.md#container)
but not for
[Tekton pipelines conformance](https://github.com/tektoncd/pipeline/blob/main/docs/api-spec.md#step).
nice find!
instead of having the `Container` API embedded.
However, this would be a major API change at this point,
for little gain compared to the proposed solution.
is it possible to elaborate a bit on the "little gain"?
nit, just spitballing
I'm also not clear on whether this change would solve the problem completely - maybe it makes sense in some cases to specify resource limits at authoring time as well, and you'd still need to override them?
e.g. especially for scenarios within a particular organization, it might make more sense to be able to assume certain resource constraints
Added a few thoughts but I don't have many specific examples. It seems like resource requirements are the primary way this causes friction at the moment.
of modifying resource requirements of catalog `Task`s.
While catalog `Task` owners can add resource requirement parameters to their `Task`s,
this clutters `Task`s, and not all `Task`s may be updated.
nice analysis! 👍
Nice work, thank you!
I have a couple of questions but nothing major / blocking.
About unnamed steps, I initially thought indexes could be an option, but perhaps it's better to avoid that. What we could do is recommend setting a step name for tasks in the catalog, to avoid having tasks in the catalog that cannot be fully tuned.
About overriding the catch-all setting, that's something we could also add later.
/approve
Setting resources for all steps is an interesting and tricky behavior to tackle, and might be confusing to some users. Applying 1CPU/1G to a Task that has 10 steps means the Task will request at least 10CPU/10G (not counting init containers and/or sidecars). It will have to be very well documented that the setting is per-step and that all steps are summed.
Overall looks good to me. I wonder however how much users want this vs a "Task resource request/limits", but I don't think this would get in the way of such a thing in the future.
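The arithmetic in the comment above can be written out as a small sketch. It only mirrors the description given there (one identical request per step, all steps summed, init containers and sidecars ignored); it is not Tekton's actual code, and the `Requests` type and `total_step_requests` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Requests:
    cpu: float        # CPU cores
    memory_g: float   # memory, in the "G" unit used in the comment


def total_step_requests(per_step: Requests, num_steps: int) -> Requests:
    """Sum one identical per-step request across every step of a Task."""
    return Requests(
        cpu=per_step.cpu * num_steps,
        memory_g=per_step.memory_g * num_steps,
    )


# A catch-all of 1 CPU / 1G applied to a Task with 10 steps.
total = total_step_requests(Requests(cpu=1, memory_g=1), num_steps=10)
```

Under these assumptions `total` comes out at 10 CPU and 10G, which is the figure the comment warns about.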
and a mapping of `Sidecar` names to overrides.
This has a similar problem to the one `TaskRunSpec`, … have in `PipelineRun`/`Pipeline` specs: you have to know the "shape" of the task you are using. The "small" problem with that is that you are tied to the Task definition. If the definition is updated and the steps change, your `TaskRun` / `PipelineRun` will fail.
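The coupling described in this comment can be illustrated with a short sketch. It assumes overrides are matched to steps purely by name; the function `unmatched_overrides` and the step names are hypothetical, and whether an unmatched name should fail the run or be ignored is a design decision for the TEP rather than something this sketch settles.

```python
def unmatched_overrides(task_step_names, override_names):
    """Return the override names that do not match any step in the Task."""
    known = set(task_step_names)
    return [name for name in override_names if name not in known]


# The run was written against the old Task definition.
old_task_steps = ["clone", "build"]
overrides = ["build"]

# The Task is later updated and the "build" step is renamed.
new_task_steps = ["clone", "compile"]

before = unmatched_overrides(old_task_steps, overrides)
after = unmatched_overrides(new_task_steps, overrides)
```

With the old definition every override finds its step; after the rename, the override for `build` no longer matches anything, which is the situation the comment is concerned about.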
abstractions from their implementation (a `Container`) and allows Tekton full control
over what fields are specified at authoring time vs runtime.
However, this would be a major API change at this point.
Note: this can (and should ?) be done carefully between v1beta1 and v1 if need be (by shrinking what we "show" to the world).
Thanks! definitely agree.
886603f to e4a9ed3
LGTM, just some small naming bikeshedding
thank you @lbernick 🎉
/hold I looked into tektoncd/pipeline#2986 (unrelated to the issues listed in this proposal) last week and realized that the way we address that issue may affect our design for runtime configuration for resources. I'm going to update the problem statement for this TEP to address that issue as well.
This commit expands the scope of TEP-0094 to cover the user experience of specifying resource requests and limits in Tasks. Focusing only on Step and Sidecar resource requirements may be too narrow of a scope for this TEP. This is largely motivated by tektoncd/pipeline#2986, because the solution to this problem may involve removing the ability to specify Step resource requests. It doesn't make sense to override Step resource requests in TaskRuns if users shouldn't be able to specify Step resource requests in the first place. The scope is also expanded to include parameterizing resource requests based on discussion in tektoncd#560, around treating resource requirement parameterization and runtime overrides as "both/and", rather than "either/or". Fixing Task's resource requirement UX may allow us to get parameterization for free.
re-scoping in #588
Going to re-open this proposal as the behavior that led me to rescope has already been addressed -- see #588 (comment) for more detail. /hold cancel
Add design details and alternative solutions for updating the `TaskRun` API to allow users to specify resource requirements of `Task` `Step`s and `Sidecar`s. These changes apply to both one-shot `TaskRun`s and those launched via `PipelineRun`s.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: afrittoli, bobcatfish, jerop, vdemeester, wlynch
/lgtm
/kind tep