- Define clearer priority semantics when multiple rules are triggered at the same time. Currently only the first matched rule is evaluated; when multiple rules observe different metrics, they need to be evaluated in a single iteration and their results combined (see the rule-grouping sketch after this list).
- Consider a more rigorous validation of the autoscaling rules of a single endpoint. For instance, only a single pair of increasing/decreasing rules may exist for a single metric. Then we could group the rules by metric, evaluate each group at once, and prioritize their results using a configured order.
- Support additional aggregation operators (e.g., min, max) when collecting metrics from multiple replica sessions and kernels; currently only "average" is available (see the aggregation sketch below).
- Like the idle checkers, consider time-based, windowed metric smoothing (see the smoothing sketch below).
- Provide a GUI for users to see the current metrics.
- Leave user-queryable, explicit audit logs of the scaling decisions.
- Consider adding an endpoint-level cool-down in addition to the per-rule cool-downs (see the cool-down sketch below).
- Allow disabling a specific autoscaling rule without deleting it.
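As a rough illustration of the first two items, the sketch below groups rules by the metric they observe, evaluates each group once per iteration, and resolves conflicting decisions via a configured priority order. This is a minimal sketch only; the names (`AutoScalingRule`, `ScaleAction`, `metric_name`, `comparator`, etc.) are hypothetical and do not reflect the actual Backend.AI schema.

```python
# Minimal sketch: grouped, single-pass rule evaluation with priority-based conflict resolution.
# All types and field names here are hypothetical, not the real Backend.AI models.
from __future__ import annotations

import enum
from dataclasses import dataclass
from typing import Mapping, Sequence


class ScaleAction(enum.Enum):
    SCALE_UP = 1
    SCALE_DOWN = -1
    NOOP = 0


@dataclass
class AutoScalingRule:
    metric_name: str   # e.g., "cuda.util" (hypothetical)
    comparator: str    # "gt" or "lt"
    threshold: float
    action: ScaleAction

    def evaluate(self, value: float) -> ScaleAction:
        triggered = value > self.threshold if self.comparator == "gt" else value < self.threshold
        return self.action if triggered else ScaleAction.NOOP


def evaluate_rules(
    rules: Sequence[AutoScalingRule],
    current_metrics: Mapping[str, float],
    metric_priority: Sequence[str],
) -> ScaleAction:
    """Evaluate all rules in a single iteration, grouped by metric.

    Each metric group yields at most one decision; decisions from different
    groups are resolved by the configured metric priority order.
    """
    # Group rules by the metric they observe.
    groups: dict[str, list[AutoScalingRule]] = {}
    for rule in rules:
        groups.setdefault(rule.metric_name, []).append(rule)

    # Evaluate each group and remember its first non-NOOP decision.
    decisions: dict[str, ScaleAction] = {}
    for metric_name, group in groups.items():
        value = current_metrics.get(metric_name)
        if value is None:
            continue
        for rule in group:
            action = rule.evaluate(value)
            if action is not ScaleAction.NOOP:
                # Assumes validation enforces at most one increasing/decreasing pair per metric.
                decisions[metric_name] = action
                break

    # Resolve conflicts across metric groups using the configured priority order.
    for metric_name in metric_priority:
        if metric_name in decisions:
            return decisions[metric_name]
    return ScaleAction.NOOP
```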
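For the aggregation item, a small pluggable operator table could replace the hard-coded average. The operator names and the `aggregate_replica_metric` helper below are assumptions for illustration, not an existing API.

```python
# Illustrative sketch: a pluggable aggregation table instead of a hard-coded average.
import statistics
from typing import Callable, Sequence

AGGREGATORS: dict[str, Callable[[Sequence[float]], float]] = {
    "avg": statistics.fmean,
    "min": min,
    "max": max,
}


def aggregate_replica_metric(values: Sequence[float], op: str = "avg") -> float:
    """Aggregate a metric collected from multiple replica sessions/kernels."""
    if not values:
        raise ValueError("no metric samples collected from replicas")
    try:
        return AGGREGATORS[op](values)
    except KeyError:
        raise ValueError(f"unsupported aggregation operator: {op}") from None
```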
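For the smoothing item, a time-windowed moving average similar in spirit to the idle checkers might look like the sketch below; the class name, window length, and data structure are assumptions.

```python
# Sketch of time-based, windowed metric smoothing; the API and defaults are assumptions.
from __future__ import annotations

import time
from collections import deque


class WindowedAverage:
    """Keep (timestamp, value) samples and average only those inside the window."""

    def __init__(self, window_sec: float = 60.0) -> None:
        self.window_sec = window_sec
        self._samples: deque[tuple[float, float]] = deque()

    def add(self, value: float, now: float | None = None) -> None:
        now = time.monotonic() if now is None else now
        self._samples.append((now, value))
        self._evict(now)

    def value(self, now: float | None = None) -> float | None:
        now = time.monotonic() if now is None else now
        self._evict(now)
        if not self._samples:
            return None
        return sum(v for _, v in self._samples) / len(self._samples)

    def _evict(self, now: float) -> None:
        # Drop samples older than the window.
        while self._samples and now - self._samples[0][0] > self.window_sec:
            self._samples.popleft()
```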
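For the cool-down item, an endpoint-level check layered on top of the per-rule cool-downs could look roughly like this; the state fields and function name are assumptions, not the actual schema.

```python
# Rough sketch of an endpoint-level cool-down layered over per-rule cool-downs.
# Field names and durations are assumptions.
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class CooldownState:
    last_rule_triggered_at: dict[str, datetime] = field(default_factory=dict)
    last_endpoint_scaled_at: datetime | None = None


def may_scale(
    state: CooldownState,
    rule_id: str,
    rule_cooldown: timedelta,
    endpoint_cooldown: timedelta,
    now: datetime | None = None,
) -> bool:
    """Allow scaling only if both the rule-level and endpoint-level cool-downs have expired."""
    now = now or datetime.now(timezone.utc)
    last_rule = state.last_rule_triggered_at.get(rule_id)
    if last_rule is not None and now - last_rule < rule_cooldown:
        return False
    if (
        state.last_endpoint_scaled_at is not None
        and now - state.last_endpoint_scaled_at < endpoint_cooldown
    ):
        return False
    return True
```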
Motivation
Let's follow up on some incomplete corners of https://lablup.atlassian.net/browse/BA-96.
Expected Sub Issues