Improve model service autoscaling #3496

Open
achimnol opened this issue Jan 17, 2025 — with Lablup-Issue-Syncer · 0 comments


achimnol commented Jan 17, 2025

Motivation  

Let’s follow up on some loose ends left over from https://lablup.atlassian.net/browse/BA-96

Expected Sub Issue

  1. Define clearer priority semantics for the case where multiple rules trigger at the same time. Currently only the first matched rule is evaluated, but when multiple rules observe different metrics, they all need to be evaluated in a single iteration and their results must be combined in some way.
    1. We could consider stricter validation of the autoscaling rules for a single endpoint. For instance, allow only a single pair of increasing/decreasing rules per metric. We could then group the rules by metric, evaluate each group simultaneously, and prioritize their results using a configured order.
  2. Support additional aggregation operators (e.g., min, max) when collecting metrics from multiple replica sessions and kernels, as currently only “average” is available.
    1. As with idle checkers, we need to consider time-based, windowed metric smoothing.
    2. Users would want a GUI to see the current metrics.
  3. Record explicit, user-queryable audit logs of scaling decisions.
  4. Consider adding an endpoint-level cool-down, in addition to those of individual rules.
  5. Allow disabling a specific autoscaling rule without deleting it.
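
To make items 1, 2, and 5 more concrete, here is a rough sketch of how grouped evaluation could work. It is only an illustration under assumptions: `Rule`, `AGGREGATORS`, `decide`, and the metric names are hypothetical and do not reflect the existing code. It assumes rules are grouped by metric, each rule picks its own aggregation operator, every group is evaluated in the same iteration, and a configured metric order decides which result wins.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Mapping, Sequence

# Hypothetical aggregation operators applied to per-replica samples.
AGGREGATORS: dict[str, Callable[[Sequence[float]], float]] = {
    "avg": mean,
    "min": min,
    "max": max,
}


@dataclass(frozen=True)
class Rule:
    """Hypothetical autoscaling rule; field names are assumptions."""
    metric: str
    threshold: float
    comparator: str          # "gt" or "lt"
    step: int                # replica delta, e.g. +1 or -1
    aggregator: str = "avg"
    enabled: bool = True     # disabled rules stay stored but are skipped

    def matches(self, value: float) -> bool:
        if self.comparator == "gt":
            return value > self.threshold
        return value < self.threshold


def decide(
    rules: Sequence[Rule],
    samples: Mapping[str, Sequence[float]],
    metric_priority: Sequence[str],
) -> int:
    """Return the replica delta for one iteration (0 means no change)."""
    # Evaluate every metric group first, so that no rule is skipped
    # merely because another rule matched earlier.
    results: dict[str, int] = {}
    for rule in rules:
        values = samples.get(rule.metric)
        if not rule.enabled or not values or rule.metric in results:
            continue
        if rule.matches(AGGREGATORS[rule.aggregator](values)):
            results[rule.metric] = rule.step
    # Combine the per-metric results using the configured order.
    for metric in metric_priority:
        if metric in results:
            return results[metric]
    return 0
```

With a scale-in rule matching on one metric and a scale-out rule matching on another, the outcome then depends only on the configured order rather than on which rule happens to be stored first. Recording the `results` mapping per iteration could also serve as raw material for the audit log in item 3.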