Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add user-configurable overrides module #2543

Merged
merged 41 commits into from
Jul 18, 2023

Conversation

yvrhdn
Copy link
Member

@yvrhdn yvrhdn commented Jun 6, 2023

What this PR does:
Introduces a new optional feature: user-configurable overrides. Goal is to allow tenants to change a subset of overrides using an API. Changes will be limited to forwarders and later several metrics_generator_... overrides.

Architecture

For every tenant we will store an overrides.json file in a bucket. These files will be stored in a separate bucket from the backend.

Overrides bucket
├── 1
│   └── overrides.json
└── 2
    └── overrides.json

This bucket is regularly polled and a copy of the limits is kept in-memory. When requesting the overrides for a tenant, the overrides module will now:

  1. check this override is set in the user-configurable overrides, if so return that value
  2. check if this override is set in the runtime config (configmap), if so return that value
  3. return the default values

API

GET /api/overrides
{
  forwarders: ["k6-cloud-insights-staging"],
}
PUT /api/overrides
{
  forwarders: ["k6-cloud-insights-staging"],
}
DELETE /api/overrides

Later we might consider adding a PATCH endpoint to perform partial updates/deletes.

Which issue(s) this PR fixes:
Fixes # New feature!

Checklist

@electron0zero electron0zero force-pushed the kvrhdn/user-configurable-overrides branch from 427039b to 0f989dc Compare June 14, 2023 13:43
@electron0zero electron0zero force-pushed the kvrhdn/user-configurable-overrides branch from 0f989dc to 157b596 Compare June 15, 2023 21:47
electron0zero and others added 6 commits June 29, 2023 02:51
- clean up integration tests for overrides
- rename ReloadInterval -> PollInterval
- linting
@electron0zero electron0zero marked this pull request as ready for review July 4, 2023 14:05
@electron0zero electron0zero self-requested a review July 4, 2023 14:05
Koenraad Verheyden added 2 commits July 13, 2023 14:50
@yvrhdn yvrhdn requested review from joe-elliott and zalegrala July 13, 2023 13:18
Copy link
Contributor

@zalegrala zalegrala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. A few TODOs which we discussed and can follow up in future PRs.

Copy link
Member

@joe-elliott joe-elliott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One comment on why I still think we should move the overrides client into pkg/httpclient, but I'll leave that up to you. Your call!

@yvrhdn yvrhdn merged commit 3b618cc into grafana:main Jul 18, 2023
@yvrhdn yvrhdn deleted the kvrhdn/user-configurable-overrides branch July 18, 2023 10:52
@yvrhdn yvrhdn added backport-r105 type/bug Something isn't working and removed backport-r105 labels Jul 18, 2023
@yvrhdn
Copy link
Member Author

yvrhdn commented Jul 18, 2023

Not a bug but that's what the backport job expects 🤷🏻

@github-actions
Copy link
Contributor

Hello @kvrhdn!
Backport pull requests need to be either:

  • Pull requests which address bugs,
  • Urgent fixes which need product approval, in order to get merged,
  • Docs changes.

Please, if the current pull request addresses a bug fix, label it with the type/bug label.
If it already has the product approval, please add the product-approved label. For docs changes, please add the type/docs label.
If the pull request modifies CI behaviour, please add the type/ci label.
If none of the above applies, please consider removing the backport label and target the next major/minor release.
Thanks!

github-actions bot pushed a commit that referenced this pull request Jul 18, 2023
* Add user-configurable overrides module

* Add /api/overrides and fix crash on boot

* Add overridesHandler and WriteStatusRuntimeConfig

* return json and only return overrides for the tenant

* Implement delete

* Fix test I think?

* Fix tests

* clean up handler and TODOs

* Add tests for overridesHandler

* Add e2e test

* Refactor:
- clean up integration tests for overrides
- rename ReloadInterval -> PollInterval
- linting

* Linting

* address more Linting

* fix lint and add test for PATCH

* fix lint error unparam

* remove todo

* use tenantLimits as return type

* Add prometheus.Collector to overrides.Interface

* Test tempo_overrides_user_configurable_overrides_fetch_total metric in e2e tests

* Sprinkle in some tracing

* Update CHANGELOG.md

* Rename loop to running for consistency

* Have mux handle method routing; split up GET, POST and DELETE handlers

* Use built in contains

* Split up user-configurable overrides manager, api and backend client

* Move overrides API to httpclient

* Clean up, linting, fmt

* Remove version field from json

* Address review comments

* If overrides.json does not exist, properly delete it from cache

* Add config warning for conflicting storage

* Check in my tests as well

* Simplify API handler, return 404 on overrides not found

* Typo, linting, fix test

* Use backend constants

---------

Co-authored-by: Suraj Nath <[email protected]>
(cherry picked from commit 3b618cc)
yvrhdn pushed a commit that referenced this pull request Jul 18, 2023
* Add user-configurable overrides module

* Add /api/overrides and fix crash on boot

* Add overridesHandler and WriteStatusRuntimeConfig

* return json and only return overrides for the tenant

* Implement delete

* Fix test I think?

* Fix tests

* clean up handler and TODOs

* Add tests for overridesHandler

* Add e2e test

* Refactor:
- clean up integration tests for overrides
- rename ReloadInterval -> PollInterval
- linting

* Linting

* address more Linting

* fix lint and add test for PATCH

* fix lint error unparam

* remove todo

* use tenantLimits as return type

* Add prometheus.Collector to overrides.Interface

* Test tempo_overrides_user_configurable_overrides_fetch_total metric in e2e tests

* Sprinkle in some tracing

* Update CHANGELOG.md

* Rename loop to running for consistency

* Have mux handle method routing; split up GET, POST and DELETE handlers

* Use built in contains

* Split up user-configurable overrides manager, api and backend client

* Move overrides API to httpclient

* Clean up, linting, fmt

* Remove version field from json

* Address review comments

* If overrides.json does not exist, properly delete it from cache

* Add config warning for conflicting storage

* Check in my tests as well

* Simplify API handler, return 404 on overrides not found

* Typo, linting, fix test

* Use backend constants

---------

Co-authored-by: Suraj Nath <[email protected]>
(cherry picked from commit 3b618cc)

Co-authored-by: Koenraad Verheyden <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/bug Something isn't working
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

5 participants