
Add alternative prometheus remote write implementation to prometheusremotewriteexporter #37284

Open
mattdurham opened this issue Jan 17, 2025 · 8 comments

Comments

@mattdurham

Component(s)

exporter/prometheusremotewrite

Is your feature request related to a problem? Please describe.

For Grafana Alloy I have been working on an alternative approach to the prometheus remote write WAL.

Describe the solution you'd like

Would it be reasonable to add this new remote write implementation to prometheusremotewriteexporter? The config options would be significantly different from the current WAL config, so only one or the other could be declared. It's currently experimental in Alloy, but we have been using it internally, alongside several other users.
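As a rough illustration, the two backends could be made mutually exclusive with something like the following. This is a hypothetical sketch only; none of these type or field names are the exporter's actual config.

```go
// Hypothetical sketch; not the exporter's real configuration types.
package sketch

import "errors"

// LegacyWALConfig stands in for the existing WAL settings.
type LegacyWALConfig struct {
	Directory string `mapstructure:"directory"`
}

// WALQueueConfig stands in for whatever settings the grafana/walqueue backend would need.
type WALQueueConfig struct {
	Directory   string `mapstructure:"directory"`
	Parallelism int    `mapstructure:"parallelism"`
}

// Config exposes both blocks as optional pointers so that at most one is set.
type Config struct {
	WAL      *LegacyWALConfig `mapstructure:"wal"`
	WALQueue *WALQueueConfig  `mapstructure:"wal_queue"`
}

// Validate rejects configurations that declare both backends at once.
func (c *Config) Validate() error {
	if c.WAL != nil && c.WALQueue != nil {
		return errors.New("wal and wal_queue are mutually exclusive; configure only one")
	}
	return nil
}
```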

We haven't had any issues scaling it to a few million series writes per second. We are also looking into support for the new remote write format, which would only require updating the library, along with general support.

The underlying code can be found at https://github.com/grafana/walqueue

Describe alternatives you've considered

No response

Additional context

No response

@mattdurham added the enhancement (New feature or request) and needs triage (New item requiring triage) labels on Jan 17, 2025

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@dashpole
Contributor

A few questions/points:

  • What are the benefits of the new WAL compared to the existing WAL implementation? We should spell those out clearly.
  • We should not maintain multiple WAL implementations long-term. So if we plan to adopt a new WAL, we should plan for it to replace the existing WAL implementation.
  • The above means we need a migration plan. If someone updates their collector version without changing configuration, we should not drop telemetry stored in the "legacy" WAL.
  • I think the WAL was initially implemented because we needed in-order points and the exporter helper's persistent queue doesn't preserve order.
    • Given Prometheus has started supporting OOO samples, do we still need a WAL? (I think the answer is yes, but we should think this through.)
    • Should the WAL be added to exporterhelper?

The effort to add support for PRW 2.0 is being worked on by @jmichalek132. I would sync up with him on that. If we can keep that effort decoupled from the WAL implementation, that would be preferable.

@dashpole removed the needs triage (New item requiring triage) label on Jan 17, 2025
@mattdurham
Author

I was trying to benchmark the existing WAL implementation against the Grafana walqueue. I ran into several issues getting it to work properly, and I'm trying to figure out whether it's my tests or underlying code issues. If I turn off the WAL, the benchmark runs fine. My quick-and-dirty code is here. It's not optimized for OTel ingestion but is likely good enough for a rough comparison.

I saw some issues where it seems to either continually send the same data when no new data is written and/or reset the index. I will spend some time to see if I can get an apples-to-apples comparison.

I will write up a longer-form comparison as I dig into the code, covering some of the differences and pros/cons.

Thank you for the response!

@mattdurham
Author

This might be worthy of a separate thread, but I noticed that in exportThenFrontTruncateWAL, after the request is exported, retrieveWALIndices closes and reopens the WAL and then sets the indices. This appears to cause issues. I added a print statement and can see it roll the rIndex back to a previous number, which feels wrong: rIndex was 11 and is now 10.
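To illustrate the symptom with a contrived sketch (this is not the exporter's actual code, just the pattern I think I'm seeing):

```go
// Contrived illustration: if the in-memory read index is re-seeded from a
// persisted value that lags behind it, the index moves backwards on reopen
// and already-exported entries get read again.
package sketch

import "fmt"

type walIndices struct {
	persistedRead uint64 // last read index that was flushed to disk
	rIndex        uint64 // in-memory read position
}

// reopen simulates re-reading the indices after a close/open cycle.
func (w *walIndices) reopen() {
	w.rIndex = w.persistedRead
}

func Demo() {
	w := &walIndices{persistedRead: 10, rIndex: 11}
	fmt.Println("rIndex before reopen:", w.rIndex) // 11
	w.reopen()
	fmt.Println("rIndex after reopen:", w.rIndex) // 10, rolled back
}
```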

@mattdurham
Author

mattdurham commented Jan 20, 2025

Pros and Cons of the Grafana WAL Queue

Pros

  • Alloy uses it and will transition to it as the default Prometheus remote write. This means it will be kept up to date, heavily used, and well tested, and it is used within Grafana internally, meaning bugs, issues, and CVEs will be handled quickly.
  • It is performant and handles large scale; for instance, elastic scaling based on metrics, incoming flow, and outgoing flow is going to be added soon.
  • Things like OOO samples are handled natively, along with native histograms, though I am unsure yet whether that maps.
  • It does not live inside Alloy itself but in a separate repository, to keep dependencies minimal.

Cons

  • It's another dependency.
  • The dependency requires manual updating.
  • It's more complex; hopefully that is hidden behind a simple API layer, but underneath it is harder to follow.
  • Config mapping is always a pain.

Overview

The WAL Queue is built to be a more generic Prometheus metric replacement. The previous Prometheus WAL has tight coupling between the scrape, WAL, and remote write that makes certain use cases hard. It also has memory-growth issues at high cardinality. The WAL Queue is meant to avoid those problems in a way not too different in concept from the currently employed queue. It uses a push-based rather than polling-based system to reduce CPU, but fundamentally it writes to a disk queue and then deserializes that disk queue.
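As a simplified sketch of the push-based idea (not the walqueue API, just the concept of signaling the reader instead of polling):

```go
// Simplified sketch of a push-based disk queue: the writer appends and nudges
// a channel, so the reader wakes only when there is new data instead of
// waking on a timer.
package sketch

import "sync"

type diskQueue struct {
	mu      sync.Mutex
	entries [][]byte      // stands in for on-disk segments
	notify  chan struct{} // signaled on every append
}

func newDiskQueue() *diskQueue {
	return &diskQueue{notify: make(chan struct{}, 1)}
}

// Push persists an entry and signals the reader without blocking.
func (q *diskQueue) Push(b []byte) {
	q.mu.Lock()
	q.entries = append(q.entries, b)
	q.mu.Unlock()
	select {
	case q.notify <- struct{}{}:
	default: // a wakeup is already pending
	}
}

// Pop blocks until an entry is available, then returns the oldest one.
func (q *diskQueue) Pop() []byte {
	for {
		q.mu.Lock()
		if len(q.entries) > 0 {
			b := q.entries[0]
			q.entries = q.entries[1:]
			q.mu.Unlock()
			return b
		}
		q.mu.Unlock()
		<-q.notify // sleep until the writer signals; no polling loop
	}
}
```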

There is support for multiple underlying data formats. The disk writes are wrapped in an envelope format that adds a metadata dictionary for things like file format, compression, number of records, and other fields. This means the on-disk file format or compression config can be changed and it will still work with whatever is already written on disk, since the metadata determines how to deserialize each record.
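For illustration only, the envelope idea looks roughly like this (the actual on-disk format is different; this just shows dispatching the decoder on the metadata):

```go
// Hypothetical envelope sketch: each record carries self-describing metadata
// so the reader can decode it even if the configured format or compression
// has changed since it was written.
package sketch

import (
	"encoding/json"
	"fmt"
)

type envelope struct {
	Meta map[string]string `json:"meta"` // e.g. "format", "compression", "record_count"
	Data []byte            `json:"data"`
}

func encode(format, compression string, records int, payload []byte) ([]byte, error) {
	return json.Marshal(envelope{
		Meta: map[string]string{
			"format":       format,
			"compression":  compression,
			"record_count": fmt.Sprint(records),
		},
		Data: payload,
	})
}

// decode reads the metadata first and picks the decoder from it, so records
// written under an older config remain readable.
func decode(b []byte) ([]byte, error) {
	var e envelope
	if err := json.Unmarshal(b, &e); err != nil {
		return nil, err
	}
	switch e.Meta["compression"] {
	case "", "none":
		return e.Data, nil
	default:
		return nil, fmt.Errorf("unsupported compression %q", e.Meta["compression"])
	}
}
```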

It is written to minimize allocations, memory, and CPU pressure, and fully supports:

  • TTL
  • Replayability
  • Several Auth Schemes
  • Autoscaling (Coming soon)

In addition, it handles writes very similarly to the traditional Prometheus remote write in terms of 5xx responses, 429s, the Retry-After header, round-robin dialing, and so on.
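That retry behavior has roughly the following shape (a simplified sketch, not the actual implementation):

```go
// Simplified sketch of the retry policy described above: retry on 429 and 5xx,
// honor the Retry-After header (seconds form) when present, otherwise back off
// exponentially; other 4xx responses are treated as permanent failures.
package sketch

import (
	"bytes"
	"fmt"
	"net/http"
	"strconv"
	"time"
)

func sendWithRetry(client *http.Client, url string, body []byte, maxRetries int) error {
	backoff := time.Second
	var lastErr error
	for attempt := 0; attempt <= maxRetries; attempt++ {
		resp, err := client.Post(url, "application/x-protobuf", bytes.NewReader(body))
		if err == nil {
			resp.Body.Close()
			switch {
			case resp.StatusCode < 300:
				return nil // delivered
			case resp.StatusCode == http.StatusTooManyRequests || resp.StatusCode >= 500:
				// Retryable; prefer the server-provided Retry-After delay.
				if s := resp.Header.Get("Retry-After"); s != "" {
					if secs, convErr := strconv.Atoi(s); convErr == nil {
						backoff = time.Duration(secs) * time.Second
					}
				}
				lastErr = fmt.Errorf("retryable status %d", resp.StatusCode)
			default:
				return fmt.Errorf("permanent status %d, dropping batch", resp.StatusCode)
			}
		} else {
			lastErr = err
		}
		time.Sleep(backoff)
		backoff *= 2
	}
	return lastErr
}
```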

I am personally more than happy to be involved with the updating and maintenance of the dependency, configuration and issues related to it.

@mattdurham
Author

For specific answers:

  1. I think we can transition relatively easily, either via a new config section or possibly by mapping the config along with some new optional configuration. If need be, we can read the old WAL and import it into the new one, since it stores prompb.WriteRequest payloads (see the sketch after this list).
  2. A WAL is good for handling network downtime and restart replayability, so it is still needed.
  3. I'm unsure what you mean about the exporter helper; are you speaking of the prometheus exporter?
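A hypothetical sketch of that import path; readLegacyEntries and the Append interface are made-up names, not real APIs. The only real piece is that the legacy WAL stores encoded prompb.WriteRequest payloads:

```go
// Hypothetical migration sketch: decode the raw entries left in the legacy WAL
// and re-append them to the new backend instead of dropping them.
package sketch

import "github.com/prometheus/prometheus/prompb"

// newBackend stands in for whatever append API the new WAL queue exposes.
type newBackend interface {
	Append(req *prompb.WriteRequest) error
}

// migrateLegacyWAL drains pending entries (oldest first) from the old WAL and
// pushes them into the new backend. readLegacyEntries is assumed to return the
// raw protobuf bytes of each pending entry.
func migrateLegacyWAL(readLegacyEntries func() ([][]byte, error), dst newBackend) error {
	entries, err := readLegacyEntries()
	if err != nil {
		return err
	}
	for _, raw := range entries {
		req := &prompb.WriteRequest{}
		if err := req.Unmarshal(raw); err != nil { // gogo-generated Unmarshal
			return err
		}
		if err := dst.Append(req); err != nil {
			return err
		}
	}
	return nil
}
```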

@ArthurSens
Member

I'm unsure what you mean about the exporter helper; are you speaking of the prometheus exporter?

The PrometheusRemoteWrite exporter uses a library developed in collector-core called exporter-helper to coordinate parallelism, batching, retries, persistence on disk, and other things that collector exporters usually need. But since the exporter-helper couldn't maintain datapoint ordering and Prometheus didn't have OOO support in the past, we were forced to use another WAL implementation (the one we've got today).

Given Prometheus has started supporting OOO samples, do we still need a WAL?

I believe so, yes. Even though Prometheus now supports OOO, the performance is worse than in the usual in-order scenario. As long as exporter-helper doesn't support ordering, using an in-order WAL will be much better for Prometheus.

we should not drop telemetry stored in the "legacy" WAL.

Would it make sense to use feature flags to coordinate the transition? We keep both while we verify that the new one works well, then replace the old one once it's stable?

@dashpole
Contributor

Would it make sense to use feature flags to coordinate the transition? We keep both while we verify that the new one works well, then replace the old one once it's stable?

I could see something like this. If both are enabled, only the new WAL would be written to, and the old one would only be read from. We can leave it in that state for a few releases before removing the old WAL.

I would really like to get input from @Aneurysm9, since he implemented the previous WAL in #7304 and open-telemetry/opentelemetry-collector#3597.
