Make UDP-receiver/operator asynchronous & concurrent #27613

Closed
hzahav opened this issue Oct 11, 2023 · 5 comments

hzahav commented Oct 11, 2023

Component(s)

pkg/stanza, receiver/udplog

Is your feature request related to a problem? Please describe.

TL;DR: In high-scale scenarios, the UDP receiver suffers bursts of data loss because it works synchronously on a single thread.

In high-scale scenarios, it's easy to lose UDP packets. If the receiving side slows down even for a very short time (for example, the otel-exporter sends data to an endpoint that briefly takes longer to respond), the sender gets no indication of it (unlike with TCP) and keeps sending data at the same rate.
During that time, the receiver's network buffer fills up and data is lost. A short burst of more data than usual causes data loss for the same reason.
This happens in part because the current UDP receiver works synchronously, so if the exporter slows down even briefly in a high-scale scenario, packets are dropped.

Describe the solution you'd like

The UDP receiver (more accurately, the udp input operator in stanza [stanza\operator\input\udp]) needs to process logs asynchronously to reduce data loss and increase the processing rate. That's important for high-rate scenarios.
Code is already ready for PR, btw.

Our stress tests indicate that changing the UDP stanza input operator to use 2 goroutines solved the continuous data-loss issues (and also increased the processing rate of the otel collector).
1. 1st goroutine ('reader') only reads from UDP and puts the data into a channel - no processing is done there at all (no splitting, adding attributes, etc.).
2. 2nd goroutine ('processor') reads from that channel, performs the processing offered by the UDP operator, and pushes the result to the next otel step (in our case, a batch processor).
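
The two-goroutine split described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `packetSource`, `runPipeline` and `fakeSource` are hypothetical names, and the fake source stands in for a real UDP socket so the example runs without a network.

```go
package main

import (
	"fmt"
	"io"
)

// packetSource is a hypothetical stand-in for the UDP socket. A real operator
// would read from a net.PacketConn instead.
type packetSource func() ([]byte, error)

// runPipeline starts a 'reader' goroutine that only copies packets into a
// buffered channel, and runs the 'processor' loop on the calling goroutine.
// It returns once the source reports an error and the channel is drained.
func runPipeline(read packetSource, process func([]byte), queueSize int) {
	packets := make(chan []byte, queueSize)

	// Reader: no splitting, no attributes, just read and enqueue.
	go func() {
		defer close(packets)
		for {
			buf, err := read()
			if err != nil {
				return
			}
			packets <- buf
		}
	}()

	// Processor: all per-packet work happens here, off the read path.
	for p := range packets {
		process(p)
	}
}

// fakeSource yields n packets and then io.EOF (illustration only).
func fakeSource(n int) packetSource {
	i := 0
	return func() ([]byte, error) {
		if i >= n {
			return nil, io.EOF
		}
		i++
		return []byte(fmt.Sprintf("log line %d", i)), nil
	}
}

func main() {
	count := 0
	runPipeline(fakeSource(1000), func(b []byte) { count++ }, 128)
	fmt.Println("processed", count)
}
```

The channel capacity (`128` here, an arbitrary value) is the knob that lets the reader keep draining the socket while the processor is briefly slow.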

It's better to add concurrency to the mix (for example, allow the 'processor' to run with 5 goroutines), since our tests indicate it improved the processing rate even further. The internal processing in the udp receiver can be somewhat involved, since it includes splitting and adding attributes, so some consumers may benefit from several 'processor' routines working concurrently before the data is sent downstream.
This would require a graceful shutdown mechanism: stop the 'reader' routine from reading more items from the UDP port, then let the receiver finish handling the items already read and pushed to the channel, so they are still pushed downstream during shutdown.
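
One possible shape for the concurrent processors and the graceful shutdown is sketched below. All names (`asyncReceiver`, `startReceiver`, `Shutdown`, `runDemo`) are hypothetical and the `read` function is a stand-in for the socket; with a real UDP connection the reader would typically also need the socket closed or a read deadline set so that a blocked read returns.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// asyncReceiver is a hypothetical sketch: one reader, N processors.
type asyncReceiver struct {
	packets chan []byte
	stop    chan struct{}
	workers sync.WaitGroup
}

func startReceiver(read func() []byte, process func([]byte), numProcessors, queueSize int) *asyncReceiver {
	r := &asyncReceiver{
		packets: make(chan []byte, queueSize),
		stop:    make(chan struct{}),
	}
	// Reader: stops taking new packets once stop is closed, then closes the
	// queue so the processors know when everything has been drained.
	go func() {
		defer close(r.packets)
		for {
			select {
			case <-r.stop:
				return
			default:
				r.packets <- read()
			}
		}
	}()
	// Processors: several goroutines share the same queue.
	for i := 0; i < numProcessors; i++ {
		r.workers.Add(1)
		go func() {
			defer r.workers.Done()
			for p := range r.packets {
				process(p)
			}
		}()
	}
	return r
}

// Shutdown stops the reader and waits until every queued packet has been
// processed. It must be called only once.
func (r *asyncReceiver) Shutdown() {
	close(r.stop)
	r.workers.Wait()
}

// runDemo reads at least 1000 fake packets, shuts down, and reports how many
// packets were read and how many were processed.
func runDemo(numProcessors int) (int64, int64) {
	var produced, processed int64
	read := func() []byte {
		atomic.AddInt64(&produced, 1)
		return []byte("log line")
	}
	process := func([]byte) { atomic.AddInt64(&processed, 1) }

	r := startReceiver(read, process, numProcessors, 64)
	for atomic.LoadInt64(&produced) < 1000 {
		time.Sleep(time.Millisecond)
	}
	r.Shutdown()
	return atomic.LoadInt64(&produced), atomic.LoadInt64(&processed)
}

func main() {
	produced, processed := runDemo(5)
	fmt.Println("read:", produced, "processed:", processed)
}
```

Because the reader closes the channel only after it has stopped reading, and `Shutdown` waits for the processors to drain it, the two counters match when `runDemo` returns.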

The suggested feature allows the customer to "pay" with available memory (which can be much bigger than the maximum size of the network buffer) to reduce the risk of data loss due to these issues. Of course, this won't help if our otel collector can handle X EPS but consistently gets 1.1X EPS. The intention is only to prevent data loss in scenarios where the otel-collector gets a data rate it's usually able to handle, but experiences short-term latency.
Our tests indicate that using more goroutines here (2+) didn't have a major effect on overall CPU usage (though there was a small one, obviously). Again, it should be the consumer's choice to "pay" with more CPU to reduce the risk of data loss.
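
To make the memory trade-off concrete, here is a rough back-of-the-envelope calculation. The function names and all the numbers (queue length, arrival rate, packet size) are made up for illustration and are not measurements from our tests.

```go
package main

import (
	"fmt"
	"time"
)

// stallBudget returns how long the processors can stall before a queue of
// queueLen packets fills up, assuming a steady arrival rate.
func stallBudget(queueLen int, packetsPerSecond float64) time.Duration {
	return time.Duration(float64(queueLen) / packetsPerSecond * float64(time.Second))
}

// queueBytes is the worst-case memory held by the queue if every slot
// contains a packet of maxPacketBytes.
func queueBytes(queueLen, maxPacketBytes int) int {
	return queueLen * maxPacketBytes
}

func main() {
	// Hypothetical figures: 100,000 queued packets, 50,000 packets/s, 8 KiB each.
	fmt.Println(stallBudget(100_000, 50_000))              // 2s
	fmt.Println(queueBytes(100_000, 8192)/(1<<20), "MiB") // 781 MiB
}
```

In this example the queue absorbs a stall of about two seconds at the cost of at most roughly 781 MiB, which is the kind of choice the consumer would be making.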

Describe alternatives you've considered

  • Reduce the scale of data being sent to each instance - we ran stress tests to find the limits of the otel-collector in our environment, so we know not to send too much data to each instance. Let's say the limit is X. Even when we send 0.8X or 0.7X, we still get data loss from time to time. Sometimes up to 3-4% of the data is just not received by the UDP receiver. Our metrics indicate that the network buffer on those nodes got full and, as a result, dropped the data. If we dramatically reduce the data being sent to 0.4X, we get data loss much more rarely, but this is a significant waste of resources and not reasonable in very high-scale scenarios.
  • Increase batch-processor max items - didn't help. We're already using a pretty big number.
  • Add concurrency to the custom otel-exporter - didn't solve the issue. We're already running with 15+ goroutines there and seem to have reached maximum optimization.
  • Increase the node's network buffer - we increased (x100) all the relevant kernel buffers (netdev_max_backlog, rmem_max, rmem_default, etc.). While it did somewhat reduce data loss, it didn't solve the issue, and increasing them further didn't help. Note that there's a limit to increasing those variables, as these are kernel buffers, so we can't treat them like "regular" memory.
  • Persistent-queue helper (https://github.com/open-telemetry/opentelemetry-collector/blob/main/exporter/exporterhelper/README.md#persistent-queue) - this wouldn't help here. If the endpoint still accepts data from our exporter but does so a bit more slowly for a few seconds, that alone causes data loss in high-scale scenarios (as our tests indicate).
  • General tips for high-rate UDP processing - we tried most of the steps described here (https://blog.cloudflare.com/how-to-receive-a-million-packets/), but they didn't solve the problem completely.
  • Add more pods running otel-collector instances on the same node (the network buffer is per node, not per pod) - didn't solve the problem, and sometimes worsened it.
  • Run 2 separate pipelines (both reading from the same UDP port) on the same otel-collector instance - not possible, since otel doesn't allow 2 receivers to be configured to read from the same port.
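
For reference on the kernel-buffer alternative above: besides the node-level sysctls, a Go program can request a larger receive buffer for a single socket. The sketch below uses the standard `net` package; `listenUDPWithBuffer` is a hypothetical helper, not something the udp operator is known to expose. On Linux the kernel caps the effective size at `net.core.rmem_max`, which is why this approach has the same ceiling as the sysctl tuning.

```go
package main

import (
	"fmt"
	"net"
)

// listenUDPWithBuffer opens a UDP socket and asks the OS for a receive buffer
// of bufBytes. The OS may grant less than requested without returning an error.
func listenUDPWithBuffer(addr string, bufBytes int) (*net.UDPConn, error) {
	udpAddr, err := net.ResolveUDPAddr("udp", addr)
	if err != nil {
		return nil, err
	}
	conn, err := net.ListenUDP("udp", udpAddr)
	if err != nil {
		return nil, err
	}
	if err := conn.SetReadBuffer(bufBytes); err != nil {
		conn.Close()
		return nil, err
	}
	return conn, nil
}

func main() {
	// Port 0 lets the OS pick a free port; 8 MiB is an arbitrary example size.
	conn, err := listenUDPWithBuffer("127.0.0.1:0", 8<<20)
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	defer conn.Close()
	fmt.Println("listening on", conn.LocalAddr())
}
```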

Additional context

We have a scenario that requires our otel collector to process high-scale data that's read from a UDP port.
Along with the UDP receiver we have an otel-batch-processor, and our otel-exporter sends the logs over the network (after they are compressed). Our custom otel-exporter is maximally optimized (including using lots of concurrent channels, putting as much data as possible in each network request, compressing, etc.).

hzahav added the enhancement (New feature or request) and needs triage (New item requiring triage) labels Oct 11, 2023

hovavza commented Oct 11, 2023

Created PR - #27620

hovavza commented Oct 12, 2023

Created PR - #27620

Closed previous PR - will add the changes gradually.
1st step is in following PR - #27647

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

github-actions bot added the Stale label Dec 18, 2023
@djaglowski

Closing as resolved by #27647 and #28901
