
Add command to measure latency from SNEWS messages #15

Merged

Conversation

@myNameIsPatrick (Collaborator) commented Oct 5, 2020

Description

This PR adds a new command, snews latency, to measure the latency produced by the various Kafka topics. Example:

snews latency --env-file dev-config.env -v --measurement observation

This will listen to a topic and compute the current latency as well as a running average over N points (configurable).

One side effect of this work: I found that tuning the batch_size of stream.read down to 1 lowers the latency considerably compared to before.
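
For illustration only, here is a rough sketch of the measurement idea (not the code in this PR). It assumes a hop-client style consumer whose read() forwards batch_size to adc-streaming, and messages that carry a sent_time field in Unix seconds; the topic URL and field name are placeholders for this example.

```python
# Illustrative sketch only -- not the implementation added in this PR.
# Assumptions: read() forwards batch_size to adc-streaming, and each
# message carries a "sent_time" field in Unix seconds.
from collections import deque
from datetime import datetime, timezone

from hop import stream

WINDOW = 100  # N points for the running average (configurable in the real command)
latencies = deque(maxlen=WINDOW)

with stream.open("kafka://hostname:port/observation-topic", "r") as consumer:
    # batch_size=1 hands over each message as soon as it arrives rather than
    # waiting to fill a batch, which is what keeps the measured latency low.
    for message in consumer.read(batch_size=1):
        now = datetime.now(timezone.utc).timestamp()
        latency = now - message["sent_time"]
        latencies.append(latency)
        mean = sum(latencies) / len(latencies)
        print(f"latency: {latency:.3f} s  (mean over {len(latencies)}: {mean:.3f} s)")
```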

@bfc5288 (Collaborator) left a comment

@myNameIsPatrick This looks pretty good to me. I have a few questions:

  • Is the intended behavior to display latency only when -v is specified?
  • When computing the mean latency, it looks like only messages received after starting snews latency are tracked -- is this desired? (and/or is it possible/desired to include previous messages in the computation?)

Other comments:

  • I looked for more info about batch_size -- you probably already saw this, but here is info from adc-streaming for bookkeeping. For this instance, I think it makes sense to keep batch_size at 1 here, but we should keep our eyes on this and batch_timeout when we're considering future latency requirements in other applications.
  • Maybe for a later iteration of this command, it'd be nice to find latency filtered by detector in addition to measurement

@myNameIsPatrick (Collaborator, Author)

Thanks for the review @bfc5288.

Is the intended behavior to display latency only when -v is specified?

Yeah, it seems a bit silly to be honest, but I wanted the behavior to be consistent across all the various snews tasks, so the default is no output. The alternative would be to change which logging messages are emitted by default (INFO vs. WARNING), but that would differ from the other tasks, and INFO seems like the right log level for these messages. If this were extended to push latency metrics to something like influx, I wouldn't necessarily want logging messages printed by default. So I don't know what's best here.
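
For concreteness, the pattern is roughly this (a hypothetical sketch of how -v could map to log levels, not the actual snews code):

```python
# Hypothetical sketch: commands default to WARNING so they stay quiet,
# and -v raises the level to INFO, where the latency lines are logged.
import logging

def configure_logging(verbose: bool) -> None:
    logging.basicConfig(level=logging.INFO if verbose else logging.WARNING)

logger = logging.getLogger("snews.latency")
logger.info("latency: 0.512 s")  # visible only when -v is passed
```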

When computing the mean latency, it looks like only messages received after starting snews latency are tracked -- is this desired? (and/or is it possible/desired to include previous messages in the computation?)

Yeah, that's intentional. The latency is computed using the current time (now) as the reference timestamp. If you allowed reading messages from some time in the past, the calculated latency wouldn't make sense anymore.
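
As a toy illustration (the timestamps here are made up for the example):

```python
# Toy example: latency is measured against "now", so replaying a message
# from an hour ago reports ~3600 s regardless of how quickly the broker
# actually delivered it.
import time

sent_time = time.time() - 3600   # hypothetical message published an hour ago
latency = time.time() - sent_time
print(f"{latency:.0f} s")        # ~3600 s: message age, not pipeline delay
```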

I looked for more info about batch_size -- you probably already saw this, but here is info from adc-streaming for bookkeeping. For this instance, I think it makes sense to keep batch_size at 1 here, but we should keep our eyes on this and batch_timeout when we're considering future latency requirements in other applications.

Yeah, definitely. Looking at this, I realize we don't explicitly document any of these keyword arguments that get passed directly through to adc-streaming in hop-client. These should be documented with guidelines on how to tune them for different types of applications. A higher batch_size is more efficient when working through earlier messages in the stream, but it isn't the best choice for minimizing latency when the data rate is low.
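
To sketch what that guidance could look like (placeholders only: the topic URL and handle() are made up, and batch_timeout as a timedelta follows adc-streaming's convention as far as I understand it):

```python
# Sketch of the tuning tradeoff, not documented guidance.
# Assumes read() forwards batch_size/batch_timeout to adc-streaming.
from datetime import timedelta

from hop import stream

def handle(message):
    ...  # placeholder for application-specific processing

# Latency-sensitive, low data rate (e.g. snews latency):
with stream.open("kafka://hostname:port/observation-topic", "r") as consumer:
    for message in consumer.read(batch_size=1):
        handle(message)

# Throughput-oriented, e.g. working through earlier messages:
with stream.open("kafka://hostname:port/observation-topic", "r") as consumer:
    for message in consumer.read(batch_size=500, batch_timeout=timedelta(seconds=1)):
        handle(message)
```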

Maybe for a later iteration of this command, it'd be nice to find latency filtered by detector in addition to measurement.

Agreed.

@bfc5288 (Collaborator) commented Oct 6, 2020

Yeah, it seems a bit silly to be honest, but I wanted the behavior to be consistent across all the various snews tasks, so the default is no output. The alternative would be to change which logging messages are emitted by default (INFO vs. WARNING), but that would differ from the other tasks, and INFO seems like the right log level for these messages. If this were extended to push latency metrics to something like influx, I wouldn't necessarily want logging messages printed by default. So I don't know what's best here.

That makes sense. The only reason I'd consider making latency an exception to the no-print-output default is that, in this case, the result of snews latency would exist only in the logs and may appear to the user as though it's doing nothing (whereas model and generate put their results both in the logs and in a topic somewhere).

But I agree that the best solution isn't clear at the moment. Unless you have other thoughts on printing the output, I'm fine with keeping it just in the logs, and we can definitely revisit this if there's a use case that needs something else.

Yeah, definitely. Looking at this, I realize we don't explicitly document any of these keyword arguments that get passed directly through to adc-streaming in hop-client. These should be documented with guidelines on how to tune them for different types of applications. A higher batch_size is more efficient when working through earlier messages in the stream, but it isn't the best choice for minimizing latency when the data rate is low.

Agreed; I added a note on this for future bookkeeping https://github.com/orgs/scimma/projects/3#card-46795004.

Maybe for a later iteration of this command, it'd be nice to find latency filtered by detector in addition to measurement.

Agreed.

Great -- I made a card/issue on this that we can address sometime after this PR.

@myNameIsPatrick merged commit f30ec70 into SNEWS2:master on Oct 6, 2020