
queue_depth default not working #1344

Closed · mrkwtz opened this issue Mar 14, 2022 · 3 comments · Fixed by #1345

Comments

@mrkwtz commented Mar 14, 2022

Describe the bug
It seems that the storage.trace.pool.queue_depth default in https://github.com/grafana/tempo/blob/v1.3.2/tempodb/pool/config.go#L12, which should be 10k, is not applied. When we don't set it, it defaults to 200, which causes a lot of `error querying store in Querier.FindTraceByID: queue doesn't have room for X jobs` errors.

This is our tempodb_work_queue_max metric value when we don't override the defaults:

[screenshot: tempodb_work_queue_max without the override]

This is our tempodb_work_queue_max metric value when we set storage.trace.pool.queue_depth explicitly to 10k:

[screenshot: tempodb_work_queue_max with queue_depth set to 10k]

As soon as we set the configuration explicitly, the `error querying store in Querier.FindTraceByID: queue doesn't have room for X jobs` errors disappear.
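For reference, the explicit override is only a few lines of Tempo configuration. A minimal sketch (surrounding storage settings omitted; the value 10000 matches the documented default):

```yaml
# Minimal sketch; the rest of the storage configuration is omitted.
storage:
  trace:
    pool:
      # Documented default is 10000, but in v1.3.2 the effective
      # default was 200 unless set explicitly.
      queue_depth: 10000
```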

Expected behavior
storage.trace.pool.queue_depth should default to 10k, as documented.

Environment:

  • Tempo version: 1.3.2
  • Infrastructure: Kubernetes
  • Deployment tool: Helm
@joe-elliott
Copy link
Member

Thanks for the issue! The actual default config is set here:
https://github.com/grafana/tempo/blob/main/modules/storage/config.go#L83

However, our docs say the defaults are the values from the code you linked:
https://github.com/grafana/tempo/blob/main/docs/tempo/website/configuration/_index.md?plain=1#L643

So we need to either fix the defaults or fix the docs. Do you have strong feelings either way about what the defaults should be?
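For anyone tracing the mismatch, the pattern looks roughly like the sketch below. This is a hypothetical illustration, not Tempo's actual code; only queue_depth and the values 10000/200 come from this issue. The point is that a documented default living in one constructor never takes effect when the parent module builds the config with its own literal:

```go
package main

import "fmt"

// Hypothetical sketch of how a documented default can be shadowed.
// Only queue_depth and the values 10000/200 come from this issue;
// all other names are illustrative.

const defaultQueueDepth = 10000 // the documented default (cf. tempodb/pool/config.go)

type PoolConfig struct {
	QueueDepth int `yaml:"queue_depth"`
}

// NewPoolConfig carries the documented default...
func NewPoolConfig() *PoolConfig {
	return &PoolConfig{QueueDepth: defaultQueueDepth}
}

// ...but the parent module constructs the config itself with a
// different literal, so NewPoolConfig is never consulted
// (cf. modules/storage/config.go).
func registerStorageDefaults() *PoolConfig {
	return &PoolConfig{QueueDepth: 200}
}

func main() {
	cfg := registerStorageDefaults()
	fmt.Println("effective queue_depth:", cfg.QueueDepth) // prints 200, not 10000
}
```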

@mrkwtz (Author) commented Mar 14, 2022

Ahh, good to know :) I would vote for fixing the defaults. As I said, with the default of 200 we continuously run into the errors mentioned above, and I would say our setup is not that big. If a larger queue_depth does not cause any overhead, I guess 10k should be fine.

@joe-elliott
Member

Yeah, I agree that default has not aged well. I appreciate you bringing this to our attention. The fix is in!
