Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add metrics to groupbytraceprocessor, wait for queue to be drained during shutdown. #1842

Conversation

jpkrohling
Copy link
Member

@jpkrohling jpkrohling commented Sep 24, 2020

Fixed deadlock in groupbytrace processor.

Link to tracking Issue: #1465 and #1811
Testing: unit + manual tests
Documentation: see README.md

Signed-off-by: Juraci Paixão Kröhling [email protected]

@jpkrohling jpkrohling force-pushed the jpkrohling/1465-shutdown-behavior-for-groupbyprocessor branch from ea29c21 to 2ccd3d8 Compare September 24, 2020 13:41
@codecov
Copy link

codecov bot commented Sep 24, 2020

Codecov Report

Merging #1842 into master will increase coverage by 0.00%.
The diff coverage is 95.23%.

Impacted file tree graph

@@           Coverage Diff            @@
##           master    #1842    +/-   ##
========================================
  Coverage   91.23%   91.24%            
========================================
  Files         272      273     +1     
  Lines       16263    16383   +120     
========================================
+ Hits        14838    14949   +111     
- Misses        998     1003     +5     
- Partials      427      431     +4     
Impacted Files Coverage Δ
processor/groupbytraceprocessor/storage_memory.go 90.90% <77.27%> (-9.10%) ⬇️
processor/groupbytraceprocessor/event.go 96.61% <96.22%> (-3.39%) ⬇️
processor/groupbytraceprocessor/metrics.go 100.00% <100.00%> (ø)
processor/groupbytraceprocessor/processor.go 100.00% <100.00%> (ø)
service/telemetry.go 82.25% <100.00%> (+0.29%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 645eab3...6d01199. Read the comment docs.

@jpkrohling jpkrohling force-pushed the jpkrohling/1465-shutdown-behavior-for-groupbyprocessor branch 3 times, most recently from 9ecf89f to 793b1e2 Compare September 24, 2020 14:30
@jpkrohling
Copy link
Member Author

The last commit in this PR fixes #1811.

@jpkrohling jpkrohling force-pushed the jpkrohling/1465-shutdown-behavior-for-groupbyprocessor branch from 6277807 to be3e69b Compare September 24, 2020 20:07
@jpkrohling jpkrohling marked this pull request as draft September 25, 2020 09:44
@jpkrohling jpkrohling force-pushed the jpkrohling/1465-shutdown-behavior-for-groupbyprocessor branch 2 times, most recently from c037b65 to 99f90b5 Compare September 25, 2020 11:53
* Drain the queue upon shutdown, with a time limit. Fixes open-telemetry#1465.
* Added metrics to the groupbyprocessor, making it easier to understand what's going on in case of problems. See open-telemetry#1811.
* Changes the in-memory storage to unlock its RLock when the method returns. Fixes open-telemetry#1811.

Link to tracking Issue: open-telemetry#1465 and open-telemetry#1811
Testing: unit + manual tests
Documentation: see README.md

Signed-off-by: Juraci Paixão Kröhling <[email protected]>
@jpkrohling jpkrohling force-pushed the jpkrohling/1465-shutdown-behavior-for-groupbyprocessor branch from 99f90b5 to afcb27a Compare September 25, 2020 12:00
@jpkrohling jpkrohling marked this pull request as ready for review September 25, 2020 12:14
@jpkrohling
Copy link
Member Author

@tigrannajaryan, @bogdandrutu , would one of you please review this, or assign someone to review it? I just discussed this with @chris-smith-zocdoc and he confirmed that this fixes the issue he described. A couple of separate issues were found, but we'll address that after the move of this processor to the -contrib repository.

I would really like to get this merged before the move, so that I have only one thing to move.

@@ -39,9 +37,6 @@ func TestDefaultConfiguration(t *testing.T) {
func TestCreateTestProcessor(t *testing.T) {
c := createDefaultConfig().(*Config)

logger, err := zap.NewDevelopment()
require.NoError(t, err)

params := component.ProcessorCreateParams{
Logger: logger,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not apparent to me what this logger is now

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is defined in the processor_test.go as:

var (
	logger, _ = zap.NewDevelopment()
)

Copy link
Contributor

@rmfitzpatrick rmfitzpatrick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm with a couple nits. Thank you for the very helpful readme.

Signed-off-by: Juraci Paixão Kröhling <[email protected]>
@jpkrohling
Copy link
Member Author

jpkrohling commented Oct 1, 2020

PR updated with the corrected number of buffered events.

@tigrannajaryan tigrannajaryan merged commit 56e2121 into open-telemetry:master Oct 1, 2020
MovieStoreGuy pushed a commit to atlassian-forks/opentelemetry-collector that referenced this pull request Nov 11, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Deadlock in groupbytrace processor Shutdown behavior for groupbyprocessor
3 participants