-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prevent race condition in flush methods #799
Conversation
@@ -112,6 +113,12 @@ func (s *Spooler) Stop() { | |||
|
|||
// flush flushes all event and sends them to the publisher | |||
func (s *Spooler) flush() { | |||
|
|||
// Prevents race in flush method | |||
var mutex = &sync.Mutex{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If flush()
creates a new Mutex on each invocation it can't really protect against a data race. The mutex should be part of the Spooler.
I'm not sure what piece of data is causing the race. The line numbers in the trace didn't help much because it says line 140 but the file doesn't even have 140 lines (weird?). If the data race is caused by multiple goroutines reading and writing to/from s.nextFlushTime
then the mutex will need to be locked around the read inside the Run()
method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andrewkroh Makes sense. I now made the mutex part of the object. Could you try if that solved it the problem? If not, I will modify it again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will check this PR's AppVeyor test output for Filebeat to confirm that the data race is fixed. Just have to wait ⌚ for AppVeyor to finished.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The data race still exists. http://pastebin.com/raw/1vrS2Xj7 (from the appveyor logs)
70c642e
to
1665cf7
Compare
@@ -51,7 +51,7 @@ build: $(GOFILES) | |||
# Create test coverage binary | |||
.PHONY: buildbeat.test | |||
buildbeat.test: $(GOFILES) | |||
go test -c -covermode=atomic -coverpkg ${GOPACKAGES_COMMA_SEP} | |||
go test -c -race -covermode=atomic -coverpkg ${GOPACKAGES_COMMA_SEP} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This slows down make test
a bit. For example, in Packetbeat, make test
takes 119 seconds compared to 74 in master (best out of two). It's not a huge increase but I'm thinking we should have a follow up to make system tests be able to run with the non-testing binary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't even attend it to add it as part of this PR. I suggest I make it a variable which is off by default, but for example enabled for our daily builds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I already made the -race
off by default and it can be enabled via aRACE_DETECTOR=1
environment variable. See #789 or the commit andrewkroh@0aac404.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will remove the change from this PR as it will be done in #789
Prevent race condition in flush methods
No description provided.