Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent race condition in flush methods #799

Merged
merged 1 commit into from
Jan 25, 2016

Conversation

ruflin
Copy link
Contributor

@ruflin ruflin commented Jan 21, 2016

No description provided.

@@ -112,6 +113,12 @@ func (s *Spooler) Stop() {

// flush flushes all event and sends them to the publisher
func (s *Spooler) flush() {

// Prevents race in flush method
var mutex = &sync.Mutex{}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If flush() creates a new Mutex on each invocation it can't really protect against a data race. The mutex should be part of the Spooler.

I'm not sure what piece of data is causing the race. The line numbers in the trace didn't help much because it says line 140 but the file doesn't even have 140 lines (weird?). If the data race is caused by multiple goroutines reading and writing to/from s.nextFlushTime then the mutex will need to be locked around the read inside the Run() method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andrewkroh Makes sense. I now made the mutex part of the object. Could you try if that solved it the problem? If not, I will modify it again.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will check this PR's AppVeyor test output for Filebeat to confirm that the data race is fixed. Just have to wait ⌚ for AppVeyor to finished.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The data race still exists. http://pastebin.com/raw/1vrS2Xj7 (from the appveyor logs)

@ruflin ruflin force-pushed the flush-race branch 2 times, most recently from 70c642e to 1665cf7 Compare January 22, 2016 08:31
@@ -51,7 +51,7 @@ build: $(GOFILES)
# Create test coverage binary
.PHONY: buildbeat.test
buildbeat.test: $(GOFILES)
go test -c -covermode=atomic -coverpkg ${GOPACKAGES_COMMA_SEP}
go test -c -race -covermode=atomic -coverpkg ${GOPACKAGES_COMMA_SEP}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This slows down make test a bit. For example, in Packetbeat, make test takes 119 seconds compared to 74 in master (best out of two). It's not a huge increase but I'm thinking we should have a follow up to make system tests be able to run with the non-testing binary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't even attend it to add it as part of this PR. I suggest I make it a variable which is off by default, but for example enabled for our daily builds.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I already made the -race off by default and it can be enabled via aRACE_DETECTOR=1 environment variable. See #789 or the commit andrewkroh@0aac404.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will remove the change from this PR as it will be done in #789

andrewkroh added a commit that referenced this pull request Jan 25, 2016
Prevent race condition in flush methods
@andrewkroh andrewkroh merged commit ba80aa2 into elastic:master Jan 25, 2016
@ruflin ruflin deleted the flush-race branch January 25, 2016 12:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants