Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race Condition in job scheduler #507

Closed
jeremyje opened this issue Jun 2, 2023 · 5 comments · Fixed by #512
Closed

Race Condition in job scheduler #507

jeremyje opened this issue Jun 2, 2023 · 5 comments · Fixed by #512
Labels
bug Something isn't working

Comments

@jeremyje
Copy link

jeremyje commented Jun 2, 2023

Race Condition

package internal

import (
	"fmt"
	"sync"
	"testing"
	"time"

	"github.com/go-co-op/gocron"
)

var (
	cronMutex sync.Mutex
)

// AddCronJob enqueues a cron job into the scheduler.
func AddCronJob(scheduler *gocron.Scheduler, f func() error) error {
	jobSpec := scheduler.Cron("0 0 1 * *").SingletonMode().Tag("test").StartImmediately()
	_, jobErr := jobSpec.Do(func() error {
		cronMutex.Lock()
		defer cronMutex.Unlock()

		if err := f(); err != nil {
			return err
		}
		return nil
	})

	return jobErr
}

func TestAddCronJob(t *testing.T) {
	scheduler := gocron.NewScheduler(time.UTC)
	scheduler.SetMaxConcurrentJobs(1, gocron.RescheduleMode)
	defer scheduler.Stop()

	if err := AddCronJob(scheduler, func() error {
		return fmt.Errorf("Angry job")
	}); err != nil {
		t.Fatal(err)
	}

	i := 0
	ch := make(chan bool, 1)
	if err := AddCronJob(scheduler, func() error {
		i++
		ch <- true
		return nil
	}); err != nil {
		t.Fatal(err)
	}

	scheduler.StartAsync()
	<-ch
	scheduler.Stop()

	if i != 1 {
		t.Errorf("counter mismatch, got: %d want: %d", i, 1)
	}
	close(ch)
}
CGO_ENABLED=1 go test -race internal -count 10

==================
WARNING: DATA RACE
Write at 0x00c00012a448 by goroutine 34:
  github.com/go-co-op/gocron.(*jobFunction).singletonRunner()
      /home/coder/go/pkg/mod/github.com/go-co-op/[email protected]/executor.go:89 +0x2bc
  github.com/go-co-op/gocron.(*executor).runJob.func2()
      /home/coder/go/pkg/mod/github.com/go-co-op/[email protected]/executor.go:167 +0x39

Previous read at 0x00c00012a448 by goroutine 30:
  github.com/go-co-op/gocron.(*executor).runJob()
      /home/coder/go/pkg/mod/github.com/go-co-op/[email protected]/executor.go:169 +0x230
  github.com/go-co-op/gocron.(*executor).limitModeRunner()
      /home/coder/go/pkg/mod/github.com/go-co-op/[email protected]/executor.go:109 +0x1e4
  github.com/go-co-op/gocron.(*executor).run.func2()
      /home/coder/go/pkg/mod/github.com/go-co-op/[email protected]/executor.go:187 +0x39

Goroutine 34 (running) created at:
  github.com/go-co-op/gocron.(*executor).runJob()
      /home/coder/go/pkg/mod/github.com/go-co-op/[email protected]/executor.go:167 +0x219
  github.com/go-co-op/gocron.(*executor).limitModeRunner()
      /home/coder/go/pkg/mod/github.com/go-co-op/[email protected]/executor.go:109 +0x1e4
  github.com/go-co-op/gocron.(*executor).run.func2()
      /home/coder/go/pkg/mod/github.com/go-co-op/[email protected]/executor.go:187 +0x39

Goroutine 30 (finished) created at:
  github.com/go-co-op/gocron.(*executor).run()
      /home/coder/go/pkg/mod/github.com/go-co-op/[email protected]/executor.go:187 +0x3be
  github.com/go-co-op/gocron.(*executor).start.func1()
      /home/coder/go/pkg/mod/github.com/go-co-op/[email protected]/executor.go:130 +0x39
==================
--- FAIL: TestAddCronJob (0.00s)
    testing.go:1446: race detected during execution of test
@jeremyje jeremyje added the bug Something isn't working label Jun 2, 2023
@JohnRoesler
Copy link
Contributor

I can't reproduce, but I have a fix in main. Will you try it please and let me know?

go get github.com/go-co-op/gocron@main

@jeremyje
Copy link
Author

jeremyje commented Jun 2, 2023

That one got fixed but I have another error on stop now. I'm running on a Linux VM using Go 1.20.4.

CGO_ENABLED=1 go test -race internal -count 100
==================
WARNING: DATA RACE
Write at 0x00c00001a2b8 by goroutine 306:
  runtime.racewrite()
      <autogenerated>:1 +0x24
  github.com/go-co-op/gocron.(*executor).stop.func1()
      /home/coder/go/pkg/mod/github.com/go-co-op/[email protected]/executor.go:247 +0x47
  sync.(*Map).Range()
      /usr/local/go/src/sync/map.go:476 +0x1db
  github.com/go-co-op/gocron.(*executor).stop()
      /home/coder/go/pkg/mod/github.com/go-co-op/[email protected]/executor.go:245 +0xd3
  github.com/go-co-op/gocron.(*Scheduler).stop()
      /home/coder/go/pkg/mod/github.com/go-co-op/[email protected]/scheduler.go:900 +0xb2
  github.com/go-co-op/gocron.(*Scheduler).Stop()
      /home/coder/go/pkg/mod/github.com/go-co-op/[email protected]/scheduler.go:894 +0x56
  internal.TestAddCronJob()
      /internal/cronfail_test.go:69 +0x3a4
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1576 +0x216
  testing.(*T).Run.func1()
      /usr/local/go/src/testing/testing.go:1629 +0x47

Previous read at 0x00c00001a2b8 by goroutine 312:
  runtime.raceread()
      <autogenerated>:1 +0x24
  github.com/go-co-op/gocron.(*jobFunction).singletonRunner()
      /home/coder/go/pkg/mod/github.com/go-co-op/[email protected]/executor.go:83 +0xb1
  github.com/go-co-op/gocron.(*executor).runJob.func2()
      /home/coder/go/pkg/mod/github.com/go-co-op/[email protected]/executor.go:169 +0x39

Goroutine 306 (running) created at:
  testing.(*T).Run()
      /usr/local/go/src/testing/testing.go:1629 +0x805
  testing.runTests.func1()
      /usr/local/go/src/testing/testing.go:2036 +0x8d
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1576 +0x216
  testing.runTests()
      /usr/local/go/src/testing/testing.go:2034 +0x87c
  testing.(*M).Run()
      /usr/local/go/src/testing/testing.go:1906 +0xb44
  main.main()
      _testmain.go:47 +0x2e9

Goroutine 312 (finished) created at:
  github.com/go-co-op/gocron.(*executor).runJob()
      /home/coder/go/pkg/mod/github.com/go-co-op/[email protected]/executor.go:169 +0x21c
  github.com/go-co-op/gocron.(*executor).limitModeRunner()
      /home/coder/go/pkg/mod/github.com/go-co-op/[email protected]/executor.go:111 +0x206
  github.com/go-co-op/gocron.(*executor).run.func2()
      /home/coder/go/pkg/mod/github.com/go-co-op/[email protected]/executor.go:191 +0x39
==================
--- FAIL: TestAddCronJob (0.00s)
    testing.go:1446: race detected during execution of test
FAIL
FAIL    internal     0.039s
FAIL
go version
go version go1.20.4 linux/amd64

uname -a
Linux a5c36d205060 5.15.0-72-generic #79-Ubuntu SMP Wed Apr 19 08:22:18 UTC 2023 x86_64 GNU/Linux

lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 11 (bullseye)
Release:        11
Codename:       bullseye

@JohnRoesler
Copy link
Contributor

@jeremyje I'm unable to reproduce the issue (on an M2 mac). Are you able to produce it in any other environments? Within docker would be ideal so I can run the same. I'm confused by this new race you're seeing because as I understand waitgroups are goroutine safe. It seems strange that waitgroup.Wait() is being considered a write 🤔

@jeremyje
Copy link
Author

jeremyje commented Jun 2, 2023

I ran this test within a debian Linux docker image hosted on a Ubuntu Linux VM.

You can see that information via the uname and lsb_release commands above.

The important thing is the test needs to run many times. I picked 100.

Since your on a Mac if this project uses GitHub actions running the test on that likely will be closer to the hardware I have.

I'm on an AMD Ryzen 5900X, 12-core CPU.

@JohnRoesler
Copy link
Contributor

I was able to reproduce with a count=10000. Getting rid of this particular race will take some consideration.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants