
django_apscheduler.models.DjangoJobExecution.MultipleObjectsReturned #156

Closed
kylowrobelek opened this issue Oct 6, 2021 · 15 comments

@kylowrobelek

We currently use your package in an AWS environment in our project, but we are running into a problem with multiple DjangoJobExecution records being created. It is probably a race condition when a new container is created (during deployment).
Do you support such a setup?
We call setup_scheduler() (same body as on your tutorial page) in the Django app's ready() method.

Here is traceback:

DjangoJobExecution.MultipleObjectsReturned: get() returned more than one DjangoJobExecution -- it returned 2!
File "apscheduler/schedulers/base.py", line 836, in _dispatch_event
cb(event)
File "django_apscheduler/jobstores.py", line 100, in handle_execution_event
job_execution = DjangoJobExecution.atomic_update_or_create(
File "django_apscheduler/util.py", line 99, in func_wrapper
result = func(*args, **kwargs)
File "django_apscheduler/models.py", line 165, in atomic_update_or_create
job_execution = DjangoJobExecution.objects.select_for_update().get(
File "django/db/models/query.py", line 439, in get
raise self.model.MultipleObjectsReturned(
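
The failure mode behind this traceback can be sketched in plain Python (schema and names here are illustrative, not django_apscheduler's actual tables): two schedulers sharing one job store each see "no execution row yet" for the same (job_id, run_time), so each inserts one, and a later .get() finds two rows.

```python
# Hypothetical sketch of the get-then-create race: a barrier forces both
# "schedulers" to finish their existence check before either inserts.
import os
import sqlite3
import tempfile
import threading

db = os.path.join(tempfile.mkdtemp(), "jobstore.sqlite3")
with sqlite3.connect(db) as conn:
    conn.execute("CREATE TABLE job_execution (job_id TEXT, run_time TEXT)")

barrier = threading.Barrier(2)

def handle_execution_event(job_id, run_time):
    # Each "scheduler" uses its own connection, like separate processes would.
    conn = sqlite3.connect(db, timeout=10)
    found = conn.execute(
        "SELECT COUNT(*) FROM job_execution WHERE job_id=? AND run_time=?",
        (job_id, run_time),
    ).fetchone()[0]
    barrier.wait()  # both checks complete before either insert runs
    if found == 0:  # both threads saw 0, so both take the create branch
        conn.execute(
            "INSERT INTO job_execution (job_id, run_time) VALUES (?, ?)",
            (job_id, run_time),
        )
        conn.commit()
    conn.close()

threads = [
    threading.Thread(
        target=handle_execution_event, args=("my_job", "2021-10-06T09:30:00")
    )
    for _ in range(2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

with sqlite3.connect(db) as conn:
    rows = conn.execute("SELECT COUNT(*) FROM job_execution").fetchone()[0]
print(rows)  # 2 -- a subsequent .get() on this pair raises MultipleObjectsReturned
```

The barrier exaggerates the timing window for demonstration; in production the same interleaving happens whenever two scheduler processes handle the same job event concurrently.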

@jcass77
Owner

jcass77 commented Oct 6, 2021

Starting a scheduler as part of your Django application, even in the ready() method, can lead to duplicates, since the app will be initialized by multiple webserver worker processes - please see the README for details.

The scheduler should be started via a dedicated admin command instead.

@cheradenine

I'm also seeing this error, and I am running the scheduler via a management command, so there should be only one process running. Same call stack.

@jcass77
Owner

jcass77 commented Nov 16, 2021

Looking at the code, the only way I can see that happening is if (a) the same ID is given to multiple jobs and / or (b) the system clock is changed so that jobs are persisted with the same runtime information.

Can you provide more details of your specific use case?

@cheradenine

I have a single job that runs every 30 minutes. It checks our database for upcoming events and sends out email reminders for events that are 1 day or 2 hours from now. Looking at the job executions table, it looks like some of them end up in the 'Started executing' state and are never marked as finished. I have no idea whether that has anything to do with hitting this error. I am happy to help diagnose this, and I can see how hard it is to reproduce.

@jcass77
Owner

jcass77 commented Nov 17, 2021

Looking at the job executions table it looks like some of them end up in the 'started executing' state and are never marked finished.

Although annoying, this shouldn't cause duplicate records to be created. It would still be worth the effort to investigate the issue to understand why APScheduler is not firing the EVENT_JOB_EXECUTED for your jobs though as this might be indicative of a problem elsewhere.

In terms of the duplicate entries, I have pushed commit f49c883 which adds a constraint at database level to ensure that unique DjangoJobExecutions are created for a particular DjangoJob for a specific run time. This won't fix your problem, but it will at least raise an exception when the duplicate is created, which should make debugging easier.

Please check it out and report any changes here.
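
The effect of a database-level constraint like the one described can be sketched with stdlib sqlite3 (column names here are illustrative): a duplicate (job_id, run_time) pair now fails loudly with an integrity error instead of silently creating a second row.

```python
# Sketch of a composite UNIQUE constraint rejecting a duplicate execution
# record for the same job and run time (illustrative schema, not the
# library's actual migration).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE job_execution (
           job_id TEXT,
           run_time TEXT,
           UNIQUE (job_id, run_time)  -- one execution per job per run time
       )"""
)
conn.execute(
    "INSERT INTO job_execution VALUES ('my_job', '2021-10-06T09:30:00')"
)
try:
    conn.execute(
        "INSERT INTO job_execution VALUES ('my_job', '2021-10-06T09:30:00')"
    )
    duplicated = True
except sqlite3.IntegrityError:
    duplicated = False  # the duplicate is rejected at the database level
print(duplicated)  # False
```

As the comment above notes, this surfaces the bug at creation time rather than fixing the underlying race.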

@denisyakimov07

I have the same problem on Heroku: I see multiple (2×) job executions. I saw recommendations to start the project like python manage.py runserver --noreload, but I don't know how to add that parameter on a Heroku server.

@rtpg

rtpg commented Jan 28, 2022

Hey @jcass77, we are hitting the same issue and will try out your commit, but would you be able to make a PyPI release later on that includes that change? When using git pins we lose nice things like Dependabot checks.

@jcass77
Owner

jcass77 commented Jan 28, 2022

Sure, let me know if it works for you and I will cut a new release.

@CharlesFr

I'm also getting this error on Heroku, has there been a fix for this?

2022-02-18T09:30:00.680819+00:00 app[clock.1]: INFO apscheduler.executors.default: Job "update_account (trigger: cron[minute='*/30'], next run at: 2022-02-18 10:00:00 UTC)" executed successfully
2022-02-18T09:30:00.690086+00:00 app[clock.1]: ERROR apscheduler.scheduler: Error notifying listener
2022-02-18T09:30:00.690087+00:00 app[clock.1]: Traceback (most recent call last):
2022-02-18T09:30:00.690088+00:00 app[clock.1]: File "/app/.heroku/python/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 836, in _dispatch_event
2022-02-18T09:30:00.690088+00:00 app[clock.1]: cb(event)
2022-02-18T09:30:00.690088+00:00 app[clock.1]: File "/app/.heroku/python/lib/python3.8/site-packages/django_apscheduler/jobstores.py", line 100, in handle_execution_event
2022-02-18T09:30:00.690089+00:00 app[clock.1]: job_execution = DjangoJobExecution.atomic_update_or_create(
2022-02-18T09:30:00.690090+00:00 app[clock.1]: File "/app/.heroku/python/lib/python3.8/site-packages/django_apscheduler/util.py", line 99, in func_wrapper
2022-02-18T09:30:00.690091+00:00 app[clock.1]: result = func(*args, **kwargs)
2022-02-18T09:30:00.690091+00:00 app[clock.1]: File "/app/.heroku/python/lib/python3.8/site-packages/django_apscheduler/models.py", line 165, in atomic_update_or_create
2022-02-18T09:30:00.690091+00:00 app[clock.1]: job_execution = DjangoJobExecution.objects.select_for_update().get(
2022-02-18T09:30:00.690092+00:00 app[clock.1]: File "/app/.heroku/python/lib/python3.8/site-packages/django/db/models/query.py", line 439, in get
2022-02-18T09:30:00.690092+00:00 app[clock.1]: raise self.model.MultipleObjectsReturned(
2022-02-18T09:30:00.690093+00:00 app[clock.1]: django_apscheduler.models.DjangoJobExecution.MultipleObjectsReturned: get() returned more than one DjangoJobExecution -- it returned 2!

@denisyakimov07

For some reason, on a Heroku host apscheduler always doubles tasks. I tried to fix the problem, but after a few days I switched to sending POST requests to my webhook from Heroku Scheduler instead.

@Roconda

Roconda commented Mar 3, 2022

Ran into this issue on our Kubernetes cluster as well, and found out it had to do with the deployment cycle. While it does raise this error when multiple instances of apscheduler are active, it does seem to recover from the error on its own.

@rtpg

rtpg commented Mar 4, 2022

@jcass77 we have been running this for a month or so now in a production environment and haven't hit any issues, so I think this is good for a release!

@jcass77
Owner

jcass77 commented Mar 5, 2022

v0.6.1 has been released. This won't prevent MultipleObjectsReturned, but it will make root-causing easier.

If you are using Heroku or Kubernetes, you need to be sure that you only have one instance of the scheduler running (I'm not familiar with those platforms so can't offer any advice unfortunately).

@jcass77 jcass77 closed this as completed Mar 5, 2022
@rtpg

rtpg commented Mar 7, 2022

Oh I think I know what's going on.

In DjangoJobExecution.atomic_update_or_create we have the select_for_update on the job execution itself, meaning we don't have updates clobbering each other (at least in theory; I kind of think that lock probably belongs higher up the stack).

But we aren't locking the DjangoJob itself, so there's a potential race with multiple schedulers running, since multiple schedulers will read the same run_time and execute on it.

In some similar race-y code, what we have done is "lock all objects, then reconfirm conditions". Here it feels like what we probably want to do is:

  • lock the DjangoJob itself with a select_for_update
  • after locking, reconfirm the run_time.... maybe? Since it might no longer be valid
  • lock the DjangoJobExecution for updates

Since the idea here is that not only do we not want concurrent updates, we also don't want duplicate creates in the first place.

I think that locking the DjangoJob would at least prevent the basic race here, where two schedulers simultaneously fail the get and then both create executions.
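
The "lock the parent, then reconfirm under the lock" pattern described above can be sketched with an in-process lock standing in for select_for_update() (illustrative only; real schedulers run in separate processes, so the actual fix would need a database-level row lock).

```python
# Hypothetical sketch: serialize update-or-create per job, and re-check
# existence under the lock so only one of two racing callers creates a row.
import threading

job_locks = {}      # one lock per job id (stand-in for locking the DjangoJob row)
executions = {}     # (job_id, run_time) -> status
create_count = 0

def atomic_update_or_create(job_id, run_time, status):
    global create_count
    # dict.setdefault is atomic in CPython, so both callers get the same lock.
    lock = job_locks.setdefault(job_id, threading.Lock())
    with lock:  # analogous to select_for_update() on the parent DjangoJob
        key = (job_id, run_time)
        # Re-check under the lock: another scheduler may have created the
        # record between our earlier read and acquiring the lock.
        if key in executions:
            executions[key] = status        # update path
        else:
            executions[key] = status        # create path, taken exactly once
            create_count += 1

threads = [
    threading.Thread(
        target=atomic_update_or_create, args=("my_job", "09:30", "Finished")
    )
    for _ in range(2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(executions), create_count)  # 1 1 -- the second caller takes the update path
```

Whether this complexity is worth adding is exactly the trade-off discussed in the next comment.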

@jcass77
Owner

jcass77 commented Mar 7, 2022

Running multiple schedulers that share the same job store is Bad™ (at least until APScheduler 4.0 is released). Right now it is an anti-pattern that we should probably not try to enable via complex locking behaviour.

If you are running on Heroku, then this thread containing Heroku-specific instructions in the upstream APScheduler issue log might be of interest to help you avoid starting up multiple schedulers in different web workers.
