Crash between lock and start of a job renders it stuck #258

Open
schwarz-b5c opened this issue Feb 9, 2021 · 0 comments

If RunCommand crashes after locking a job but before starting it, the job stays locked and won't be picked up again, not even by the same worker. This is unlikely, but not impossible. Here are some stats from one of our production systems:

> SELECT state, count(*) FROM jms_jobs GROUP BY state ORDER BY state;
   state    │  count  
────────────┼─────────
 canceled   │       6
 failed     │    3323
 finished   │ 1792415
 incomplete │       7
 pending    │       8
 running    │      18

All 8 pending jobs were locked by a worker that crashed (or was perhaps forcefully terminated) before starting them. They have been stuck in the pending state for over a month.
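For anyone who wants to check their own system, here is a rough sketch of how such jobs could be spotted. It assumes the lock is recorded in a `workerName` column on `jms_jobs`. That column name is my assumption, and the SQLite table below is only a throwaway stand-in for the real schema:

```python
import sqlite3


def find_locked_pending_jobs(conn):
    """Return (id, workerName) for pending jobs that still carry a worker lock."""
    return conn.execute(
        "SELECT id, workerName FROM jms_jobs "
        "WHERE state = 'pending' AND workerName IS NOT NULL "
        "ORDER BY id"
    ).fetchall()


# In-memory stand-in for the real table; the schema is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jms_jobs (id INTEGER PRIMARY KEY, state TEXT, workerName TEXT)"
)
conn.executemany(
    "INSERT INTO jms_jobs (id, state, workerName) VALUES (?, ?, ?)",
    [
        (1, "pending", None),        # ordinary pending job, not locked
        (2, "pending", "worker-a"),  # locked but never started: stuck
        (3, "running", "worker-a"),  # started normally
        (4, "finished", None),
    ],
)

print(find_locked_pending_jobs(conn))
```

A job that has only just been locked would also match for a brief moment, so a real check would probably want to ignore recently locked rows.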

Unfortunately, RunCommand::cleanUpStaleJobs() doesn't unlock pending jobs that are still locked by the same worker. It probably should, right?
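To make the suggestion concrete, here is a minimal in-memory model of what I have in mind. It is not the bundle's code: the `Job` class, the `worker_name` field and the way running jobs are handled are all hypothetical, chosen only to illustrate the extra "unlock pending jobs" branch:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Job:
    id: int
    state: str
    worker_name: Optional[str] = None  # hypothetical lock marker


def clean_up_stale_jobs(jobs: List[Job], worker_name: str) -> None:
    """Run at worker startup, before this worker locks anything new."""
    for job in jobs:
        if job.worker_name != worker_name:
            continue  # never touch jobs held by another worker
        if job.state == "running":
            # Assumed handling: a job left running by a previous run is closed.
            job.state = "incomplete"
        elif job.state == "pending":
            # Proposed addition: release the lock so the job can be picked up.
            job.worker_name = None


jobs = [
    Job(1, "pending", "worker-a"),  # locked, never started
    Job(2, "running", "worker-a"),  # left over from the crashed run
    Job(3, "pending", "worker-b"),  # someone else's lock, left alone
]
clean_up_stale_jobs(jobs, "worker-a")
print(jobs)
```

This is only safe if a worker name is never shared by two processes running at the same time, since otherwise one process could release a lock the other has just taken.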

Some version information from the affected system:

CentOS 7.9.2009
Linux 3.10.0-1160.11.1.el7.x86_64
PHP 7.3.26 (with pcntl)
symfony/symfony 3.4.47
jms/job-queue-bundle 2.1.0
