Zeebe worker stops processing after instances processed = max_task_count #201

Closed
Andy-JB opened this issue Aug 4, 2021 · 6 comments
Labels: 3.0.0 (Will be released in Zeebe 3.0.0), bug (Something isn't working)

Comments

@Andy-JB
Contributor

Andy-JB commented Aug 4, 2021

Describe the bug
When the number of instances waiting in Zeebe is greater than max_task_count, a newly started Zeebe worker processes max_task_count instances and then stops processing.

To Reproduce
Steps to reproduce the behavior:

  1. Start Zeebe (with no worker running)
  2. Create a number of instances > max_task_count
  3. Start Zeebe worker
  4. The Zeebe worker processes max_task_count instances.
  5. The Zeebe worker does not process any more instances. It has to be restarted to continue processing.

Expected behavior
Zeebe worker continues processing all instances in Zeebe.

Screenshots

Before starting Zeebe worker with max_task_count=32:

image

After starting Zeebe worker:

image

Desktop (please complete the following information):

  • OS: iOS 11.4
  • Browser: Chrome
  • Version: Zeebe 1.1.0, pyzeebe 3.0.0rc2

Extra info
Tested using the worker example from the docs and zeebe-docker-compose

This bug also occurs on Kubernetes v1.11.0

@Andy-JB
Contributor Author

Andy-JB commented Aug 16, 2021

I've tested a fix for this issue in job_poller.py.
If calculate_max_jobs_to_activate() > 0 is false, the poller waits 5 seconds before checking again.

The wait time could also be an argument of ZeebeWorker.

Hi @JonatanMartens, what do you think? Would you like me to submit a pull request?

    async def poll(self):
        while self.should_poll():
            while self.calculate_max_jobs_to_activate() > 0:
                await self.poll_once()
            await asyncio.sleep(5)
...

    def should_poll(self) -> bool:
        return not self.stop_event.is_set() \
            and (self.zeebe_adapter.connected or self.zeebe_adapter.retrying_connection) \
            and self.calculate_max_jobs_to_activate() > 0
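
The idea of waiting for capacity instead of leaving the loop can be tried out in isolation. What follows is a minimal, self-contained sketch and not pyzeebe's actual code: `WaitingPollerModel`, `handle_job`, `simulate`, the `wait_seconds` parameter and the in-memory `pending` counter are all hypothetical stand-ins, and only the method names `poll`, `poll_once`, `should_poll` and `calculate_max_jobs_to_activate` are taken from the snippet above. In this model the capacity check is kept out of `should_poll()`, so a full worker pauses rather than ending the loop.

```python
import asyncio


class WaitingPollerModel:
    """Toy model of a poller that waits for free capacity (hypothetical, not pyzeebe)."""

    def __init__(self, pending, max_task_count=32, wait_seconds=5.0):
        self.pending = pending              # jobs waiting in the (simulated) broker
        self.max_task_count = max_task_count
        self.wait_seconds = wait_seconds    # pause used while the worker is full
        self.active = 0                     # jobs currently being handled
        self.completed = 0
        self.stop_event = asyncio.Event()
        self.tasks = []                     # keep references so tasks are not garbage collected

    def calculate_max_jobs_to_activate(self):
        return self.max_task_count - self.active

    def should_poll(self):
        # Assumption of this sketch: capacity is NOT part of the loop condition.
        return not self.stop_event.is_set()

    async def handle_job(self, job_seconds):
        await asyncio.sleep(job_seconds)    # stands in for the real task handler
        self.active -= 1
        self.completed += 1

    async def poll_once(self, job_seconds):
        batch = min(self.pending, self.calculate_max_jobs_to_activate())
        self.pending -= batch
        self.active += batch
        self.tasks += [asyncio.create_task(self.handle_job(job_seconds)) for _ in range(batch)]
        await asyncio.sleep(0.001)          # stands in for the network round trip

    async def poll(self, job_seconds):
        while self.should_poll():
            if self.calculate_max_jobs_to_activate() > 0:
                await self.poll_once(job_seconds)
            else:
                # Worker is full: wait and check again instead of returning.
                await asyncio.sleep(self.wait_seconds)


async def simulate(total=100, job_seconds=0.01, wait_seconds=0.005):
    poller = WaitingPollerModel(total, wait_seconds=wait_seconds)
    poll_task = asyncio.create_task(poller.poll(job_seconds))
    while poller.completed < total:         # wait until every job has been handled
        await asyncio.sleep(wait_seconds)
    poller.stop_event.set()
    await poll_task
    return poller.completed


if __name__ == "__main__":
    print(asyncio.run(simulate()))
```

With 100 simulated jobs and a capacity of 32, the model works through all 100 jobs in several batches. The short `wait_seconds` in `simulate` only keeps the demo fast; the 5 seconds proposed above would be the constructor default here.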

@JonatanMartens
Collaborator

Hey @Andy-JB,

I can't seem to replicate the issue on my machine. Can you share how you use the worker?

It would be really helpful if you could submit a PR. If you can, please also add a test so that we know it is fixed.

Thank you for opening this issue!

@JonatanMartens JonatanMartens added 3.0.0 Will be released in Zeebe 3.0.0 bug Something isn't working labels Aug 16, 2021
@tomas-zemres

We are testing version 3.0.0rc3 and we also found this bug.

It can be reproduced with the following scenario:

  • Implement a worker with a simple task that calls asyncio.sleep(1)
  • Stop the worker
  • Start 100 new processes in Zeebe
    (Zeebe now has 100 pending workflows)
  • Start the worker
    • The worker takes 32 tasks and runs the 32 sleep(1) calls in parallel. After that, self.should_poll() returns False because
      self.calculate_max_jobs_to_activate() returns 0. Once self.should_poll() returns False, job polling stops and the remaining jobs are never processed (until the worker is manually restarted)
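
The scenario can be modelled without a running broker. This is only a sketch under stated assumptions: `BuggyPollerModel`, `handle_job`, `simulate` and the in-memory `pending` counter are hypothetical and do not come from pyzeebe, and the loop condition is reduced to the capacity check described above. It assumes more pending jobs than `MAX_TASK_COUNT`.

```python
import asyncio

MAX_TASK_COUNT = 32  # worker capacity used in this thread


class BuggyPollerModel:
    """Toy model of a poller whose loop condition includes capacity (hypothetical)."""

    def __init__(self, pending):
        self.pending = pending   # jobs waiting in the (simulated) broker
        self.active = 0          # jobs currently being handled
        self.completed = 0
        self.tasks = []          # keep references so tasks are not garbage collected

    def calculate_max_jobs_to_activate(self):
        return MAX_TASK_COUNT - self.active

    def should_poll(self):
        # Capacity is part of the loop condition, so a full worker ends polling.
        return self.calculate_max_jobs_to_activate() > 0

    async def handle_job(self, job_seconds):
        await asyncio.sleep(job_seconds)  # stands in for the task's asyncio.sleep(1)
        self.active -= 1
        self.completed += 1

    async def poll_once(self, job_seconds):
        batch = min(self.pending, self.calculate_max_jobs_to_activate())
        self.pending -= batch
        self.active += batch
        self.tasks += [asyncio.create_task(self.handle_job(job_seconds)) for _ in range(batch)]
        await asyncio.sleep(0)  # let the new tasks start running

    async def poll(self, job_seconds):
        while self.should_poll():
            await self.poll_once(job_seconds)
        # Falling out of the loop here means nothing ever polls again.


async def simulate(pending=100, job_seconds=0.01):
    poller = BuggyPollerModel(pending)
    await poller.poll(job_seconds)
    await asyncio.gather(*poller.tasks)  # let the activated jobs finish
    return poller.completed, poller.pending


if __name__ == "__main__":
    print(asyncio.run(simulate()))
```

In this model `simulate()` returns `(32, 68)`: the first batch fills the worker, `should_poll()` becomes false, and the remaining 68 jobs are never activated.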

@maio

maio commented Aug 18, 2021

Hello,

Basically, whenever calculate_max_jobs_to_activate() returns 0 (e.g. when the worker is already processing 32 or more jobs), JobPoller stops polling for new jobs because it returns from the poll(self) method.
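
To make the zero case concrete, here is a small sketch of how such a capacity calculation might look. The formula is an assumption for illustration and is not confirmed anywhere in this thread; the standalone function signature and the `active_jobs` and `max_jobs_to_activate` parameters are hypothetical.

```python
def calculate_max_jobs_to_activate(max_task_count, active_jobs, max_jobs_to_activate):
    """Return how many more jobs a worker could take on (assumed formula)."""
    free_slots = max_task_count - active_jobs
    # Never request more than the per-request limit, and never a negative number.
    return max(0, min(free_slots, max_jobs_to_activate))


if __name__ == "__main__":
    print(calculate_max_jobs_to_activate(32, 0, 32))   # idle worker
    print(calculate_max_jobs_to_activate(32, 32, 32))  # full worker
```

Under this assumption an idle worker with max_task_count=32 can activate 32 jobs, and a worker that is already handling 32 jobs gets 0, which is the value that ends the polling loop.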

tomas-zemres pushed a commit to verotel/pyzeebe that referenced this issue Aug 18, 2021
"Zeebe worker stops processing after instances processed = max_task_count"
camunda-community-hub#201
@lesnek

lesnek commented Aug 31, 2021

We encountered the same issue in our project (an incident) when traffic reached 100 processes/second with a max task count of 32.

Also, I prefer the "tmp fix" mentioned here over just waiting.

tomas-zemres pushed a commit to verotel/pyzeebe that referenced this issue Sep 6, 2021
"Zeebe worker stops processing after instances processed = max_task_count"
camunda-community-hub#201
@maio

maio commented Sep 15, 2021

This issue is fixed for us in pre-release/3.0.0 thanks to the merge above. Is it OK to close this now? cc: @lesnek

@Andy-JB Andy-JB closed this as completed Sep 16, 2021