Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No VisibilityTimeout Renewal #291

Closed
slang25 opened this issue Jan 10, 2017 · 6 comments
Closed

No VisibilityTimeout Renewal #291

slang25 opened this issue Jan 10, 2017 · 6 comments
Labels

Comments

@slang25
Copy link
Member

slang25 commented Jan 10, 2017

When we take messages from an SQS queue, we do so with a VisibilityTimeout (Docs) so that as we process the message, no other worker can see the inflight message, this acts as a peek lock. It's important because should anything bad happen while processing the message, we know that it will get processed again, there is no responsibility on the worker to "put it back". The default timeout is 30 seconds, and is configurable within JustSaying.

However we have now enforced our handlers to complete their work within this timeout, otherwise another worker will start processing the same message, and now will be doing concurrent work, and possibly duplicate work, as there is no hard requirement for consumers to make their processing idempotent (although SQS doesn't make any guarantees of exactly once delivery).

I suggest we add a visibility timeout renewal mechanism.

Here is an example of how it could work; once we take a message off the queue, using Task.Run passing in a CancellationToken, we use the same timeout the message has and await a Task.Delay(messageTimeout,ct), then use ChangeMessageVisibilityAsync to renew the timeout for another duration, then loop until the CT is cancelled. The CT will be cancelled once the handler has completed. If that doesn't make sense I will follow up with a rough PR.

Pros:

  • Safer defaults for people not wanting to think hard about configuration. aka "Pit of success"
  • Better robustness when handlers are delayed abnormally.
  • Visibility timeouts could be set more aggressively, whereas currently they need to be conservative as to not hit this scenario.

Cons:

  • More complexity
  • Task.Run or however you cut it, will be requiring another thread from the pool (per message), even if only very breifly.
  • Must be careful not to introduce any new race conditions, memory leaks, or bugs in general.

With SQS FIFO queues on the way, and our current IMessageLock behavior which may offer some overlap, is this a sensible addition? What are peoples thoughts?

@slang25 slang25 changed the title No VisibilityTimeout Renew No VisibilityTimeout Renewal Jan 10, 2017
@je-ewan
Copy link

je-ewan commented Jan 10, 2017

surely expected behaviour for a worker queue is that is that if a message is not processed after 'timeout period' the job is assigned to another worker.

It sounds like you would want to remove this timeout completely. which, for me, would introduce unexpected behaviour?

@AnthonySteele
Copy link
Contributor

AnthonySteele commented Jan 10, 2017

If you want to progress this project there are more urgent things, like ending the far-too-lengthy split between master and develop, getting incremental test versions going out again, and finding a locking strategy that works and isn't backed by DynamoDb.

Also a loty of users scratch their heads over the more complex setups. Needs better debug info coming out about what queue is connected to what topic, what handler is connected to what queue, and the fate of a message that might be sent to zero or more handlers, and is this what they expect especially in the zero case.

@stale
Copy link

stale bot commented Jul 20, 2018

This issue has been automatically marked as stale because it has not had any recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Jul 20, 2018
@stale
Copy link

stale bot commented Jul 27, 2018

Closed automatically due to inactivity.

@stale stale bot closed this as completed Jul 27, 2018
@stale stale bot removed the wontfix label Aug 26, 2019
@stale
Copy link

stale bot commented Nov 24, 2019

This issue has been automatically marked as stale because it has not had any recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Nov 24, 2019
@stale
Copy link

stale bot commented Dec 8, 2019

Closed automatically due to inactivity.

@stale stale bot closed this as completed Dec 8, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants