-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle timeouts possible in Khepri minority in rabbit_db_exchange
#11785
Merged
the-mikedavis
merged 8 commits into
main
from
md/khepri-minority-errors/rabbit_db_exchange
Jul 24, 2024
Merged
Handle timeouts possible in Khepri minority in rabbit_db_exchange
#11785
the-mikedavis
merged 8 commits into
main
from
md/khepri-minority-errors/rabbit_db_exchange
Jul 24, 2024
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
A common case for exchange deletion is that callers want the deletion to be idempotent: they treat the `ok` and `{error, not_found}` returns from `rabbit_exchange:delete/3` the same way. To simplify these callsites we add a `rabbit_exchange:ensure_deleted/3` that wraps `rabbit_exchange:delete/3` and returns `ok` when the exchange did not exist. Part of this commit is to update callsites to use this helper. The other part is to handle the `rabbit_khepri:timeout()` error possible when Khepri is in a minority. For most callsites this is just a matter of adding a branch to their `case` clauses and an appropriate error and message.
It's unlikely that these operations will time out since the serial number is always updated after some other transaction, for example adding or deleting an exchange. In the future we could consider moving the serial updates into those transactions. In the meantime we can remove the possibility of timeouts by giving the serial update unlimited time to finish.
The spec of `rabbit_exchange:declare/7` needs to be updated to return `{ok, Exchange} | {error, Reason}` instead of the old return value of `rabbit_types:exchange()`. This is safe to do since `declare/7` is not called by RPC - from the CLI or otherwise - outside of test suites, and in test suites only through the CLI's `TestHelper.declare_exchange/7`. Callers of this helper are updated in this commit. Otherwise this commit updates callers to unwrap the `{ok, Exchange}` and bubble up errors.
michaelklishin
approved these changes
Jul 24, 2024
`rabbit_amqp_management` returns HTTP status codes to the client. 503 means that a service is unavailable (which Khepri is while it is in a minority) so it's a more appropriate code than the generic 500 internal server error.
ansd
reviewed
Jul 24, 2024
Co-authored-by: David Ansari <[email protected]>
ansd
reviewed
Jul 24, 2024
ansd
reviewed
Jul 24, 2024
the-mikedavis
commented
Jul 24, 2024
2bf9298
to
2259f84
Compare
This fixes a potential crash in `rabbit_amqp_amanegment` where we tried to format the exchange resource as a string (`~ts`). The other changes are cosmetic.
2259f84
to
b56abee
Compare
michaelklishin
added a commit
that referenced
this pull request
Jul 24, 2024
Handle timeouts possible in Khepri minority in `rabbit_db_exchange` (backport #11785)
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Continuation of the changes like in #11685 and #11706 but for exchanges.
The changes for exchanges are a bit more involved than prior PRs. Here's a summary of the changes:
rabbit_exchange:declare/7
's spec is updated to return{ok, #exchange{}} | {error, timeout}
where it returned just#exchange{}
beforerabbit_exchange:delete/3
need some small changes to handle an{error, timeout}
returnrabbit_exchange:ensure_deleted/3
wrapper that covers the common case for callers whereok | {error, not_found}
are handled the same way.timeout => infinity
for calls for updating the exchange serial