Consistent hash exchange bindings are changed and break when a node restarts #3386
I only see two options:
Binding recovery will happen on node boot regardless of what we do. Beyond the bindings themselves, exchange state is a really rare edge case that isn't really handled in RabbitMQ. Ideas and PRs would be appreciated; our team doesn't have the capacity to redesign this plugin any time soon.
I think it would be fine not to apply the binding if we already have `weight` entries, though the fact that you can bind the queue multiple times makes that harder. Maybe, as a quick fix, just repairing the counter that goes wrong with the buckets would be enough, so at least there is a queue the message goes to, even if it's not consistent.
We can ignore the fact that you can bind a queue multiple times, as that's clearly something the binding recovery process and such a stateful exchange type cannot handle (we don't know what other bindings are going to be recovered down the line). For this exchange type, multiple bindings are likely the result of a mistake. We can assume that if there are already enough ring entries ("weight") for the recovered binding, we can do nothing.
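The "do nothing if enough ring entries already exist" idea can be sketched as a small model. This is an illustrative Python sketch, not the plugin's actual Erlang code; the ring layout (a dict of bucket number to queue) and all names are assumptions:

```python
# Hypothetical model of idempotent binding recovery for a hash ring
# stored as {bucket_index: queue_name}. Replaying the same binding
# during node boot must not add extra buckets.

def add_binding(ring, queue, weight):
    """Add `weight` buckets for `queue` unless it already has enough."""
    existing = sum(1 for q in ring.values() if q == queue)
    if existing >= weight:
        # Binding already recovered: do nothing (idempotent).
        return ring
    next_bucket = max(ring, default=-1) + 1
    for i in range(weight - existing):
        ring[next_bucket + i] = queue
    return ring

ring = {}
add_binding(ring, "q1", 2)
add_binding(ring, "q1", 2)  # recovery replays the binding: no-op
assert list(ring.values()).count("q1") == 2
```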
Very interested in a resolution to this issue! I'm seeing the same issue, where node restarts cause the exchange to become unusable, though I'm not seeing the binding count increase.
Turns out I didn't really resolve this in #3594... When bindings are copied over as a new node comes up, they are no longer sorted. An example after node removal:
Which means the ring breaks when one of those bindings is removed.
This can also happen by doing the following:
I imagine this can be fixed with some sorting in
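A minimal model of why ordering matters, assuming (as an illustration only, not the plugin's actual data structures) that the ring maps contiguous bucket numbers 0..N-1 to queues and routing picks `hash(key) mod N`. Removing a queue must renumber the surviving buckets so they stay contiguous, which relies on walking them in sorted order:

```python
# Toy model: if buckets are not renumbered contiguously after a
# removal, hash(key) mod N can point at a missing bucket, producing
# the "Bucket not found" error seen in the logs.

def remove_queue(ring, queue):
    """Rebuild the ring without `queue`, keeping buckets contiguous."""
    survivors = [q for _, q in sorted(ring.items()) if q != queue]
    return {i: q for i, q in enumerate(survivors)}

def route(ring, key_hash):
    bucket = key_hash % len(ring)
    if bucket not in ring:
        raise LookupError("Bucket not found")
    return ring[bucket]

ring = {0: "q1", 1: "q2", 2: "q1", 3: "q2"}
ring = remove_queue(ring, "q1")
assert ring == {0: "q2", 1: "q2"}
assert route(ring, 101) == "q2"
```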
I wonder if it would be better to just reject new bindings with the same name? Any suggestions, @michaelklishin?
I'd say rejecting a binding may be unexpected for some applications, though for 99% of users it probably does not matter (and right now it's broken anyway).
Hi all! I experienced the same issue last week ("Bucket not found" in the log, no routing out of the exchange) and have been looking around for a fix. #3594 is merged into 3.8.24 (we have a 3.8.3 cluster at the moment), so I may update to get a partial fix. What about adding new nodes, which is the main way to scale a cluster? Is that unsafe for 'x-consistent-hash' exchanges?
Hi all! Are there any updates? Regards, Heyang
Hi, from my side, I'd prefer it if the plugin rejected rebinding; that would simplify it, but it definitely needs some more thinking and implementation. :)
Hi, I ran some tests and here are my findings:
I tested this on 3.9.13.
The consistent hash exchange is, as far as I am aware, the only option in RabbitMQ for maintaining the order of dependent messages while scaling up the number of consumers. This plugin was also added to Amazon MQ, but this particular bug renders it unusable. I would say it is quite a critical bug for anyone considering this plugin, since a node restart within a cluster is a standard operation. Is there any way this can be prioritized in the roadmap, @michaelklishin?
@FalconerTC is there any chance of fixing this issue? Or are there any known workarounds aside from a full rebinding? We're stuck with this issue in our application as well.
@nunoguerreirorosa this is open source software, so if you want something "prioritized", dive in, provide a set of steps to reasonably reliably reproduce, and then feel free to look into the root cause. Pressing others to "prioritize" something in a piece of software you get for free is not how open source software evolves.

The consistent hashing exchange, unlike every other "built-in" (or tier 1) exchange type, is stateful. Booting nodes recover ring state while applications can already begin modifying that same state by binding or unbinding.

Besides delaying client connection listener startup, @luos, ring state recovery does not involve creating any bindings. The plugin uses a list of

@stgolem we don't understand what the issue is or how to reproduce it. You are welcome to
I now see some investigations above 👍 Making binding addition idempotent sounds like the right thing to do (removal, of course, already is). This is a behavior change, so we either have to try to squeeze it into 3.10 or anger the SemVer lovers and ship it in a patch release (even though many would count it as a bug fix).
Other team members suggest that we cannot delay 3.10 over this, so it would have to go into a patch release at some point. |
I do not think that's a problem. It is a bug, and the plugin is now quite broken; I'd say it is not recommended for use in a clustered environment, so if it gets fixed at some point, it will be a bug fix.
Thanks for your time. We have exactly the same issue as others with this plugin in a clustered environment with node restarts. We will patiently wait for this issue to be resolved in some upcoming release. As a user, I classify this as a bug.
Greetings! I have a case for reproducing the "bucket not found" error.
RMQ runs as a cluster of three nodes. One exchange and four queues have been created. Queues 1 & 3 are located on node 2, queues 2 & 4 on node 1. Four bindings with routing key "1" have been created as well. Check the ring status:
Send a message with routing key 101 to the exchange. The message is routed to queue 'test1'. Next, stop node 1 with
Next, start node 1 with
Next, stop node 2 and check the ring status:
Next, start node 2 and check the ring status:
Finally, send a message with routing key 101 to the exchange. Result: the message is published but not routed. Logs: cc: @luos @michaelklishin
Is this planned to be fixed in both 3.9.x and 3.10.x? Are there any recommendations for a quicker way to resolve this after container recreation, other than simply removing and recreating the exchange?
I have been looking at how we can make binding addition and deletion idempotent for this exchange type. I see only one simple way: to keep track of what

This has one limitation. Code that repeatedly binds the same exchange and queue with different routing keys/weights is no longer going to work:
will lead to a single binding with "weight = 1". I find this acceptable, but there may be some code that breaks as a result, or at least the routing distribution will be affected. Trying to make hash ring updates idempotent is going to be too complex for me to agree to.
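The "keep the first binding, ignore repeats" behaviour described above can be sketched as follows. This is a hypothetical Python model, not the plugin's Erlang implementation; the tracking set and all names are made up for illustration:

```python
# Toy model of "first binding wins": a repeated binding between the
# same source and destination is ignored, whatever its weight.

def bind(ring, seen, source, dest, weight):
    """Add `weight` buckets for `dest` unless (source, dest) was bound."""
    if (source, dest) in seen:
        # Duplicate binding: the first one won, do nothing.
        return
    seen.add((source, dest))
    start = max(ring, default=-1) + 1
    for i in range(weight):
        ring[start + i] = dest

ring, seen = {}, set()
bind(ring, seen, "x", "q1", 1)
bind(ring, seen, "x", "q1", 2)  # ignored: first binding (weight 1) won
assert list(ring.values()).count("q1") == 1
```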
If it is the case that the effect of weights is indeterminate in the face of node restarts, then adopting a first wins approach does not seem unreasonable to me. |
The hash ring state record changes would require a schema database migration, which seems like overkill here. I will try to make the function idempotent using the bindings/routes tables instead, which are very stable.
The first attempt at relying on binding data to make hash ring state management idempotent failed. A longer-term solution would be to make this the first plugin that adopts Khepri, our Raft-based next-generation schema data store. In the short term, I'll look into using the state of the ring itself and/or the good old distributed locking module in Erlang.
First binding wins. Duplicate bindings, i.e. bindings with the same source exchange and the same destination queue / exchange but possibly a different routing key (weight), are ignored from now on by the consistent hash exchange.

This applies only to bindings being added. For bindings being deleted, any duplicate binding (independent of its routing key) will delete all buckets for the given source and destination. (This ensures that buckets for a given source and destination can be deleted when upgrading from a version prior to this change. This was also the behaviour before this commit, so nothing changes in that regard.)

Note that duplicate bindings continue to be created in RabbitMQ. (They are only ignored by the consistent hash exchange.)

Adding a binding performs a linear search in the bucket map. This is already stated in the README: "These two operations use linear algorithms to update the ring." The linear search when adding a binding could be optimised by adding another Mnesia table field, which would require a new migration and feature flag. Hence, such an optimisation is left out of this commit.

Fixes #3386.
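The deletion rule described in the commit message above (one unbind clears every bucket for the source/destination pair, regardless of routing key) can be illustrated with a toy model; the data layout and names are assumptions, not the plugin's Erlang code:

```python
# Toy model: unbinding removes ALL buckets that point at the
# destination, even duplicates left over from an older version,
# and renumbers the remaining buckets contiguously.

def unbind(ring, dest):
    """Drop every bucket owned by `dest` and compact the ring."""
    survivors = [q for _, q in sorted(ring.items()) if q != dest]
    return {i: q for i, q in enumerate(survivors)}

ring = {0: "q1", 1: "q1", 2: "q2"}  # duplicate q1 buckets from an old version
ring = unbind(ring, "q1")           # a single unbind clears both
assert ring == {0: "q2"}
```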
Thanks @SteamUpdate for the excellent reproduction steps in #3386 (comment). #5121 makes adding bindings idempotent and will therefore fix this issue. As @michaelklishin already mentioned, this comes with a slight breaking change for applications that previously relied on adding duplicate bindings (i.e. bindings with the same source exchange and the same destination queue but possibly a different routing key).
Hi,
When a single node out of a cluster is restarted, the consistent hash exchange adds existing bindings into the ring again.
Reproduction:
Output:
After this, cleaning up the bindings and recreating them breaks the exchange:
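A toy model of the reported behaviour, assuming (purely as an illustration, not the plugin's actual code) that recovery replays binding addition without checking whether the queue's buckets already exist:

```python
# Toy model of the bug: on node restart, binding recovery replays
# the addition for bindings that are already in the ring, so the
# replayed queue ends up owning extra buckets.

def naive_add(ring, queue, weight):
    """Non-idempotent addition: always appends `weight` new buckets."""
    start = max(ring, default=-1) + 1
    for i in range(weight):
        ring[start + i] = queue

ring = {}
naive_add(ring, "q1", 1)
naive_add(ring, "q2", 1)
naive_add(ring, "q1", 1)  # replayed during recovery after a restart
assert list(ring.values()).count("q1") == 2  # q1 now owns two buckets
```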