Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PubSub: Sharding broken with multiple workers or worker restarts #273

Closed
raphaelm opened this issue Sep 1, 2021 · 1 comment
Closed

Comments

@raphaelm
Copy link
Contributor

raphaelm commented Sep 1, 2021

The "classic" implementation selects shards like this:

    def consistent_hash(self, value):
        """
        Maps the value to a node value between 0 and 4095
        using CRC, then down to one of the ring nodes.
        """
        if isinstance(value, str):
            value = value.encode("utf8")
        bigval = binascii.crc32(value) & 0xFFF
        ring_divisor = 4096 / float(self.ring_size)
        return int(bigval / ring_divisor)

The PUBSUB implementation does this:

    def _get_shard(self, channel_or_group_name):
        """
        Return the shard that is used exclusively for this channel or group.
        """
        if len(self._shards) == 1:
            # Avoid the overhead of hashing and modulo when it is unnecessary.
            return self._shards[0]
        shard_index = abs(hash(channel_or_group_name)) % len(self._shards)
        return self._shards[shard_index]

I was surprised that there's a difference when I first looked at this, but now this started to really bite us in production after we introduced sharding: hash() is not guaranteed to return the same thing across multiple invocations of python, even on the same machine, unless PYTHONHASHSEED is set manually, which probably no-one ever does. See https://docs.python.org/3/using/cmdline.html#envvar-PYTHONHASHSEED

I think we should just use the same hash function across implementations. I'll send a PR :)

raphaelm added a commit to raphaelm/channels_redis that referenced this issue Sep 1, 2021
@raphaelm raphaelm changed the title PubSub: Sharding broken with multiple workers PubSub: Sharding broken with multiple workers or worker restarts Sep 1, 2021
@acu192
Copy link
Collaborator

acu192 commented Sep 2, 2021

Agreed. And yikes: I didn't know hash() was inconsistent across invocations of python.

I'll review your PR next.

@acu192 acu192 closed this as completed in f5eef16 Sep 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants