Tracking: PubSub performance and UX improvements #7156
Labels: C:events (Component: Events); stale (for use by stalebot); T:code-hygiene (General cleanup and restructuring of code to provide clarity, flexibility, and modularity.)
creachadair added the T:code-hygiene label on Oct 26, 2021
This was referenced Oct 26, 2021
creachadair pushed a commit that referenced this issue on Nov 1, 2021:
Updates #7156, and a follow-up to #7070. Event subscriptions in Tendermint currently use a fixed-length Go channel as a queue. When the channel fills up, the publisher immediately terminates the subscription. This prevents slow subscribers from creating memory pressure on the node by not servicing their queue fast enough. Replace the buffered channel used to deliver events to buffered subscribers with an explicit queue. The queue provides a soft quota and burst credit mechanism: Clients that usually keep up can survive occasional bursts, without allowing truly slow clients to hog resources indefinitely.
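The soft quota and burst credit idea described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the queue that was actually merged: the `Queue` type, its fields, `NewQueue`, `ErrQueueFull`, and the credit accrual formula are all assumptions made for the example.

```go
package main

import (
	"errors"
	"sync"
)

// ErrQueueFull is a hypothetical error used only in this sketch.
var ErrQueueFull = errors.New("queue is full")

// Queue is an illustrative FIFO with a soft quota and burst credit.
// All names and the accrual rule here are assumptions, not Tendermint's API.
type Queue struct {
	mu        sync.Mutex
	items     []string
	softQuota int     // length the queue may reach without spending credit
	hardLimit int     // absolute cap on queue length
	credit    float64 // credit currently available for bursts
	maxCredit float64 // cap on accumulated credit
}

// NewQueue returns a queue that starts with its full burst credit.
func NewQueue(softQuota, hardLimit int, burstCredit float64) *Queue {
	return &Queue{
		softQuota: softQuota,
		hardLimit: hardLimit,
		credit:    burstCredit,
		maxCredit: burstCredit,
	}
}

// Add appends item. Below the soft quota it always succeeds. At or above
// the soft quota each add spends one unit of credit. It fails when the
// credit is exhausted or the hard limit is reached.
func (q *Queue) Add(item string) error {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.items) >= q.hardLimit {
		return ErrQueueFull
	}
	if len(q.items) >= q.softQuota {
		if q.credit < 1 {
			return ErrQueueFull
		}
		q.credit--
	}
	q.items = append(q.items, item)
	return nil
}

// Remove pops the oldest item. A consumer that drains the queue below the
// soft quota earns credit back, more quickly the emptier the queue is.
func (q *Queue) Remove() (string, bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.items) == 0 {
		return "", false
	}
	item := q.items[0]
	q.items = q.items[1:]
	if len(q.items) < q.softQuota {
		q.credit += 1 - float64(len(q.items))/float64(q.softQuota)
		if q.credit > q.maxCredit {
			q.credit = q.maxCredit
		}
	}
	return item, true
}
```

In this sketch a publisher would terminate a subscription when `Add` returns `ErrQueueFull`. A client that normally keeps its queue short accumulates credit and can absorb an occasional burst, while a client that stays above the soft quota runs out of credit and is cut off.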
This was referenced Nov 2, 2021
creachadair pushed a commit that referenced this issue on Nov 5, 2021:
This is part of the work described by #7156. Remove "unbuffered subscriptions" from the pubsub service. Replace them with a dedicated blocking "observer" mechanism. Use the observer mechanism for indexing. Add a SubscribeWithArgs method and deprecate the old Subscribe method. Remove SubscribeUnbuffered entirely (breaking). Rework the Subscription interface to eliminate exposed channels. Subscriptions now use a context to manage lifecycle notifications. Internalize the eventbus package.
tychoish pushed a commit to tychoish/tendermint that referenced this issue on Nov 19, 2021:
…t#7231) This is part of the work described by tendermint#7156. Remove "unbuffered subscriptions" from the pubsub service. Replace them with a dedicated blocking "observer" mechanism. Use the observer mechanism for indexing. Add a SubscribeWithArgs method and deprecate the old Subscribe method. Remove SubscribeUnbuffered entirely (breaking). Rework the Subscription interface to eliminate exposed channels. Subscriptions now use a context to manage lifecycle notifications. Internalize the eventbus package.
Another issue to follow up on here: |
This was referenced Jan 23, 2022
☂️ This issue tracks a handful of improvements to the Tendermint pubsub library targeting the v0.36 release.
See also RFC 006 Event Subscription and ADR 075 RPC Event Subscription.
In Scope
The overall goal of this work is to address the following performance, API, and usability concerns:
The API supports "unbuffered" (blocking) subscriptions, which stall the entire publisher with no timeout until serviced. This is a special case to support event indexing, but it means that indexing can stall subscriber service, and vice versa, and that feedback can stall or slow consensus.
Ordinary ("buffered") subscriptions use a fixed-length Go channel as a queue, and if a client does not service its subscription fast enough (i.e., the buffer fills), the publisher will terminate the subscription. However, events do not arrive at an even pace, and a large bolus of events may overwhelm the channel in a very short period of time, even if a client is servicing its events optimally (see, for example, "Tendermint emits events over WebSocket faster than any clients can pull them if tx includes many events", #6729). Enqueues take nanoseconds or microseconds; network delivery takes milliseconds, even for fast local connections.
The publish/subscribe plumbing is very complicated, and tightly coupled with indexing. This is mainly a maintenance issue, but also adds overhead that interacts negatively with the stall-pushback on the rest of consensus.
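The buffered-subscription concern above can be illustrated with a small sketch. It assumes a publisher that does a non-blocking send on a fixed-length channel and terminates the subscription when the send would block. The `subscriber` type and `publish` function are hypothetical names for this example only.

```go
package main

// subscriber models a buffered subscription backed by a fixed-length
// channel (hypothetical type for this sketch).
type subscriber struct {
	out       chan string
	cancelled bool
}

// publish attempts a non-blocking send. If the buffer is already full,
// the subscription is terminated instead of stalling the publisher.
func publish(s *subscriber, ev string) {
	if s.cancelled {
		return
	}
	select {
	case s.out <- ev:
	default:
		s.cancelled = true
		close(s.out)
	}
}
```

With a buffer of length 2, three events published back to back terminate the subscription before the reader has any chance to run, regardless of how quickly it would otherwise drain the channel.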
Out of Scope
This issue does not address broader design questions. For example, as part of the pluggable indexing work (see "Tracking: Pluggable custom event indexing", #7135), it could make sense to offload indexing and event subscription from the node process entirely. Such questions should be addressed via the ADR process.
This issue also does not address changes to the RPC subscription interface. That topic is covered by "Tracking: Tendermint RPC improvements", #7157.
Related Changes