pubsub: pull message slow after update #2540

Closed
smit-aterlo opened this issue Jun 30, 2020 · 14 comments
Labels
  • api: pubsub (Issues related to the Pub/Sub API.)
  • priority: p2 (Moderately-important priority. Fix may not be included in next release.)
  • status: investigating (The issue is under investigation, which is determined to be non-trivial.)
  • type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)

Comments

@smit-aterlo commented Jun 30, 2020

Client

PubSub v1.4.0

Environment

GKE

Go Environment

$ go version
go version go1.14.4 linux/amd64

Code

I used the following configuration when I was on v1.2.0 and it read all the messages with high throughput, but with the same configuration and Pub/Sub v1.4.0 (yes, I upgraded recently) the read throughput is really slow.

subscription.ReceiveSettings.MaxOutstandingMessages: 10
subscription.ReceiveSettings.NumGoroutines: 1
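
(For reference, a minimal sketch of how these settings plug into subscription.Receive; the project and subscription IDs below are placeholders:)

```go
package main

import (
	"context"
	"log"

	"cloud.google.com/go/pubsub"
)

func main() {
	ctx := context.Background()

	// "my-project" and "my-subscription" are placeholders.
	client, err := pubsub.NewClient(ctx, "my-project")
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	sub := client.Subscription("my-subscription")
	// The settings from this report.
	sub.ReceiveSettings.MaxOutstandingMessages = 10
	sub.ReceiveSettings.NumGoroutines = 1

	// Receive blocks, invoking the callback for each message.
	err = sub.Receive(ctx, func(ctx context.Context, m *pubsub.Message) {
		// ... handle m.Data ...
		m.Ack()
	})
	if err != nil {
		log.Fatal(err)
	}
}
```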

Expected behavior

Messages are read at a high rate, which leads to high throughput.

Actual behavior

Messages are read very slowly, and some of the subscriptions are not pulling any messages at all.

@smit-aterlo smit-aterlo added the triage me I really want to be triaged. label Jun 30, 2020
@product-auto-label product-auto-label bot added the api: pubsub Issues related to the Pub/Sub API. label Jun 30, 2020
@hongalex hongalex added status: investigating The issue is under investigation, which is determined to be non-trivial. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. and removed triage me I really want to be triaged. labels Jun 30, 2020
@hongalex
Member

Hi, thanks for taking the time to file an issue.

What's your publish rate, and how long does it take for you to handle an individual message?

Also, there's been a change recently where Pub/Sub version is no longer tied to the general cloud.google.com/go package version. The latest version of cloud.google.com/go/pubsub is actually v1.4.0. Can you check to see if you've pulled that in?

@smit-aterlo
Author

I am really sorry for the confusion. Yes, I am using pubsub v1.4.0. My cloud.google.com/go version was v0.53.0 and is now v0.59.0.

The publish rate is around 150k-200k messages per second. We run 3 processes, each with 4 CPUs, but none of the processes is using its full CPU or memory. Handling is good at the start, but after a while it processes only around half of the messages.

@hongalex
Member

Can you try increasing MaxOutstandingMessages? The default is 1000, so 10 is sort of low. MaxOutstandingMessages controls how many messages can be pulled in by the client at once, as well as the number of callback functions spawned to handle messages.
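
(For illustration, a minimal sketch of raising the cap; "sub" is the subscription from the report above, and 1000 simply matches the default quoted here:)

```go
// Sketch: raise the cap back toward the library default before calling
// sub.Receive. With the value 10 from the report, at most 10 messages
// (and their callbacks) can be in flight at a time.
sub.ReceiveSettings.MaxOutstandingMessages = 1000
```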

@yoshi-automation yoshi-automation added the triage me I really want to be triaged. label Jul 1, 2020
@jesushernandez commented Jul 1, 2020

@hongalex We've recently had a similar situation, where an update to the latest pubsub version (v1.4.0, although we saw the same in v1.3.1) caused our subscribers to stop pulling messages. We've experienced this in services which have a reasonably high number of subscribers (> 70). In these cases, subscription.Receive (in async mode) would block forever without receiving any messages.

We then tried manually pulling messages (via a standalone SubscriberClient), making the gRPC calls ourselves. In this case, we always received only a fraction of the messages we were asking for.

Eventually, we found that increasing the number of connections available to the SubscriberClient (a bigger connection pool) makes the issue go away (probably meaning the connection was saturated). Which brings us to this: why does the library leave the SubscriberClient with a single connection while the PublisherClient can have up to 3 (assuming numConns=4)? Is there any reason behind this?
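
(For reference, one common way to get a bigger pool is the option.WithGRPCConnectionPool client option; this is a minimal sketch, and the pool size and project ID are placeholders, not values from this thread:)

```go
package main

import (
	"context"
	"log"

	"cloud.google.com/go/pubsub"
	"google.golang.org/api/option"
)

func main() {
	ctx := context.Background()
	// Request a larger gRPC connection pool than the default when
	// constructing the client.
	client, err := pubsub.NewClient(ctx, "my-project",
		option.WithGRPCConnectionPool(8))
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()
}
```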

The way we've worked around this is by instantiating two clients (a rough sketch follows at the end of this comment):

  • a pubsub.Client with pubsub.NewClient (which we use to do things like managing topics, subscriptions and publishing)
  • a pubsub.SubscriberClient with pubsub.NewSubscriberClient (which we exclusively use to call Pull to get new messages on a loop, building the request message ourselves)

Would you consider this a typical use case, or would you expect most people to stick with just the pubsub.Client for all operations? If so, why does the subscriber get only one of all the connections?
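
(A rough, self-contained sketch of that kind of manual pull loop, not their exact code; the subscription name, batch size, and error handling are illustrative, and the pubsubpb import path is the genproto package in use at the time:)

```go
package main

import (
	"context"
	"log"

	vkit "cloud.google.com/go/pubsub/apiv1"
	pubsubpb "google.golang.org/genproto/googleapis/pubsub/v1"
)

func main() {
	ctx := context.Background()

	// Low-level subscriber client used only for Pull/Acknowledge.
	subClient, err := vkit.NewSubscriberClient(ctx)
	if err != nil {
		log.Fatal(err)
	}
	defer subClient.Close()

	subName := "projects/my-project/subscriptions/my-subscription"

	for {
		// Pull a batch of up to 100 messages, building the request ourselves.
		resp, err := subClient.Pull(ctx, &pubsubpb.PullRequest{
			Subscription: subName,
			MaxMessages:  100,
		})
		if err != nil {
			log.Fatal(err)
		}

		var ackIDs []string
		for _, rm := range resp.ReceivedMessages {
			// ... handle rm.Message.Data ...
			ackIDs = append(ackIDs, rm.AckId)
		}

		// Acknowledge the whole batch once it has been handled.
		if len(ackIDs) > 0 {
			err := subClient.Acknowledge(ctx, &pubsubpb.AcknowledgeRequest{
				Subscription: subName,
				AckIds:       ackIDs,
			})
			if err != nil {
				log.Fatal(err)
			}
		}
	}
}
```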

@smit-aterlo
Author

@jesushernandez Yes, we have around 200+ subscriptions, and I can see that it is not pulling from some of them at all.

@smit-aterlo
Author

@jesushernandez can you please tell me whether you needed to raise this config after that numConnections change?

subscription.ReceiveSettings.MaxOutstandingMessages: 10
subscription.ReceiveSettings.NumGoroutines: 1

@jesushernandez

@smit-aterlo We are not using the subscription.Receive API anymore. Instead, we're using the gRPC endpoint to pull messages. Here's our code: https://github.com/lileio/pubsub/blob/master/providers/google/google.go

It used to be much simpler with streaming pull via subscription.Receive, but we were having the issues discussed above with it.

@smit-aterlo
Author

OK thank you very much for the help @jesushernandez

@jesushernandez

No problem. Use that code only as inspiration, since it is still under active development and still lacks proper testing.

In any case, this is an interim solution. I'm still waiting to hear whether there's anything else we could have tried, or whether we're doing something wrong.

@smit-aterlo
Author

Yes, that is true. Meanwhile, @hongalex, if you have any solution that works with the subscription.Receive API, please let us know.

@yoshi-automation yoshi-automation added the 🚨 This issue needs some love. label Jul 5, 2020
@codyoss codyoss removed the triage me I really want to be triaged. label Jul 6, 2020
@yoshi-automation yoshi-automation added the triage me I really want to be triaged. label Jul 6, 2020
@codyoss codyoss added priority: p2 Moderately-important priority. Fix may not be included in next release. and removed 🚨 This issue needs some love. triage me I really want to be triaged. labels Jul 6, 2020
@iagomelanias

Related #2593

@hongalex
Member

Hiya, apologies for the delay. Can y'all try updating to cloud.google.com/go/pubsub v1.6.0 to see if this fixes your issue?

@iagomelanias

I can confirm our subscriptions are now pulling messages much faster and no messages are stuck. Thank you, @hongalex!

@smit-aterlo
Author

Yes, it resolved the issue we were having. Thank you for the help!
