[Broker] Increase default numHttpServerThreads value to 50 to prevent Admin API unavailability #14320

lhotari · 2022-02-16T13:03:08Z

Motivation

Since Pulsar Admin API uses the blocking servlet API, it is possible that all Jetty threads are occupied and this causes unavailability on the Pulsar Admin API. The default value for the maximum number of threads for Jetty is too low in Pulsar. That is the root cause of many problems where Pulsar Admin API is unavailable when all threads are in use.
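The failure mode described above can be reproduced in miniature with plain JDK classes. The sketch below is only an illustration under stated assumptions: a fixed `ExecutorService` stands in for the web server's request threads, and `StarvationSketch` and `probeServed` are hypothetical names that exist in neither Pulsar nor Jetty. When every thread is parked in a blocking wait, even a trivial request is not served until a thread frees up.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical stand-in: a fixed JDK pool plays the role of the web server's
// request threads. None of these names exist in Pulsar or Jetty.
final class StarvationSketch {

    /** Returns true if a trivial probe request is served while the handlers are blocked. */
    static boolean probeServed(int maxThreads, int blockedHandlers) {
        ExecutorService webThreads = Executors.newFixedThreadPool(maxThreads);
        CountDownLatch release = new CountDownLatch(1);
        // Only maxThreads handlers can run at once; any others wait in the queue.
        CountDownLatch started = new CountDownLatch(Math.min(maxThreads, blockedHandlers));
        try {
            for (int i = 0; i < blockedHandlers; i++) {
                webThreads.submit(() -> {
                    started.countDown();
                    release.await(); // blocks the thread, like future.get() in a sync handler
                    return null;
                });
            }
            started.await();
            Future<String> probe = webThreads.submit(() -> "ok");
            try {
                probe.get(500, TimeUnit.MILLISECONDS);
                return true;
            } catch (TimeoutException e) {
                return false; // every thread is occupied, so the probe never ran
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            release.countDown();
            webThreads.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("2 threads, 2 blocked handlers: " + probeServed(2, 2));
        System.out.println("3 threads, 2 blocked handlers: " + probeServed(3, 2));
    }
}
```

With two threads and two blocked handlers the probe times out; one spare thread is enough for it to be served.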

Additional context

Mailing list thread about "make async" changes: https://lists.apache.org/thread/tn7rt59cd1k724l4ytfcmzx1w2sbtw7l

Related issues/PRs:
Avoid call sync method in async rest API for delete subscription #13666
Broker's web threads get stuck #4756
Converted bundle split into an HTTP async operation #10619
PIP-142

Modification

Jetty defaults to 200 maximum threads, to prevent thread pool starvation. Make Pulsar use the value of 50 maximum threads by setting numHttpServerThreads=50.
Update the documentation for numHttpServerThreads

…t Broker Admin API unavailability
- numHttpServerThreads is the maximum number of threads. Initial number of threads is 8 when numHttpServerThreads >= 8
- Jetty defaults to 200 maximum threads, to prevent thread pool starvation.
- this is the reason to use the value 200 as a default

eolivelli

Good catch

@gaoran10 @michaeljmarshall please consider including this fix in your upcoming RCs

nicoloboschi

LGTM

codelipenghui

Changing the default configuration needs to start with a proposal.

hangc0276

What's the reason for setting the default value to 200? If the node has just one core, what will happen? Please send an email to the dev mailing list to discuss.

lhotari · 2022-02-16T14:06:58Z

What's the reason for setting the default value to 200? If the node has just one core, what will happen? Please send an email to the dev mailing list to discuss.

These are threads. Jetty defaults to 200 maximum threads, to prevent thread pool starvation. This is recommended when using blocking Servlet API. The problem is that Pulsar uses the blocking servlet API and doesn't have a sufficient amount of threads which are needed and recommended.

The value 200 doesn't mean that there will be 200 threads to start with. This is the maximum size for the thread pool. When the value is more than 8, Jetty will start with 8 initial threads and add more threads to the pool when all threads are occupied.
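The difference between the initial and the maximum pool size can be illustrated with the JDK alone. This is a rough sketch, not Jetty's actual thread pool: a `ThreadPoolExecutor` with a `SynchronousQueue` is swapped in because it also starts with a small number of threads and adds more only when none is idle. `GrowOnDemandSketch` and `poolSizes` are hypothetical names, and the values 8, 200 and 20 are taken from the discussion or chosen for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// JDK-only stand-in for a pool with a minimum and a maximum size.
// This is NOT Jetty's implementation; it only mimics the grow-on-demand idea.
final class GrowOnDemandSketch {

    /** Returns {threads right after startup, threads while busyTasks tasks are blocked}. */
    static int[] poolSizes(int minThreads, int maxThreads, int busyTasks) {
        // A SynchronousQueue makes the pool add a thread whenever no idle thread is waiting.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                minThreads, maxThreads, 60, TimeUnit.SECONDS, new SynchronousQueue<>());
        CountDownLatch started = new CountDownLatch(busyTasks);
        CountDownLatch release = new CountDownLatch(1);
        try {
            pool.prestartAllCoreThreads();
            int atStartup = pool.getPoolSize();
            for (int i = 0; i < busyTasks; i++) { // busyTasks must not exceed maxThreads
                pool.execute(() -> {
                    started.countDown();
                    try {
                        release.await(); // simulate a blocking request handler
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            started.await();
            return new int[] {atStartup, pool.getPoolSize()};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown();
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        int[] sizes = poolSizes(8, 200, 20);
        System.out.println("threads at startup: " + sizes[0]);
        System.out.println("threads with 20 blocked tasks: " + sizes[1]);
    }
}
```

The pool starts at the minimum and only grows as far as the load requires, so a maximum of 200 does not imply 200 live threads.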

I have already started an email discussion to discuss this topic. Please reply to https://lists.apache.org/thread/byg1g081o6mfj0xn8ntryvb5qplmrjyl .

There is useful background information in https://lists.apache.org/thread/hso8qwsv40ccrk116fj5ggdpt3b9d4g4 . I wrote that reply before I noticed Penghui's response. It contains a link to Jetty's documentation about asynchronous servlets: https://wiki.eclipse.org/Jetty/Feature/Continuations#Why_Asynchronous_Servlets_.3F .

Jason918 · 2022-02-16T14:11:58Z

These are threads. Jetty defaults to 200 maximum threads,

@lhotari If this is the Jetty default, can we just leave it blank?

lhotari · 2022-02-16T14:13:19Z

Changing the default configuration needs to start with a proposal.

This PR is a proposal. I have also made this proposal on the dev mailing list in the discussion. https://lists.apache.org/thread/byg1g081o6mfj0xn8ntryvb5qplmrjyl . What else is needed?

In this case, the previous default for numHttpServerThreads is simply too small and invalid when blocking servlet API is used.
The value 200 doesn't mean that there will be 200 threads to start with. This is the maximum size for the thread pool. When the value is more than 8, Jetty will start with 8 initial threads and add more threads to the pool when all threads are occupied.

There is no breaking change in increasing the default value to 200. It's just an improvement and fixes "the problem" where Admin API goes unresponsive when all threads are occupied.

We might end up setting the default value to something lower than 200. A value like 50 or 100 might be fine. I just think that 200 is a good default since Jetty also uses that as the default value.

The main overhead of a thread is the amount of memory that the stack of each thread consumes. It's 1MB by default. 200 threads will consume 200MB of RSS memory in the thread stacks.
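The arithmetic behind that estimate is simple enough to write down. The sketch below assumes the 1 MB per-thread stack size mentioned above (the real value depends on the JVM and its `-Xss` setting) and treats the result as an upper bound on reserved stack space; how much of it becomes resident depends on the OS and on how deep each stack actually grows. `StackBudgetSketch` and `worstCaseStackBytes` are hypothetical names.

```java
// Back-of-the-envelope helper; the names are illustrative only.
final class StackBudgetSketch {
    static final long ONE_MIB = 1024L * 1024L;

    /** Upper bound on the stack space reserved by maxThreads threads. */
    static long worstCaseStackBytes(int maxThreads, long stackBytesPerThread) {
        return Math.multiplyExact((long) maxThreads, stackBytesPerThread);
    }

    public static void main(String[] args) {
        // Candidate values for the maximum thread count discussed in this PR.
        for (int maxThreads : new int[] {8, 50, 200}) {
            long mib = worstCaseStackBytes(maxThreads, ONE_MIB) / ONE_MIB;
            System.out.println(maxThreads + " threads -> up to " + mib + " MiB of thread stacks");
        }
    }
}
```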

lhotari · 2022-02-16T14:16:46Z

@lhotari If this is the Jetty default, can we just leave it blank?

@Jason918 no. Pulsar overrides the default with the value set in numHttpServerThreads in the configuration.

Pulsar code locations:

pulsar/pulsar-broker/src/main/java/org/apache/pulsar/broker/web/WebService.java

Lines 79 to 81 in b540523

    
    this.webServiceExecutor = new WebExecutorThreadPool(
            pulsar.getConfiguration().getNumHttpServerThreads(),
            "pulsar-web");

pulsar/pulsar-broker-common/src/main/java/org/apache/pulsar/broker/web/WebExecutorThreadPool.java

Lines 33 to 36 in adcbe0f

    
    public WebExecutorThreadPool(int maxThreads, String namePrefix) {
        super(maxThreads);
        this.threadFactory = new DefaultThreadFactory(namePrefix);
    }

RobertIndie

This doesn't seem to solve the root cause of the issue.

This PR is a proposal. I have also made this proposal on the dev mailing list in the discussion. https://lists.apache.org/thread/byg1g081o6mfj0xn8ntryvb5qplmrjyl . What else is needed?

I think we need a PIP to get this change approved.

hangc0276 · 2022-02-16T14:34:01Z

These are threads. Jetty defaults to 200 maximum threads

@lhotari Do we need to take the number of availableProcessors into consideration for the maximum threads of the thread pool?

lhotari · 2022-02-16T14:40:27Z

This doesn't seem to solve the root cause of the issue.

Please tell me what is "the issue" that you are referring to?

This PR is a proposal. I have also made this proposal on the dev mailing list in the discussion. https://lists.apache.org/thread/byg1g081o6mfj0xn8ntryvb5qplmrjyl . What else is needed?

I think we need a PIP to get this change approved.

The Pulsar community makes major decisions on the dev mailing list according to the Apache Way. The mailing list is the place to decide whether this change needs a PIP or not. Please participate in the existing dev mailing list discussion: https://lists.apache.org/thread/byg1g081o6mfj0xn8ntryvb5qplmrjyl

shoothzj · 2022-02-16T14:45:07Z

@lhotari @eolivelli I suggest that we consider the number of system CPU cores. This may be a harmful change for people who run Pulsar on a low-end machine, for example one with a single CPU core.

codelipenghui · 2022-02-17T01:33:59Z

@codelipenghui #14329 has been published. Please follow up on the mailing list.

Sure

BewareMyPower · 2022-02-17T03:47:02Z

We shouldn't add some temporary PRs during the release phase unless they are really important.

@BewareMyPower This PR contains a very important change. The recommended maximum thread count for Jetty thread pool is 50-500. Pulsar currently uses an invalid value. The sync->async changes in the Pulsar Admin API are not needed when a proper value is used.

I think I need to explain more about the term "important" that I used. IMO, a PR that could block a release during the release phase must match the following rules:

It must be a bug fix.
The bug was introduced from the current release, i.e. it's a regression.
There is no workaround.

It's only my opinion. I think our release document for release manager missed something like this.

Let's look back at this PR. First, I don't think a change to the default configuration value can be treated as a bug fix. It's more like an enhancement, because all previous stable releases should have the same problem. Second, it follows that this is not a regression. Third, it's not something serious like the Log4j2 vulnerability (CVE-2021-44228). It only makes the Admin API fail in certain cases, and that can be fixed by configuration tuning.

In short, IMO, after a release started, we must be very careful and strict on the new PRs.

nodece · 2022-02-17T08:59:07Z

The unavailability of the Admin API is not caused by the HTTP server thread, the root cause is that the ZK callback thread is blocked.

When an admin API calls the ZK metadata store API, it gets the ZK data through a CompletableFuture. Note that we do not use an executor to run CompletableFuture#complete() in ZKMetadataStore.java#L171, so dependent callbacks run on the ZK callback thread. Once a caller on that thread converts an async call to a sync call, the ZK callback thread is blocked. The code looks like this: metadata.getAsync().get(30, TimeUnit.SECONDS).

How to solve this problem?

Use an executor to run the callback that passes data from the ZK callback to Pulsar.
Don't convert async calls to sync calls; this is why there are PRs that convert sync calls to async calls.
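The first option can be sketched with plain `CompletableFuture` code. This is a minimal illustration, not the ZKMetadataStore implementation: the thread names `zk-callback` and `metadata-worker`, the class `CallbackOffloadSketch` and the method `dependentThreadName` are all made up for the example. It relies on the documented `CompletableFuture` behavior that a non-async dependent stage registered before completion runs on the thread that completes the future.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: shows which thread runs the code chained onto a future,
// depending on whether completion is handed off to a separate executor.
final class CallbackOffloadSketch {

    /** Returns the name of the thread that ran the dependent stage. */
    static String dependentThreadName(boolean offload) {
        ExecutorService callbackThread =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "zk-callback"));
        ExecutorService worker =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "metadata-worker"));
        try {
            CompletableFuture<String> result = new CompletableFuture<>();
            // Registered before completion, so it runs on the completing thread.
            CompletableFuture<String> ranOn =
                    result.thenApply(v -> Thread.currentThread().getName());
            callbackThread.execute(() -> {
                if (offload) {
                    // Hand completion to another executor: dependents leave the callback thread.
                    worker.execute(() -> result.complete("data"));
                } else {
                    // Complete inline: dependents (including blocking ones) run right here.
                    result.complete("data");
                }
            });
            return ranOn.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            callbackThread.shutdown();
            worker.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("inline completion runs dependents on: " + dependentThreadName(false));
        System.out.println("offloaded completion runs dependents on: " + dependentThreadName(true));
    }
}
```

When completion is offloaded, a dependent that blocks can only stall the worker, and the simulated callback thread stays free to deliver further results.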

How to reproduce the blocked ZK callback thread:

docker run -d -p 2181:2181 --name test-zookeeper zookeeper

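import com.github.benmanes.caffeine.cache.AsyncLoadingCache;
import com.github.benmanes.caffeine.cache.Caffeine;
import com.google.common.collect.Lists;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import org.apache.zookeeper.Op;
import org.apache.zookeeper.ZooKeeper;

// Requires the ZooKeeper client, Caffeine and Guava on the classpath.
public class Main {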
    private static final long CACHE_REFRESH_TIME_MILLIS = TimeUnit.MINUTES.toMillis(5);

    public static void printThread(String name) {
        System.out.println(name + " thread name -> " + Thread.currentThread().getName());
    }

    public static void main(String[] args) throws Exception {
        ZooKeeper zkc = new ZooKeeper("localhost:2181", 60_000, null);

        System.out.println("Check the zk connect");
        CountDownLatch zkLatch = new CountDownLatch(1);
        new Thread(() -> {
            while (true) {
                if (zkc.getState().isConnected()) {
                    zkLatch.countDown();
                    break;
                }
            }
        }).start();
        if (!zkLatch.await(5, TimeUnit.SECONDS)) {
            throw new Exception("zk connect failed");
        }

        AsyncLoadingCache<String, byte[]> objCache = Caffeine.newBuilder()
                .refreshAfterWrite(CACHE_REFRESH_TIME_MILLIS, TimeUnit.MILLISECONDS)
                .buildAsync((key, executor) -> {
                    CompletableFuture<byte[]> future = new CompletableFuture<>();
                    zkc.multi(Lists.newArrayList(Op.getData("/")), (rc, path, ctx, opResults) -> {
                        printThread("zk callback");
                        future.complete(null);
                    }, null);
                    return future;
                });

        CountDownLatch countDownLatch = new CountDownLatch(1);

        // Reproduce the ZK callback is blocked
        System.out.println("async get start");
        objCache.get("/").whenComplete((unused, ignored) -> {
            printThread("async get done");
            try {
                System.out.println("zk thread will blocked after sync get");
                System.out.println("sync get start");
                objCache.get("/1").get(5, TimeUnit.SECONDS);
                // Unreachable
                printThread("sync get done");
                countDownLatch.countDown();
            } catch (Exception e) {
                e.printStackTrace();
            } finally {
                countDownLatch.countDown();
            }
        });

        countDownLatch.await();
    }
}

lhotari · 2022-02-17T09:40:33Z

When an admin API calls the ZK metadata store API, it gets the ZK data through a CompletableFuture. Note that we do not use an executor to run CompletableFuture#complete() in ZKMetadataStore.java#L171, so dependent callbacks run on the ZK callback thread. Once a caller on that thread converts an async call to a sync call, the ZK callback thread is blocked. The code looks like this: metadata.getAsync().get(30, TimeUnit.SECONDS).

The blocked thread in #13666 is a HTTP server thread.

"pulsar-web-40-28" #238 prio=5 os_prio=0 tid=0x00007f5a4000d800 nid=0x2bcf waiting on condition [0x00007f5961d3b000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x00000005c5529d40> (a java.util.concurrent.CompletableFuture$Signaller)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1693)
	at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
	at java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1729)
	at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
	at org.apache.pulsar.broker.admin.impl.PersistentTopicsBase.internalDeleteSubscriptionForNonPartitionedTopic(PersistentTopicsBase.java:1498)

I'll clarify what I have been referring to as "sync -> async" changes: changes where the use of the blocking Servlet API is migrated to use Asynchronous Servlet API.
I understand that it's necessary to not block in Zookeeper callbacks, but that is a different problem, which isn't related to Servlet API change.

lhotari · 2022-02-17T10:03:07Z

@BewareMyPower This PR contains a very important change. The recommended maximum thread count for Jetty thread pool is 50-500. Pulsar currently uses an invalid value.

@lhotari This is not a new thing and it's something that can easily be changed by users. I don't think we need to rush into a release.

I agree. Have I been rushing this? Instead, you can say that a lot of PRs with "make async" have been pushed and merged recently.
This has introduced several known regressions that have been fixed. We don't know which are regressions that just haven't been found yet.

My valid questions were never answered by the contributors of the "make async" changes.

I'm expecting that there are issues or a PIP which is referred to.
@merlimat WDYT?

The sync->async changes in the Pulsar Admin API are not needed when a proper value is used.

The problem is not just keeping threads busy but the cases in which you have calls that fan out to multiple brokers. In these cases, the HTTP call can come back to the same broker and we can have a deadlock, no matter how many threads we have in the pool.

I'll clarify what I have been referring to as "sync -> async" changes: changes where the use of the blocking Servlet API is migrated to the Asynchronous Servlet API. That won't solve any problems on its own. Any problems that it might solve would also be solved by configuring Jetty as it is recommended to be configured when blocking calls are involved.
The recommended maximum thread pool size is 50 to 500 for Jetty. I have been going through some details and there are multiple other things that are not properly configured. I'll be following up with separate PRs. (UPDATE: the draft PR which fixes backpressure handling is #14353).

I'd assume that deadlocks that occur when the thread pool size is properly configured are caused by locks. I'd like to see an example of a deadlock which couldn't be resolved while continuing to use the blocking servlet API. I'm not against the changes from the blocking API to the async API, but I think changes need proper justification, especially when the "make async" changes have been initiated without referring to any reported issues or a PIP.
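The re-entrant scenario quoted above, where a handler blocks on a second request that lands on the same pool, can be modeled with the JDK alone. The sketch below does not settle the question debated here and is not Pulsar code: `ReentrantCallSketch` and `allComplete` are hypothetical names, and a fixed `ExecutorService` stands in for one broker's web threads. It only shows that, in this toy model, the nested requests stall when the number of concurrent outer requests equals the pool size and complete when a thread is left over.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Toy model: each outer request blocks on one nested request served by the same pool.
final class ReentrantCallSketch {

    /** Returns true if all outer requests finish; concurrentRequests must be <= maxThreads. */
    static boolean allComplete(int maxThreads, int concurrentRequests) {
        ExecutorService webThreads = Executors.newFixedThreadPool(maxThreads);
        CountDownLatch allStarted = new CountDownLatch(concurrentRequests);
        Callable<String> outer = () -> {
            allStarted.countDown();
            allStarted.await(); // make sure every outer request holds a thread first
            Future<String> nested = webThreads.submit(() -> "nested-ok");
            return nested.get(); // blocking wait on a request served by the same pool
        };
        try {
            List<Future<String>> outers = new ArrayList<>();
            for (int i = 0; i < concurrentRequests; i++) {
                outers.add(webThreads.submit(outer));
            }
            for (Future<String> f : outers) {
                f.get(1, TimeUnit.SECONDS);
            }
            return true;
        } catch (TimeoutException e) {
            return false; // no thread was left to run the nested requests
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            webThreads.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("2 threads, 2 concurrent requests: " + allComplete(2, 2));
        System.out.println("3 threads, 2 concurrent requests: " + allComplete(3, 2));
    }
}
```

Any fixed pool size can be exhausted by enough concurrent re-entrant requests, while a larger pool raises the load at which that happens.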

lhotari · 2022-02-17T10:10:45Z

I think I need to explain more about the term "important" that I used. IMO, a PR that could block a release during the release phase must match the following rules:

I haven't requested to block the release.

Let's look back to this PR. First, I don't think a change to the default configuration value can be treated as a bug fix. It's more like an enhancement. Because the previous stable releases all should have the same problem. Then, we can see it's not a

The Jetty documentation recommends values 50-500 for the maximum thread pool size. That is a fact, so there cannot be different opinions on this. Since the default configuration value doesn't fall in the recommended value range, my opinion is that this is a bug. For some people a bug is a feature. :) Does it really matter whether we call this a bug or an improvement?
The fact is that the current value doesn't fall in the recommended value range.

BewareMyPower · 2022-02-17T10:32:57Z

Instead, you can say that a lot of PRs with "make async" have been pushed and merged recently.

Yeah, I noticed these PRs recently as well. But are these PRs blockers for the 2.10.0 release? IMO, they should not be blockers either. I thought they were intended to be included in Pulsar 2.11.0.

I haven't requested to block the release.

Sorry, I might have missed some context. I just saw this PR in the 2.10.0 release email list. This PR should focus on the fix itself, but the previous discussion may have drifted into the release issue.

My valid questions were never answered by the contributors of the "make async" changes.

It's a pity to see the lack of communication. AFAIK, @Technoboy- is also preparing for a PIP to make admin APIs async. I think you should have a discussion about:

Does this PR solve the root problem?
Based on this PR, is making admin APIs async meaningful?

BewareMyPower · 2022-02-17T11:08:39Z

I'll continue the discussion in the PIP-142 discussion email.

lhotari · 2022-02-17T11:48:40Z

My valid questions were never answered by the contributors of the "make async" changes.

It's a pity to see the lack of communication. AFAIK, @Technoboy- is also preparing for a PIP to make admin APIs async. I think you should have a discussion about:

Does this PR solve the root problem?

Great point @BewareMyPower . I hope that the problem would first be discussed or reported before a PIP is created. @Technoboy- Would you be able to start some discussion even before the PIP is ready?

Based on this PR, is making admin APIs async meaningful?

That's also a valid question to ask. When we work together, we can learn together.

eolivelli · 2022-02-17T16:00:14Z

This is not something that can block release.
The value is already configurable.
I believe that there is no hurry in committing this change, and we can discuss a new value or decide that the default should not be changed.

nodece · 2022-02-17T16:08:17Z

When an admin API calls the ZK metadata store API, it gets the ZK data through a CompletableFuture. Note that we do not use an executor to run CompletableFuture#complete() in ZKMetadataStore.java#L171, so dependent callbacks run on the ZK callback thread. Once a caller on that thread converts an async call to a sync call, the ZK callback thread is blocked. The code looks like this: metadata.getAsync().get(30, TimeUnit.SECONDS).

The blocked thread in #13666 is a HTTP server thread.
"pulsar-web-40-28" #238 prio=5 os_prio=0 tid=0x00007f5a4000d800 nid=0x2bcf waiting on condition [0x00007f5961d3b000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x00000005c5529d40> (a java.util.concurrent.CompletableFuture$Signaller)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1693)
	at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
	at java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1729)
	at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
	at org.apache.pulsar.broker.admin.impl.PersistentTopicsBase.internalDeleteSubscriptionForNonPartitionedTopic(PersistentTopicsBase.java:1498)
I'll clarify what I have been referring to as "sync -> async" changes: changes where the use of the blocking Servlet API is migrated to use Asynchronous Servlet API. I understand that it's necessary to not block in Zookeeper callbacks, but that is a different problem, which isn't related to Servlet API change.

When the ZK callback thread is blocked, any other admin API request that goes to the ZK metadata store cannot complete, so its web thread shows this stack.

github-actions · 2022-03-20T01:57:55Z

The pr had no activity for 30 days, mark with Stale label.

github-actions · 2022-05-28T02:10:46Z

The pr had no activity for 30 days, mark with Stale label.

dave2wave · 2023-07-16T19:25:45Z

@lhotari - should this old debate on a PR be closed as it was hopefully resolved?

lhotari · 2024-02-12T06:52:43Z

The Pulsar Admin client doesn't have a limit of how many connections it opens to a single broker. There is issue #22041 for addressing that.

dao-jun · 2024-02-18T17:08:38Z

I support increasing the value of numHttpServerThreads, but it seems we will create a FixedThreadPoolExecutor with 200 threads?

By the way, are we still working on this PR? It has been blocked for 2 years.

lhotari added the type/bug, area/broker and doc-not-needed labels Feb 16, 2022

lhotari requested review from merlimat, srkukarni, eolivelli, codelipenghui and michaeljmarshall February 16, 2022 13:03

lhotari self-assigned this Feb 16, 2022

lhotari mentioned this pull request Feb 16, 2022

Make PersistentTopicsBase some methods async #14013

Closed

19 tasks

lhotari requested review from BewareMyPower, shoothzj, Technoboy- and congbobo184 February 16, 2022 13:12

eolivelli previously approved these changes Feb 16, 2022

lhotari requested a review from gaoran10 February 16, 2022 13:13

nicoloboschi previously approved these changes Feb 16, 2022

codelipenghui requested changes Feb 16, 2022

hangc0276 requested changes Feb 16, 2022

lhotari requested review from massakam and nkurihar February 16, 2022 14:17

RobertIndie requested changes Feb 16, 2022

[Proxy] Set httpNumThreads=200 for proxy.conf

ee8e67f

lhotari dismissed stale reviews from nicoloboschi and eolivelli via ee8e67f February 17, 2022 10:29

lhotari mentioned this pull request Feb 17, 2022

[Broker][Proxy][Function worker] Fix backpressure handling in Jetty web server configuration #14353

Merged

github-actions bot added the lifecycle/stale label Mar 20, 2022

lhotari changed the title ~~[Broker] Increase default numHttpServerThreads value to 200 to prevent Admin API unavailability~~ [Broker] Increase default numHttpServerThreads value to 50 to prevent Admin API unavailability Apr 8, 2022

github-actions bot added the Stale label May 28, 2022

Technoboy- added this to the 3.2.0 milestone Jul 31, 2023

Technoboy- modified the milestones: 3.2.0, 3.3.0 Dec 22, 2023

coderzc modified the milestones: 3.3.0, 3.4.0 May 8, 2024

lhotari modified the milestones: 4.0.0, 4.1.0 Oct 14, 2024
