
[improvement][client-java] Avoid too large memory preallocation for batch message. #15033

Merged: 10 commits merged into apache:master on Sep 19, 2022

Conversation

@tjiuming (Contributor) commented Apr 6, 2022

Fixes (#14943)

Motivation

Allocate a small buffer for the batch message first, and let it grow (with no memory resizes or copies).
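As a rough illustration only (a hedged sketch using plain Netty types, not the exact BatchMessageContainerImpl change), the difference is roughly:

```java
import io.netty.buffer.ByteBuf;
import io.netty.buffer.ByteBufAllocator;
import io.netty.buffer.CompositeByteBuf;

class LazyBatchBufferSketch {
    // Before: one large up-front allocation sized for the worst-case batch.
    ByteBuf preallocated(int maxBatchSize) {
        return ByteBufAllocator.DEFAULT.buffer(maxBatchSize);
    }

    // After: start empty and claim memory lazily, one component per message payload.
    CompositeByteBuf lazy() {
        return ByteBufAllocator.DEFAULT.compositeBuffer();
    }

    void append(CompositeByteBuf batch, ByteBuf payload) {
        // increaseWriterIndex = true appends the payload as a new component without copying it.
        batch.addComponent(true, payload);
    }
}
```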

Does this pull request potentially affect one of the following parts:

If yes was chosen, please highlight the changes

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API: (no)
  • The schema: (no)
  • The default values of configurations: (no)
  • The wire protocol: (no)
  • The rest endpoints: (no)
  • The admin cli options: (no)
  • Anything that affects deployment: (no)

Documentation

Check the box below or label this PR directly (if you have committer privilege).

Need to update docs?

  • doc-required

    (If you need help on updating docs, create a doc issue)

  • no-need-doc

    (Please explain why)

  • doc

    (If this PR contains doc changes)

@github-actions github-actions bot added the doc-not-needed label Apr 6, 2022
@github-actions (bot) commented:

The PR had no activity for 30 days, marking with the Stale label.

@Jason918 (Contributor) left a comment:

LGTM

@tjiuming (Contributor, Author) commented:

@codelipenghui seems the PR is ready to merge

@github-actions (bot) commented:

The PR had no activity for 30 days, marking with the Stale label.

@github-actions github-actions bot added the Stale label Jun 26, 2022
@tjiuming (Contributor, Author) commented Jul 2, 2022

@codelipenghui Please help merge the PR

@github-actions github-actions bot removed the Stale label Jul 3, 2022
@Jason918 (Contributor) commented Jul 4, 2022

> @codelipenghui Please help merge the PR

Please fix the CI failure @tjiuming

@codelipenghui codelipenghui modified the milestones: 2.11.0, 2.12.0 Jul 26, 2022
@github-actions github-actions bot added the doc-label-missing label and removed the doc-not-needed label Sep 17, 2022
@github-actions (bot) commented:

@tjiuming Please provide a correct documentation label for your PR.
For instructions, see the Pulsar Documentation Label Guide.

@Jason918 Jason918 added the doc-not-needed label and removed the doc-label-missing label Sep 17, 2022
@github-actions github-actions bot added the doc-label-missing label and removed the doc-not-needed label Sep 17, 2022
@AnonHxy (Contributor) commented Sep 18, 2022

This patch broke the org.apache.pulsar.client.impl.BatchMessageContainerImplTest.recoveryAfterOom test. Please help fix it, thanks @tjiuming

@tjiuming (Contributor, Author) commented:

> This patch broke the org.apache.pulsar.client.impl.BatchMessageContainerImplTest.recoveryAfterOom test. Please help fix it, thanks @tjiuming

@AnonHxy Yes, I've fixed it, PTAL

@AnonHxy AnonHxy closed this Sep 19, 2022
@AnonHxy AnonHxy reopened this Sep 19, 2022
@AnonHxy (Contributor) commented Sep 19, 2022

> This patch broke the org.apache.pulsar.client.impl.BatchMessageContainerImplTest.recoveryAfterOom test. Please help fix it, thanks @tjiuming
>
> @AnonHxy Yes, I've fixed it, PTAL

There are also some checkstyle failures. Please make sure CI passes in your own fork repo first @tjiuming

@AnonHxy AnonHxy merged commit 996666a into apache:master Sep 19, 2022
@tjiuming tjiuming deleted the dev/batch_msg_allocator branch September 19, 2022 07:51
@merlimat (Contributor) commented:

@tjiuming @Jason918 @AnonHxy

> Allocate a small buffer for the batch message first, and let it grow (with no memory resizes or copies).

How can you say there are no memory resizes or copies?

I'm still not convinced that this change is not bringing a significant performance regression.

@AnonHxy (Contributor) commented Sep 20, 2022

Hi @merlimat, compositeBuffer will not lead to memory copies until the componentCount reaches maxNumComponents, whose default value is 16.

If we want to avoid any memory copies, I think we could set the value of maxNumComponents equal to batchingMaxMessages, whose default is 1000. WDYT
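A minimal sketch of that suggestion, assuming the producer's batchingMaxMessages value can be passed down to the allocation (illustrative names, not the actual Pulsar code):

```java
import io.netty.buffer.ByteBufAllocator;
import io.netty.buffer.CompositeByteBuf;

class BatchAllocationSketch {
    // Assumed: the producer's batchingMaxMessages (default 1000) is available here.
    CompositeByteBuf newBatchBuffer(int batchingMaxMessages) {
        // With maxNumComponents == batchingMaxMessages, adding one component per batched
        // message stays below the consolidation threshold, so no merge/copy while batching.
        return ByteBufAllocator.DEFAULT.compositeBuffer(batchingMaxMessages);
    }
}
```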

@Jason918 (Contributor) commented:

@AnonHxy Can we do a quick test to validate this? Focus on the case with a large number of messages in one batch.

@AnonHxy (Contributor) commented Sep 20, 2022

> @AnonHxy Can we do a quick test to validate this? Focus on the case with a large number of messages in one batch.

Sure

@merlimat (Contributor) commented:

@AnonHxy If we are writing bytes into the composite buffer, it either needs to have space in its own last buffer, or it has to allocate a new buffer and copy data into it.

@tjiuming (Contributor, Author) commented Sep 20, 2022

@merlimat @AnonHxy @Jason918
For CompositeByteBuf, if the number of components in the buffer does not exceed maxComponent, it will not resize or copy memory. But if the capacity is not enough and a new component is needed beyond that limit, all the components will be merged (copied) into a new component.
Here I use the default value of maxComponent, which is 16, because I was not sure what exact value to set.
But on the whole, the buffer is allocated lazily, which means less wasted memory. To some extent, it achieves the purpose of saving memory.
To reduce copies/resizes, maybe we can calculate a suitable value of maxComponent from maxBatchSize; is that required?
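A small standalone illustration of the behavior described above, assuming plain Netty (not Pulsar code): components accumulate copy-free while the count stays within maxComponent, and crossing that limit is where the merge/copy can happen.

```java
import io.netty.buffer.ByteBufAllocator;
import io.netty.buffer.CompositeByteBuf;
import io.netty.buffer.Unpooled;

public class CompositeGrowthDemo {
    public static void main(String[] args) {
        // Default limit used in the patch: 16 components.
        CompositeByteBuf batch = ByteBufAllocator.DEFAULT.compositeBuffer(16);
        for (int i = 1; i <= 20; i++) {
            // Each added component is referenced, not copied, while the count stays
            // within the limit; beyond it, Netty consolidates (merge-copies) components.
            batch.addComponent(true, Unpooled.wrappedBuffer(new byte[64]));
            System.out.println("after add " + i + ": numComponents = " + batch.numComponents());
        }
        batch.release();
    }
}
```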

@merlimat (Contributor) commented:

> Here I use the default value of maxComponent, which is 16, because I was not sure what exact value to set.
> But on the whole, the buffer is allocated lazily, which means less wasted memory. To some extent, it achieves the purpose of saving memory.

Allocating many small buffers will have a significantly larger impact than allocating a single large buffer:

  1. Netty allocator synchronization
  2. Scattered write on the socket from many small buffers

@tjiuming (Contributor, Author) commented Sep 20, 2022

@merlimat
Yes, there is allocator synchronization, but in ProducerImpl#sendAsync(Message<?> message, SendCallback callback):

```java
try {
    synchronized (this) {
        // ......
    }
} catch (PulsarClientException e) {
    // ......
} catch (Throwable t) {
    // ......
}
```

which means that for a Producer, only one thread can send messages at the same time. I think even though there is allocator synchronization, it wouldn't have a significant impact.

> Scattered write on the socket from many small buffers

For this part, I don't think it will make a big difference. Yes, reading bytes from a CompositeByteBuf is indeed slower than from a single large buffer, because it needs to resolve the readerIndex across components, but when writing to the channel, the CompositeByteBuf is transformed into a ByteBuffer[], which is not much slower than a single ByteBuffer.

I think the only problem is that when the batch is too large, it will lead to memory copies/resizes.
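For the scattered-write point, a rough sketch (illustrative only; the real write path lives inside Netty's channel implementation) of how a CompositeByteBuf becomes a ByteBuffer[] for a single gathering write:

```java
import io.netty.buffer.CompositeByteBuf;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.GatheringByteChannel;

class GatheringWriteSketch {
    // Writes the batch without first merging its components into one contiguous buffer.
    static long writeBatch(GatheringByteChannel channel, CompositeByteBuf batch) throws IOException {
        ByteBuffer[] components = batch.nioBuffers(); // one ByteBuffer per readable component
        return channel.write(components);             // single writev-style gathering write
    }
}
```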

@AnonHxy (Contributor) commented Sep 25, 2022

I have tested this feature using a standalone cluster. It seems that this PR may bring a performance regression. The reason is that:
When using ByteBufAllocator#buffer(int), the ByteBuf writing took only a little CPU time:
[flame graph screenshot: ByteBuf write with ByteBufAllocator#buffer(int)]

And when using ByteBufAllocator#compositeBuffer, the ByteBuf write took more CPU time:
[flame graph screenshot: ByteBuf write with ByteBufAllocator#compositeBuffer]

I think another solution is that we can introduce a way to shrink maxBatchSize. For example, if the current batch size is less than 3/4 * maxBatchSize, we let maxBatchSize = 3/4 * maxBatchSize.
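A minimal sketch of this shrinking idea, with hypothetical names (nothing here is existing Pulsar code): after each batch is flushed, decay the size estimate whenever the batch used noticeably less than it.

```java
class MaxBatchSizeShrinkSketch {
    // Hypothetical starting estimate for the pre-allocated batch buffer, in bytes.
    private int maxBatchSize = 4 * 1024 * 1024;

    // Called after a batch is flushed, with the number of bytes it actually used.
    void onBatchFlushed(int currentBatchSize) {
        int threshold = maxBatchSize / 4 * 3; // 3/4 * maxBatchSize
        if (currentBatchSize < threshold) {
            // The batch stayed well below the estimate, so shrink the next pre-allocation.
            maxBatchSize = threshold;
        }
    }
}
```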

@merlimat @tjiuming @Jason918 @codelipenghui

@Jason918 (Contributor) commented:

@AnonHxy Is there a significant throughput difference between these two?

@AnonHxy (Contributor) commented Sep 26, 2022

> Is there a significant throughput difference between these two?

OK. @Jason918

  1. First I set up a standalone cluster using apache-pulsar-2.10.1
  2. Then I started a producer like the one below to publish messages, launched with the command ~/houxiaoyu/jdk-17.0.4/bin/java -Xms2000m -Xmx2000m -XX:MaxDirectMemorySize=4000m -jar pulsar-buffer.jar:
```java
public static void main(String[] args) throws Exception {
    String topic = "hxy-test-mem";
    byte[] smallMsg = new byte[64];
    PulsarClient client = PulsarClient.builder()
            .serviceUrl("pulsar://127.0.0.1:6650")
            .memoryLimit(2000, SizeUnit.MEGA_BYTES)  // memory limit 2000M
            .build();
    Producer<byte[]> producer = client.newProducer()
            .topic(topic)
            .enableBatching(true)
            .create();
    for (int i = 0; i < Integer.MAX_VALUE; ++i) {
        producer.sendAsync(smallMsg);
    }
}
```
  3. I have compared the publish throughput three times, and all of them came to the same conclusion: throughput drops when using compositeBuffer. Here is the result:
  • buffer(msg/s): means before applying this patch
  • compositeBuffer(msg/s): means after applying this patch
| Run | buffer (msg/s) | compositeBuffer (msg/s) |
| --- | --- | --- |
| first test | 1,034,217.59 | 930,746.97 |
| | 1,024,092.14 | 939,542.21 |
| | 1,024,092.14 | 940,831.47 |
| second test | 977,871.08 | 907,644.68 |
| | 972,081.04 | 903,371.45 |
| | 1,000,225.66 | 903,853.17 |
| third test | 1,031,195.02 | 890,254.75 |
| | 1,036,956.93 | 890,347.82 |
| | 1,024,760.17 | 889,036.17 |
| avg | 1,013,943.52 | 910,625.4 |

@AnonHxy (Contributor) commented Sep 26, 2022

And here are the Flame Graph attachments for the first test:
buffer.html.zip
compositeBuffer.html.zip

Labels: doc-label-missing, type/enhancement (the enhancements for the existing features or docs, e.g. reduce memory usage of the delayed messages)