[fix][broker] Set ServiceUnitStateChannel topic compaction threshold explicitly, improve getOwnerAsync, and fix other bugs #22064

Merged: 4 commits, Feb 22, 2024

Conversation

heesung-sn
Contributor

@heesung-sn heesung-sn commented Feb 17, 2024

Motivation

We should set the compaction threshold of the serviceUnitStateChannel topic explicitly instead of relying on other components' initialization.

Also, we need to fix the issue where lookup and unload requests often time out with a 500 error.

Modifications

  • Set the compaction threshold of the serviceUnitStateChannel topic in the monitor thread.
  • Release the metadataAndPayload buffer in RawBatchMessageContainerImpl.
  • Automatically restart the load data store producer and table view if they are inactive.
  • Update getOwnerAsync for better synchronization of ownership checks, deferred lookups, and active broker checks.
  • Add minor retry logic to the ExtensibleLoadManager tests.
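
A minimal sketch of the monitor-thread idea from the first and third items, not the code in this PR: a periodic pass that sets a threshold explicitly when it is missing and restarts a component that has gone inactive. `FakeTopicPolicies`, `FakeComponent`, `ChannelMonitor`, and the threshold value are hypothetical stand-ins, not Pulsar classes or defaults.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the policy store of the channel topic. */
final class FakeTopicPolicies {
    private Long compactionThreshold; // null means no explicit threshold yet

    Optional<Long> getCompactionThreshold() {
        return Optional.ofNullable(compactionThreshold);
    }

    void setCompactionThreshold(long bytes) {
        compactionThreshold = bytes;
    }
}

/** Hypothetical stand-in for a producer or table view that can become inactive. */
final class FakeComponent {
    private boolean active = true;
    final AtomicInteger restarts = new AtomicInteger();

    boolean isActive() { return active; }
    void markInactive() { active = false; }
    void restart() { active = true; restarts.incrementAndGet(); }
}

final class ChannelMonitor {
    private final FakeTopicPolicies policies;
    private final FakeComponent component;
    private final long desiredThresholdBytes; // illustrative; not the value used by the PR

    ChannelMonitor(FakeTopicPolicies policies, FakeComponent component, long desiredThresholdBytes) {
        this.policies = policies;
        this.component = component;
        this.desiredThresholdBytes = desiredThresholdBytes;
    }

    /** One monitor pass. Each step is idempotent, so repeated passes are harmless. */
    void runOnce() {
        Optional<Long> desired = Optional.of(desiredThresholdBytes);
        if (!policies.getCompactionThreshold().equals(desired)) {
            policies.setCompactionThreshold(desiredThresholdBytes);
        }
        if (!component.isActive()) {
            component.restart();
        }
    }
}

final class ChannelMonitorDemo {
    public static void main(String[] args) {
        FakeTopicPolicies policies = new FakeTopicPolicies();
        FakeComponent component = new FakeComponent();
        ChannelMonitor monitor = new ChannelMonitor(policies, component, 1024L);

        monitor.runOnce();        // sets the threshold on the first pass
        component.markInactive();
        monitor.runOnce();        // restarts the inactive component
        System.out.println("threshold=" + policies.getCompactionThreshold()
                + " restarts=" + component.restarts.get());
    }
}
```

In a real broker such a pass would presumably be driven by a scheduled executor; the sketch calls `runOnce()` directly to stay deterministic.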

Verifying this change

  • Make sure that the change passes the CI checks.

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: heesung-sn#61

@heesung-sn heesung-sn changed the title from "Pip 192 Set ServiceUnitStateChannel topic compaction threshold explicitly, improve getOwnerAsync, and fix other bugs" to "[fix][broker] Set ServiceUnitStateChannel topic compaction threshold explicitly, improve getOwnerAsync, and fix other bugs" Feb 17, 2024
@github-actions github-actions bot added the doc-not-needed label (Your PR changes do not impact docs) Feb 17, 2024
@heesung-sn heesung-sn self-assigned this Feb 17, 2024
@Technoboy- Technoboy- added this to the 3.3.0 milestone Feb 20, 2024
@Technoboy- Technoboy- merged commit 5a614e9 into apache:master Feb 22, 2024
68 of 72 checks passed
Comment on lines -261 to -263
verify(primaryLoadManager, times(1)).getBrokerSelectionStrategy();
verify(secondaryLoadManager, times(0)).getBrokerSelectionStrategy();

Contributor

Just noticed that this was removed. Do you know if this was necessary? I was just preparing a fix for the test, as it was flaky: dragosvictor#13.

Contributor Author

The selection cannot be fully controlled because the TopicPolicy topics can also be affected during the test.

@@ -187,6 +187,7 @@ public ByteBuf toByteBuf() {
idData.writeTo(buf);
buf.writeInt(metadataAndPayload.readableBytes());
buf.writeBytes(metadataAndPayload);
metadataAndPayload.release();
Member

@heesung-sn how did you find this issue? I think this could deserve a separate PR for maintenance branches? Please report a separate issue in apache/pulsar issues to track this.

Contributor Author

Yes, actually, I am trying to see if I can cherry-pick this PR to branch-3.0 and the other maintenance branches.

If that is not easy, I will raise separate PRs for them.

Contributor

Just adding a note: this PR fixed broker-side memory leaks; they only occur when ExtensibleLoadManager is enabled.
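
To illustrate why the missing `release()` call leaks memory, here is a minimal sketch of reference counting, assuming the usual contract of buffers such as Netty's `ByteBuf`: copying bytes out of a buffer does not change its reference count, so the code holding the last reference must release it. `CountedBuffer`, `Serializer`, and `RefCountDemo` are hypothetical, simplified stand-ins, not Netty or Pulsar classes.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical, simplified stand-in for a reference-counted buffer. */
final class CountedBuffer {
    private final AtomicInteger refCnt = new AtomicInteger(1); // creator holds one reference
    private final byte[] data;

    CountedBuffer(byte[] data) {
        this.data = data;
    }

    int refCnt() {
        return refCnt.get();
    }

    byte[] bytes() {
        if (refCnt.get() == 0) {
            throw new IllegalStateException("buffer already released");
        }
        return data;
    }

    /** Drops one reference; returns true when the count reaches zero. */
    boolean release() {
        if (refCnt.get() == 0) {
            throw new IllegalStateException("released too many times");
        }
        return refCnt.decrementAndGet() == 0;
    }
}

final class Serializer {
    /** Copies the payload out, then releases it because this method owns the last reference. */
    static byte[] copyAndRelease(CountedBuffer payload) {
        try {
            return payload.bytes().clone();
        } finally {
            payload.release(); // without this, the count never reaches zero
        }
    }
}

final class RefCountDemo {
    public static void main(String[] args) {
        CountedBuffer payload = new CountedBuffer(new byte[] {1, 2, 3});
        byte[] out = Serializer.copyAndRelease(payload);
        System.out.println(out.length + " bytes copied, refCnt=" + payload.refCnt());
    }
}
```

The `try`/`finally` is a design choice of the sketch so that the reference is dropped even if the copy throws; the diff above simply calls `release()` after the write.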

@lhotari
Member

lhotari commented Feb 23, 2024

@heesung-sn Is there a need to backport the ExtensibleLoadManager related fixes all the way to 3.0.x ?

@heesung-sn
Contributor Author

> @heesung-sn Is there a need to backport the ExtensibleLoadManager related fixes all the way to 3.0.x ?

Yes, I am trying to backport other PRs.

This one, too.
#22112

@heesung-sn
Contributor Author

heesung-sn commented Feb 28, 2024

@lhotari @Demogorgon314 please review this cherry-pick PR: #22154

heesung-sn added a commit that referenced this pull request Feb 29, 2024
…n threshold explicitly, improve getOwnerAsync, and fix other bugs (#22064) (#22154)
heesung-sn added a commit that referenced this pull request Feb 29, 2024
…n threshold explicitly, improve getOwnerAsync, and fix other bugs (#22064) (#22154)

(cherry picked from commit 6df0265)
heesung-sn added a commit to heesung-sn/pulsar that referenced this pull request Feb 29, 2024
…explicitly, improve getOwnerAsync, and fix other bugs (apache#22064)
heesung-sn added a commit that referenced this pull request Feb 29, 2024
…n threshold explicitly, improve getOwnerAsync, and fix other bugs (#22064) (#22160)
mukesh-ctds pushed a commit to datastax/pulsar that referenced this pull request Mar 1, 2024
…n threshold explicitly, improve getOwnerAsync, and fix other bugs (apache#22064) (apache#22154)

(cherry picked from commit 6df0265)
(cherry picked from commit 6d2ce89)
mukesh-ctds pushed a commit to datastax/pulsar that referenced this pull request Mar 6, 2024
…n threshold explicitly, improve getOwnerAsync, and fix other bugs (apache#22064) (apache#22154)

(cherry picked from commit 6df0265)
(cherry picked from commit 6d2ce89)
@heesung-sn heesung-sn deleted the pip-192-fix-compaction-threshold branch April 2, 2024 17:44