[fix][ml] Fix NPE of getValidPositionAfterSkippedEntries when recovering a terminated managed ledger #22552

Merged
merged 2 commits into from
Apr 22, 2024

coderzc

@coderzc coderzc commented Apr 22, 2024

Motivation

Recovering a terminated managed ledger does not create a new ledger for writing, so currentLedger stays null. As a result, getValidPositionAfterSkippedEntries throws an NPE when it reaches the last position and calls currentLedger.getId().

org.apache.bookkeeper.mledger.ManagedLedgerException$MetaStoreException: java.util.concurrent.CompletionException: java.lang.NullPointerException: Cannot invoke "org.apache.bookkeeper.client.LedgerHandle.getId()" because "this.currentLedger" is null
Caused by: java.util.concurrent.CompletionException: java.lang.NullPointerException: Cannot invoke "org.apache.bookkeeper.client.LedgerHandle.getId()" because "this.currentLedger" is null
	at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:315) ~[?:?]
	at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:320) ~[?:?]
	at java.base/java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:722) ~[?:?]
	at java.base/java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:482) ~[?:?]
	at org.apache.bookkeeper.common.util.OrderedExecutor$TimedRunnable.run(OrderedExecutor.java:201) [bookkeeper-common-4.16.4.jar:4.16.4]
	at org.apache.bookkeeper.common.util.SingleThreadSafeScheduledExecutorService$SafeRunnable.run(SingleThreadSafeScheduledExecutorService.java:46) [bookkeeper-common-4.16.4.jar:4.16.4]
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) [?:?]
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) [?:?]
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.108.Final.jar:4.1.108.Final]
	at java.base/java.lang.Thread.run(Thread.java:833) [?:?]
Caused by: java.lang.NullPointerException: Cannot invoke "org.apache.bookkeeper.client.LedgerHandle.getId()" because "this.currentLedger" is null

if (state == State.Terminated) {
    // When recovering a terminated managed ledger, we don't need to create
    // a new ledger for writing, since no more writes are allowed.
    // We just move on to the next stage
    initializeCursors(callback);
    return;
}

public PositionImpl getValidPositionAfterSkippedEntries(final PositionImpl position, int skippedEntryNum) {
    PositionImpl skippedPosition = position.getPositionAfterEntries(skippedEntryNum);
    while (!isValidPosition(skippedPosition)) {
        Long nextLedgerId = ledgers.ceilingKey(skippedPosition.getLedgerId() + 1);
        // This means it has jumped to the last position
        if (nextLedgerId == null) {
            if (currentLedgerEntries == 0) {
                return PositionImpl.get(currentLedger.getId(), 0);
            }
            return lastConfirmedEntry.getNext();
        }
        skippedPosition = PositionImpl.get(nextLedgerId, 0);
    }
    return skippedPosition;
}

This NPE was introduced by #22034.

Modifications

In getValidPositionAfterSkippedEntries, return lastConfirmedEntry.getNext() when currentLedger == null, so that currentLedger.getId() is never called on a null ledger.
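
The guard described above can be illustrated with a small standalone sketch. This is not the merged diff and not Pulsar's ManagedLedgerImpl: the class SkippedPositionSketch, the nested Position type, the currentLedgerId field, and the ledgerId-to-entry-count map are all hypothetical stand-ins, and skipping entries is simplified to adding to the entry id. It only shows the null check in the "jumped to the last position" branch.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Simplified, hypothetical stand-in for the managed ledger state; not Pulsar's ManagedLedgerImpl.
class SkippedPositionSketch {

    // Minimal (ledgerId, entryId) pair standing in for PositionImpl.
    static final class Position {
        final long ledgerId;
        final long entryId;

        Position(long ledgerId, long entryId) {
            this.ledgerId = ledgerId;
            this.entryId = entryId;
        }

        Position getNext() {
            return new Position(ledgerId, entryId + 1);
        }

        @Override
        public String toString() {
            return ledgerId + ":" + entryId;
        }
    }

    // ledgerId -> entry count (assumed representation for this sketch).
    final NavigableMap<Long, Long> ledgers = new TreeMap<>();
    // null models a recovered, terminated managed ledger with no writable ledger.
    Long currentLedgerId;
    long currentLedgerEntries;
    Position lastConfirmedEntry;

    boolean isValidPosition(Position p) {
        Long entries = ledgers.get(p.ledgerId);
        return entries != null && p.entryId >= 0 && p.entryId < entries;
    }

    Position getValidPositionAfterSkippedEntries(Position position, int skippedEntryNum) {
        // Simplification: skip within the same ledger by adding to the entry id.
        Position skipped = new Position(position.ledgerId, position.entryId + skippedEntryNum);
        while (!isValidPosition(skipped)) {
            Long nextLedgerId = ledgers.ceilingKey(skipped.ledgerId + 1);
            // No later ledger: we have jumped past the last position.
            if (nextLedgerId == null) {
                // Only dereference the current ledger when it exists.
                if (currentLedgerId != null && currentLedgerEntries == 0) {
                    return new Position(currentLedgerId, 0);
                }
                return lastConfirmedEntry.getNext();
            }
            skipped = new Position(nextLedgerId, 0);
        }
        return skipped;
    }

    public static void main(String[] args) {
        SkippedPositionSketch ml = new SkippedPositionSketch();
        ml.ledgers.put(1L, 3L);                      // ledger 1 holds entries 0..2
        ml.lastConfirmedEntry = new Position(1, 2);
        ml.currentLedgerId = null;                   // terminated: no current ledger
        System.out.println(ml.getValidPositionAfterSkippedEntries(new Position(1, 0), 5)); // prints 1:3
    }
}
```

With the null check in place, the terminated case falls through to lastConfirmedEntry.getNext(), while a managed ledger that still has an empty current ledger keeps returning the first position of that ledger.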

Verifying this change

  • Make sure that the change passes the CI checks.

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end deployment with large payloads (10MB)
  • Extended integration test for recovery after broker failure

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository:

@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Apr 22, 2024
lhotari
lhotari previously approved these changes Apr 22, 2024

@lhotari lhotari left a comment


LGTM

@lhotari lhotari dismissed their stale review April 22, 2024 13:26

The test needs to address the comment raised in shibd's review.

@coderzc coderzc force-pushed the fix_getNextValidPosition_NPE branch from cef9594 to 0f4dab4 Compare April 22, 2024 13:32
@coderzc coderzc force-pushed the fix_getNextValidPosition_NPE branch from 0f4dab4 to 63ade35 Compare April 22, 2024 13:33
@coderzc coderzc requested review from lhotari and shibd April 22, 2024 14:00
@coderzc coderzc self-assigned this Apr 22, 2024
@lhotari lhotari left a comment


LGTM

@lhotari lhotari merged commit 35599b7 into apache:master Apr 22, 2024
54 of 55 checks passed
lhotari pushed a commit that referenced this pull request Apr 22, 2024
…ing a terminated managed ledger (#22552)

(cherry picked from commit 35599b7)
lhotari pushed a commit that referenced this pull request Apr 22, 2024
…ing a terminated managed ledger (#22552)

(cherry picked from commit 35599b7)
lhotari pushed a commit that referenced this pull request Apr 22, 2024
…ing a terminated managed ledger (#22552)

(cherry picked from commit 35599b7)
mukesh-ctds pushed a commit to datastax/pulsar that referenced this pull request Apr 23, 2024
…ing a terminated managed ledger (apache#22552)

(cherry picked from commit 35599b7)
(cherry picked from commit def695b)
srinath-ctds pushed a commit to datastax/pulsar that referenced this pull request Apr 23, 2024
…ing a terminated managed ledger (apache#22552)

(cherry picked from commit 35599b7)
(cherry picked from commit def695b)
@Technoboy- Technoboy- added this to the 3.3.0 milestone Apr 24, 2024
Technoboy- pushed a commit to Technoboy-/pulsar that referenced this pull request Apr 24, 2024
4 participants