Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix][tableview] fixed ack failure in ReaderImpl due to null messageId #17728

Merged
merged 3 commits into from
Sep 22, 2022
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,10 @@
*/
package org.apache.pulsar.client.impl;

import static org.mockito.ArgumentMatchers.any;
import static org.mockito.Mockito.spy;
import static org.mockito.Mockito.times;
import static org.mockito.Mockito.verify;
import static org.testng.Assert.assertEquals;
import static org.testng.Assert.fail;

Expand All @@ -29,7 +33,9 @@
import java.util.concurrent.TimeUnit;
import lombok.Cleanup;
import lombok.extern.slf4j.Slf4j;
import org.apache.commons.lang3.reflect.FieldUtils;
import org.apache.pulsar.broker.auth.MockedPulsarServiceBaseTest;
import org.apache.pulsar.client.api.MessageId;
import org.apache.pulsar.client.api.MessageRoutingMode;
import org.apache.pulsar.client.api.Producer;
import org.apache.pulsar.client.api.ProducerBuilder;
Expand All @@ -42,6 +48,7 @@
import org.testng.Assert;
import org.testng.annotations.AfterClass;
import org.testng.annotations.BeforeClass;
import org.testng.annotations.DataProvider;
import org.testng.annotations.Test;

/**
Expand Down Expand Up @@ -217,4 +224,57 @@ public void testPublishNullValue() throws Exception {
assertEquals(tv1.size(), 1);
assertEquals(tv.get("key2"), "value2");
}

@DataProvider(name = "partitionedTopic")
public static Object[][] partitioned() {
return new Object[][] {{true}, {false}};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I try the test on my laptop.
Looks like the test for the partitioned topic will always get passed without the fix.
The non-partitioned topic test works fine, I see the warning logs. After applying the fix, the test can get passed.

2022-09-20T20:05:46,640 - WARN  - [pulsar-client-internal-39-1:ReaderImpl@184] - [persistent://public/default/tableview-no-partition-ack-test][reader-57f0eede9d] acknowledge message null cumulative fail.
org.apache.pulsar.client.api.PulsarClientException$InvalidMessageException: Cannot handle message with null messageId
	at org.apache.pulsar.client.impl.ConsumerBase.validateMessageId(ConsumerBase.java:358) ~[classes/:?]
	at org.apache.pulsar.client.impl.ConsumerBase.acknowledgeCumulativeAsync(ConsumerBase.java:554) ~[classes/:?]
	at java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:732) ~[?:?]
	at org.mockito.internal.util.reflection.InstrumentationMemberAccessor$Dispatcher$ByteBuddy$ktU1uSvS.invokeWithArguments(Unknown Source) ~[?:?]
	at org.mockito.internal.util.reflection.InstrumentationMemberAccessor.invoke(InstrumentationMemberAccessor.java:239) ~[mockito-core-3.12.4.jar:?]
	at org.mockito.internal.util.reflection.ModuleMemberAccessor.invoke(ModuleMemberAccessor.java:55) ~[mockito-core-3.12.4.jar:?]
	at org.mockito.internal.creation.bytebuddy.MockMethodAdvice.tryInvoke(MockMethodAdvice.java:333) ~[mockito-core-3.12.4.jar:?]
	at org.mockito.internal.creation.bytebuddy.MockMethodAdvice.access$500(MockMethodAdvice.java:60) ~[mockito-core-3.12.4.jar:?]
	at org.mockito.internal.creation.bytebuddy.MockMethodAdvice$RealMethodCall.invoke(MockMethodAdvice.java:253) ~[mockito-core-3.12.4.jar:?]
	at org.mockito.internal.invocation.InterceptedInvocation.callRealMethod(InterceptedInvocation.java:142) ~[mockito-core-3.12.4.jar:?]
	at org.mockito.internal.stubbing.answers.CallsRealMethods.answer(CallsRealMethods.java:45) ~[mockito-core-3.12.4.jar:?]
	at org.mockito.Answers.answer(Answers.java:99) ~[mockito-core-3.12.4.jar:?]
	at org.mockito.internal.handler.MockHandlerImpl.handle(MockHandlerImpl.java:110) ~[mockito-core-3.12.4.jar:?]
	at org.mockito.internal.handler.NullResultGuardian.handle(NullResultGuardian.java:29) ~[mockito-core-3.12.4.jar:?]
	at org.mockito.internal.handler.InvocationNotifierHandler.handle(InvocationNotifierHandler.java:34) ~[mockito-core-3.12.4.jar:?]
	at org.mockito.internal.creation.bytebuddy.MockMethodInterceptor.doIntercept(MockMethodInterceptor.java:82) ~[mockito-core-3.12.4.jar:?]
	at org.mockito.internal.creation.bytebuddy.MockMethodAdvice.handle(MockMethodAdvice.java:151) ~[mockito-core-3.12.4.jar:?]
	at org.apache.pulsar.client.impl.ConsumerBase.acknowledgeCumulativeAsync(ConsumerBase.java:554) ~[classes/:?]
	at org.apache.pulsar.client.impl.ReaderImpl.lambda$readNextAsync$3(ReaderImpl.java:183) ~[classes/:?]
	at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863) ~[?:?]
	at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841) ~[?:?]
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510) ~[?:?]
	at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2147) ~[?:?]
	at org.apache.pulsar.client.impl.ConsumerImpl.lambda$internalReceiveAsync$4(ConsumerImpl.java:487) ~[classes/:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[netty-common-4.1.77.Final.jar:4.1.77.Final]
	at java.lang.Thread.run(Thread.java:833) ~[?:?]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, partitioned topics should pass without this fix.

But we should see the test assert failure from the non-partitioned topic, without this fix.

Please let me know if you see different behavior.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@heesung-sn Yes, the same behavior.

}

@Test(timeOut = 30 * 1000, dataProvider = "partitionedTopic")
public void testAck(boolean partitionedTopic) throws Exception {
String topic = null;
if (partitionedTopic) {
topic = "persistent://public/default/tableview-ack-test";
admin.topics().createPartitionedTopic(topic, 3);
} else {
topic = "persistent://public/default/tableview-no-partition-ack-test";
admin.topics().createNonPartitionedTopic(topic);
}

@Cleanup
TableView<String> tv1 = pulsarClient.newTableViewBuilder(Schema.STRING)
.topic(topic)
.autoUpdatePartitionsInterval(5, TimeUnit.SECONDS)
.create();

ConsumerBase consumerBase;
if (partitionedTopic) {
MultiTopicsReaderImpl<String> reader =
((CompletableFuture<MultiTopicsReaderImpl<String>>) FieldUtils
.readDeclaredField(tv1, "reader", true)).get();
consumerBase = spy(reader.getMultiTopicsConsumer());
FieldUtils.writeDeclaredField(reader, "multiTopicsConsumer", consumerBase, true);
} else {
ReaderImpl<String> reader = ((CompletableFuture<ReaderImpl<String>>) FieldUtils
.readDeclaredField(tv1, "reader", true)).get();
consumerBase = spy(reader.getConsumer());
FieldUtils.writeDeclaredField(reader, "consumer", consumerBase, true);
}

@Cleanup
Producer<String> producer = pulsarClient.newProducer(Schema.STRING).topic(topic).create();

int msgCount = 20;
for (int i = 0; i < msgCount; i++) {
producer.newMessage().key("key:" + i).value("value" + i).send();
}

Awaitility.await()
.pollInterval(1, TimeUnit.SECONDS)
.atMost(Duration.ofMillis(5000))
.untilAsserted(()
-> verify(consumerBase, times(msgCount)).acknowledgeCumulativeAsync(any(MessageId.class)));


}
}
Original file line number Diff line number Diff line change
Expand Up @@ -177,17 +177,15 @@ public Message<T> readNext(int timeout, TimeUnit unit) throws PulsarClientExcept

@Override
public CompletableFuture<Message<T>> readNextAsync() {
CompletableFuture<Message<T>> receiveFuture = consumer.receiveAsync();
receiveFuture.whenComplete((msg, t) -> {
if (msg != null) {
consumer.acknowledgeCumulativeAsync(msg).exceptionally(ex -> {
log.warn("[{}][{}] acknowledge message {} cumulative fail.", getTopic(),
getConsumer().getSubscription(), msg.getMessageId(), ex);
return null;
});
}
return consumer.receiveAsync().thenApply(msg -> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this fix work? Your error message says msg is not null but msg.getMessageId is null.

Unless I'm reading this wrong, your change is a cleaner way but the effect remains the same. Baring very unexpected behaviors (for example, consumer.acknowledgeCumulativeAsync(msg) throwing an exception, which before the change is silently disgarded.).

If msg.messageId() somehow is null, you would have the same error, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, the modification is to return the CompetedFuture downstream from thenApply(msg -> {acknowledgeCumulativeAsync ) instead of receiveFuture like MultiTopicsReade.readNextAsync()

The messageId can be null after message.release().

consumer.acknowledgeCumulativeAsync(msg)
.exceptionally(ex -> {
log.error("[{}][{}] acknowledge message {} cumulative fail.", getTopic(),
getConsumer().getSubscription(), msg.getMessageId(), ex);
return null;
});
return msg;
});
return receiveFuture;
}

@Override
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -21,8 +21,10 @@
import static org.testng.AssertJUnit.assertFalse;
import static org.testng.AssertJUnit.assertTrue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.apache.commons.lang3.reflect.FieldUtils;
import org.apache.pulsar.client.api.Consumer;
import org.apache.pulsar.client.api.Message;
import org.apache.pulsar.client.api.Schema;
Expand Down Expand Up @@ -64,13 +66,20 @@ public void clean() {
}

@Test
void shouldSupportCancellingReadNextAsync() {
void shouldSupportCancellingReadNextAsync() throws IllegalAccessException {
// given
CompletableFuture<Message<byte[]>> future = reader.readNextAsync();
reader.readNextAsync();
Awaitility.await().untilAsserted(() -> {
assertTrue(reader.getConsumer().hasNextPendingReceive());
});

ConsumerBase consumer = (ConsumerBase)
FieldUtils.readDeclaredField(reader, "consumer", true);
ConcurrentLinkedQueue<CompletableFuture<Message<byte[]>>>
pendingReceives = (ConcurrentLinkedQueue<CompletableFuture<Message<byte[]>>>)
FieldUtils.readField(consumer, "pendingReceives", true);
CompletableFuture<Message<byte[]>> future = pendingReceives.peek();

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will break the reader.readNextAsync().cancel() behavior because users will not do as this test does. I think the reason is the future instance in the pending receives is not the same one that was returned to the user.

I have tried this one, it should work for this case.

public CompletableFuture<Message<T>> readNextAsync() {
        CompletableFuture<Message<T>> originalFuture = consumer.receiveAsync();
        CompletableFuture<Message<T>> result = originalFuture.thenApply(msg -> {
            consumer.acknowledgeCumulativeAsync(msg)
                    .exceptionally(ex -> {
                        log.error("[{}][{}] acknowledge message {} cumulative fail.", getTopic(),
                                getConsumer().getSubscription(), msg.getMessageId(), ex);
                        return null;
                    });
            return msg;
        });
        CompletableFutureCancellationHandler handler = new CompletableFutureCancellationHandler();
        handler.attachToFuture(result);
        handler.setCancelAction(() -> originalFuture.cancel(false));
        return result;
    }

Copy link
Contributor Author

@heesung-sn heesung-sn Sep 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this seems to be the better approach if we want to keep supporting the ReaderImpl.readNextAsync().cancel() behavior.

I see that the original future, multiTopicsConsumer.receiveAsync() in MultiTopicsReaderImpl.readNextAsync() is not cancellable now. Do we want to make the same change , CompletableFutureCancellationHandler handler in MultiTopicsReaderImpl.readNextAsync() too(is this another bug)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume the current behavior of multiTopicsConsumer.receiveAsync().cancel() is invalid. I also fixed this part and added a test to cover multiTopicsConsumer.receiveAsync().cancel() too.

// when
future.cancel(false);

Expand Down