Skip to content
This repository has been archived by the owner on Apr 1, 2024. It is now read-only.

ISSUE-6869: Negative acknowledgement doesn't remove the message id from UnAckedMessageTracker when message id is instance of BatchMessageIdImpl #939

Open
sijie opened this issue May 4, 2020 · 0 comments

Comments

@sijie
Copy link
Member

sijie commented May 4, 2020

Original Issue: apache#6869


Describe the bug

The actual symptom was that when using the DLQ feature, the redelivery counts were not consistent in a use case where negative acknowledgements are used. Messages would get redelivered more times than the configured maxRedeliverCount on the DeadLetterPolicy.

I observed this type of log messages in the log output:

14:20:07.080 [pulsar-timer-4-1] WARN  o.a.p.c.impl.UnAckedMessageTracker - [ConsumerBase{subscription='Test-Subscriber', consumerName='f194e', topic='test-topic'}] 5 messages have timed-out

By debugging, I noticed that calling org.apache.pulsar.client.api.Consumer#negativeAcknowledge(org.apache.pulsar.client.api.MessageId) doesn't remove the message id from UnAckedMessageTracker when the message is instance of BatchMessageIdImpl.

Expected behavior

org.apache.pulsar.client.impl.UnAckedMessageTracker implementation should encapsulate the fact that the message id must be MessageIdImpl and not BatchMessageIdImpl.

Currently the logic to first convert a BatchMessageIdImpl is done on the calling side (examples: https://github.com/apache/pulsar/blob/de57ddd572dbc74529a56fee68c6be37bd35cf7c/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ConsumerImpl.java#L521-L531 , https://github.com/apache/pulsar/blob/de57ddd572dbc74529a56fee68c6be37bd35cf7c/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ConsumerImpl.java#L1155-L1164 )

Since the caller has to convert the message id before calling UnAckedMessageTracker add or remove methods, it seems that this leads to error prone usage of the UnAckedMessageTracker class. Currently the conversion to MessageIdImpl is missing in the negative acknowledgement method:
https://github.com/apache/pulsar/blob/de57ddd572dbc74529a56fee68c6be37bd35cf7c/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ConsumerImpl.java#L556-L557

Work around

One workaround is to convert a possible BatchMessageIdImpl to MessageIdImpl before calling the negativeAcknowledge method. Something like this

   ...
   consumer.negativeAcknowledge(convertMessageIdForNack(message.getMessageId()));
   ...

    // workaround Pulsar bug regarding negative acknowledgements
    private static MessageId convertMessageIdForNack(MessageId messageId) {
        if (messageId instanceof BatchMessageIdImpl) {
            // use similar logic as there is in org.apache.pulsar.client.impl.NegativeAcksTracker#add
            BatchMessageIdImpl batchMessageId = (BatchMessageIdImpl) messageId;
            return new MessageIdImpl(batchMessageId.getLedgerId(), batchMessageId.getEntryId(),
                    batchMessageId.getPartitionIndex());
        } else {
            return messageId;
        }
    }
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants