
Null pointer exception when starting another instance for the same consumer group #35

Closed
bayan opened this issue Mar 4, 2015 · 2 comments · Fixed by #37

Comments

@bayan

bayan commented Mar 4, 2015

I start multiple processes which join the same consumer group.

If existing processes are consuming Kafka messages while a new consumer process is spawned, some of them often panic and crash with a null pointer exception (NPE) when I try to commit the latest offset (CommitUpto).

The NPE occurs because the partition has already been deleted from the zookeeperOffsetManager by the time the following function runs:

func (zom *zookeeperOffsetManager) MarkAsProcessed(topic string, partition int32, offset int64) bool {
    zom.l.RLock()
    defer zom.l.RUnlock()
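    // zom.offsets[topic][partition] is nil once the partition has been
    // finalized and removed, so the markAsProcessed call below panics with
    // a nil pointer dereference.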
    return zom.offsets[topic][partition].markAsProcessed(offset)
}

My local workaround has been to add a check and short-circuit:

func (zom *zookeeperOffsetManager) MarkAsProcessed(topic string, partition int32, offset int64) bool {
    zom.l.RLock()
    defer zom.l.RUnlock()
    currentOffset := zom.offsets[topic][partition]
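    // A nil tracker means the partition has already been removed
    // (e.g. after a rebalance), so report failure instead of panicking.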
    if currentOffset == nil {
        return false
    }
    return currentOffset.markAsProcessed(offset)
}
@wvanbergen
Owner

I managed to reproduce this. It's actually a pretty basic mistake: the consumer finalizes the partition and commits its offset as soon as it is done receiving messages from it, but that doesn't mean all of those messages have been processed yet; depending on what your app does, that can take a while.

Your workaround works, but it basically means you will reprocess some messages because their offsets never got committed. While that is OK within Kafka's at-least-once guarantee, I'd like to prevent this if possible. Looking into it.
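For illustration only (not necessarily the change that ended up in #37), one way to avoid both the panic and the reprocessing is to have each partition wait, with a timeout, for the last consumed offset to be processed before its offset is committed and its tracker is removed. Below is a minimal, self-contained sketch of that idea, using hypothetical names (partitionTracker, waitForProcessing) rather than the library's internals:

package main

import (
    "fmt"
    "sync"
    "time"
)

// partitionTracker tracks the highest consumed and highest processed offsets
// for a single partition. (Hypothetical type, for illustration only.)
type partitionTracker struct {
    mu            sync.Mutex
    cond          *sync.Cond
    lastConsumed  int64
    lastProcessed int64
}

func newPartitionTracker() *partitionTracker {
    t := &partitionTracker{lastConsumed: -1, lastProcessed: -1}
    t.cond = sync.NewCond(&t.mu)
    return t
}

// markConsumed records that a message with this offset was handed to the app.
func (t *partitionTracker) markConsumed(offset int64) {
    t.mu.Lock()
    defer t.mu.Unlock()
    if offset > t.lastConsumed {
        t.lastConsumed = offset
    }
}

// markProcessed records that the app finished processing this offset.
func (t *partitionTracker) markProcessed(offset int64) {
    t.mu.Lock()
    if offset > t.lastProcessed {
        t.lastProcessed = offset
    }
    t.mu.Unlock()
    t.cond.Broadcast()
}

// waitForProcessing blocks until every consumed offset has been processed or
// the timeout expires, and reports whether processing caught up. Only after it
// returns true is it safe to commit the final offset and drop the tracker.
func (t *partitionTracker) waitForProcessing(timeout time.Duration) bool {
    expired := false
    timer := time.AfterFunc(timeout, func() {
        t.mu.Lock()
        expired = true
        t.mu.Unlock()
        t.cond.Broadcast()
    })
    defer timer.Stop()

    t.mu.Lock()
    defer t.mu.Unlock()
    for t.lastProcessed < t.lastConsumed && !expired {
        t.cond.Wait()
    }
    return t.lastProcessed >= t.lastConsumed
}

func main() {
    t := newPartitionTracker()
    t.markConsumed(42)
    go func() {
        time.Sleep(100 * time.Millisecond) // simulate slow message processing
        t.markProcessed(42)
    }()
    fmt.Println("caught up:", t.waitForProcessing(time.Second)) // caught up: true
}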

@bayan
Author

bayan commented Mar 8, 2015

Thanks for investigating.

Yes, messages will be reprocessed with my workaround.

FWIW, I was able to mitigate the redundant processing by setting ChannelBufferSize to zero, which is OK for my use case since my message throughput is relatively low (fewer than 100 messages per minute, but each one can take many minutes to process).
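For reference, a minimal sketch of that mitigation, assuming the consumergroup package's NewConfig/JoinConsumerGroup/Messages/CommitUpto API and the ChannelBufferSize field from the embedded sarama config; the group name, topic, and ZooKeeper address below are placeholders from my setup:

package main

import (
    "log"

    "github.com/Shopify/sarama"
    "github.com/wvanbergen/kafka/consumergroup"
)

func main() {
    config := consumergroup.NewConfig()
    config.Offsets.Initial = sarama.OffsetNewest
    // Unbuffered message channel: fewer in-flight messages, so fewer
    // messages get reprocessed after a rebalance.
    config.ChannelBufferSize = 0

    consumer, err := consumergroup.JoinConsumerGroup("my-group", []string{"my-topic"}, []string{"localhost:2181"}, config)
    if err != nil {
        log.Fatalln(err)
    }
    defer consumer.Close()

    for msg := range consumer.Messages() {
        // ... process msg, which in my case can take many minutes ...
        consumer.CommitUpto(msg)
    }
}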
