Flaky-test: NegativeAcksTest.testNegativeAcksWithBatchAckEnabled #16864

Open
Technoboy- opened this issue Jul 29, 2022 · 5 comments · Fixed by #17893

@Technoboy-
Contributor

https://github.com/apache/pulsar/runs/7573727921?check_suite_focus=true

[INFO] Running org.apache.pulsar.client.impl.TopicPublishThrottlingInitTest
  Error:  Tests run: 69, Failures: 1, Errors: 0, Skipped: 3, Time elapsed: 109.161 s <<< FAILURE! - in org.apache.pulsar.client.impl.NegativeAcksTest
  Error:  testNegativeAcksWithBatchAckEnabled(org.apache.pulsar.client.impl.NegativeAcksTest)  Time elapsed: 11.79 s  <<< FAILURE!
  org.testng.internal.thread.ThreadTimeoutException: Method org.apache.pulsar.client.impl.NegativeAcksTest.testNegativeAcksWithBatchAckEnabled() didn't finish within the time-out 10000
  	at org.testng.internal.MethodInvocationHelper.invokeWithTimeoutWithNewExecutor(MethodInvocationHelper.java:371)
  	at org.testng.internal.MethodInvocationHelper.invokeWithTimeout(MethodInvocationHelper.java:282)
@RobertIndie
Member

@poorbarcode
Contributor

assign to me

congbobo184 pushed a commit that referenced this issue Sep 30, 2022
…#17893)

Fixes: #16864

### Motivation

I think `ackTimeout 1s` was configured by mistake when the test was written; the original intent was to set `negativeAckRedeliveryDelay 1s`.

The expected process:

- send 10 messages in one batch
  - the batch is submitted
- receive 10 messages and negatively acknowledge them
- after `1s`, redelivery is triggered
- receive the 10 messages again

The actual process:

- send 1 message
  - the batch time limit is reached and the batch is submitted, returning `msgId_1`
- send 9 messages in another batch
  - the batch is submitted, returning `msgId_2`
- receive 10 messages and negatively acknowledge them
  - `msgId_1` is pushed to `negativeAcksTracker`
  - `msgId_2` is pushed to `unAckedMessageTracker`
- after `1s`, `unAckedMessageTracker` triggers redelivery of `msgId_2`
- receive 9 messages (`msgId_2`) again
- after `60s`, `negativeAcksTracker` triggers redelivery of `msgId_1`. **The test's 10 s time-out has already expired by then.**
- receive 1 message (`msgId_1`) again



### Modifications

- remove conf: `ackTimeout`
- set `negativeAckRedeliveryDelay 1s`
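
The timing argument above can be checked with a small stand-alone model. This is only a sketch and not Pulsar code: the class `RedeliveryTimeline`, its nested `PendingBatch` type, and both helper methods are hypothetical names made up for illustration. The delays (1 s and 60 s), the batch sizes (1 and 9), and the 10 s time-out come from the description and log above. The assumption that both batches are redelivered after 1 s once the fix is applied follows from the modifications listed here, not from measuring the real client.

```java
import java.util.List;

/** Toy model of the redelivery timeline described above (hypothetical, not Pulsar code). */
final class RedeliveryTimeline {

    /** One negatively acknowledged batch and the delay before it is redelivered. */
    static final class PendingBatch {
        final String msgId;
        final int messageCount;
        final long redeliveryDelaySeconds;

        PendingBatch(String msgId, int messageCount, long redeliveryDelaySeconds) {
            this.msgId = msgId;
            this.messageCount = messageCount;
            this.redeliveryDelaySeconds = redeliveryDelaySeconds;
        }
    }

    /** Seconds until every batch has been redelivered; the slowest batch decides. */
    static long secondsUntilAllRedelivered(List<PendingBatch> batches) {
        long latest = 0;
        for (PendingBatch batch : batches) {
            latest = Math.max(latest, batch.redeliveryDelaySeconds);
        }
        return latest;
    }

    /** Number of messages that are redelivered before the test time-out expires. */
    static int redeliveredWithin(List<PendingBatch> batches, long timeoutSeconds) {
        int count = 0;
        for (PendingBatch batch : batches) {
            if (batch.redeliveryDelaySeconds <= timeoutSeconds) {
                count += batch.messageCount;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        long testTimeoutSeconds = 10; // the log reports a time-out of 10000 ms

        // Before the fix: msgId_1 waits 60 s, msgId_2 waits for the 1 s ackTimeout.
        List<PendingBatch> before = List.of(
                new PendingBatch("msgId_1", 1, 60),
                new PendingBatch("msgId_2", 9, 1));

        // After the fix (assumed): both batches use negativeAckRedeliveryDelay 1s.
        List<PendingBatch> after = List.of(
                new PendingBatch("msgId_1", 1, 1),
                new PendingBatch("msgId_2", 9, 1));

        System.out.println("before: " + redeliveredWithin(before, testTimeoutSeconds)
                + " of 10 messages in time, all after " + secondsUntilAllRedelivered(before) + " s");
        System.out.println("after:  " + redeliveredWithin(after, testTimeoutSeconds)
                + " of 10 messages in time, all after " + secondsUntilAllRedelivered(after) + " s");
    }
}
```

In this model the old configuration redelivers only 9 of the 10 messages within the time-out, so the second round of `receive()` calls blocks on the last one. With a single 1 s delay for both batches, all 10 arrive in time.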


### Documentation

- [x] `doc-not-needed` 

### Matching PR in forked repository

PR in forked repository: 

- poorbarcode#18
@michaeljmarshall
Member

Just saw this again: https://github.com/apache/pulsar/actions/runs/4209830786/jobs/7362172483

 Error:  Tests run: 77, Failures: 1, Errors: 0, Skipped: 5, Time elapsed: 166.776 s <<< FAILURE! - in org.apache.pulsar.client.impl.NegativeAcksTest
  Error:  testNegativeAcksWithBatchAckEnabled(org.apache.pulsar.client.impl.NegativeAcksTest)  Time elapsed: 10.03 s  <<< FAILURE!
  org.testng.internal.thread.ThreadTimeoutException: Method org.apache.pulsar.client.impl.NegativeAcksTest.testNegativeAcksWithBatchAckEnabled() didn't finish within the time-out 10000
  	at java.base/jdk.internal.misc.Unsafe.park(Native Method)
  	at java.base/java.util.concurrent.locks.LockSupport.park(LockSupport.java:341)
  	at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(AbstractQueuedSynchronizer.java:506)
  	at java.base/java.util.concurrent.ForkJoinPool.unmanagedBlock(ForkJoinPool.java:3463)
  	at java.base/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3434)
  	at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1623)
  	at app//org.apache.pulsar.common.util.collections.GrowableArrayBlockingQueue.take(GrowableArrayBlockingQueue.java:177)
  	at app//org.apache.pulsar.client.impl.ConsumerImpl.internalReceive(ConsumerImpl.java:445)
  	at app//org.apache.pulsar.client.impl.ConsumerBase.receive(ConsumerBase.java:253)
  	at app//org.apache.pulsar.client.impl.NegativeAcksTest.testNegativeAcksWithBatchAckEnabled(NegativeAcksTest.java:362)
  	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
  	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
  	at app//org.testng.internal.invokers.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:139)
  	at app//org.testng.internal.invokers.InvokeMethodRunnable.runOne(InvokeMethodRunnable.java:47)
  	at app//org.testng.internal.invokers.InvokeMethodRunnable.call(InvokeMethodRunnable.java:76)
  	at app//org.testng.internal.invokers.InvokeMethodRunnable.call(InvokeMethodRunnable.java:11)
  	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
  	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
  	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
  	at java.base/java.lang.Thread.run(Thread.java:833)

@shibd
Member

shibd commented Mar 1, 2023

@github-actions

github-actions bot commented Apr 1, 2023

The issue had no activity for 30 days, mark with Stale label.

@github-actions github-actions bot added the Stale label Apr 1, 2023
@poorbarcode poorbarcode removed the Stale label Aug 17, 2023
lhotari pushed a commit that referenced this issue Aug 9, 2024
…#17893)

Fixes: #16864


(cherry picked from commit 85b1138)
6 participants