Dead letters about ReplicationRegion$Passivate continue #199

Closed
xirc opened this issue Apr 5, 2023 · 0 comments · Fixed by #200

xirc commented Apr 5, 2023

Dead letters about ReplicationRegion$Passivate, like the one below, keep being logged:

14:11:07.307 INFO    ip-***-2   akka.actor.LocalActorRef       system--akka.actor.default-dispatcher-25 Message [lerna.akka.entityreplication.ReplicationRegion$Passivate] to Actor[akka://my-system/system/sharding/raft-shard-***-replica-group-3/38/38#-115359973] was unhandled. [4] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.

How to reproduce

1. Check out lerna-sample-account-app v2022.12.0

https://github.com/lerna-stack/lerna-sample-account-app/tree/v2022.12.0

2. Apply the following code and configuration changes

diff --git a/app/application/src/main/scala/myapp/application/account/BankAccountBehavior.scala b/app/application/src/main/scala/myapp/application/account/BankAccountBehavior.scala
index 9282451..529b879 100644
--- a/app/application/src/main/scala/myapp/application/account/BankAccountBehavior.scala
+++ b/app/application/src/main/scala/myapp/application/account/BankAccountBehavior.scala
@@ -251,8 +251,10 @@ object BankAccountBehavior extends AppTypedActorLogging {
         case GetBalance(replyTo) =>
           Effect.reply(replyTo)(AccountBalance(balance))
         case ReceiveTimeout() =>
+          println(s"BankAccountBehavior(${accountNo.value}) is passivating.")
           Effect.passivate().thenNoReply()
         case Stop() =>
+          println(s"BankAccountBehavior(${accountNo.value}) stopped.")
           Effect.stopLocally()
       }

@@ -386,7 +388,8 @@ object BankAccountBehavior extends AppTypedActorLogging {
       // This is highly recommended to identify the source of log outputs
       context.setLoggerName(BankAccountBehavior.getClass)
       // ReceiveTimeout will trigger Effect.passivate()
-      context.setReceiveTimeout(1.minute, ReceiveTimeout())
+      context.setReceiveTimeout(5.seconds, ReceiveTimeout())
+      println(s"BankAccountBehavior(${entityContext.entityId}) is starting.")
       ReplicatedEntityBehavior[Command, DomainEvent, Account](
         entityContext,
         emptyState = Account(
diff --git a/app/entrypoint/src/main/resources/application.conf b/app/entrypoint/src/main/resources/application.conf
index cad02d5..36e4604 100644
--- a/app/entrypoint/src/main/resources/application.conf
+++ b/app/entrypoint/src/main/resources/application.conf
@@ -29,8 +29,12 @@ myapp {
 akka {
   actor {
     provider = "cluster"
+    debug.unhandled = on
   }

+  log-dead-letters = on
+  log-dead-letters-suspend-duration = 30 seconds
+
   remote {
     artery {
       canonical {
diff --git a/app/utility/src/main/resources/logback.xml b/app/utility/src/main/resources/logback.xml
index 510896e..820c8de 100644
--- a/app/utility/src/main/resources/logback.xml
+++ b/app/utility/src/main/resources/logback.xml
@@ -8,7 +8,8 @@

 <!--    <logger level="DEBUG" name="lerna.akka.entityreplication" />-->
     <logger level="INFO" name="myapp" />
-    <logger level="INFO" name="akka" />
+    <logger level="DEBUG" name="akka" />
+    <logger level="INFO" name="akka.cluster" />

     <root level="WARN">
         <appender-ref ref="STDOUT"/>
diff --git a/scripts/start-app-1.sh b/scripts/start-app-1.sh
index ffdf39d..996c2fc 100644
--- a/scripts/start-app-1.sh
+++ b/scripts/start-app-1.sh
@@ -21,4 +21,7 @@ sbt \
 -Dmyapp.readmodel.rdbms.tenants.tenant-b.db.url='jdbc:mysql://127.0.0.2:3306/myapp-tenant-b' \
 -Dmyapp.readmodel.rdbms.tenants.tenant-b.db.user='dbuser_b' \
 -Dmyapp.readmodel.rdbms.tenants.tenant-b.db.password='dbpass@b' \
+-Dlerna.akka.entityreplication.raft.compaction.log-size-threshold=30 \
+-Dlerna.akka.entityreplication.raft.compaction.preserve-log-size=3 \
+-Dlerna.akka.entityreplication.raft.compaction.log-size-check-interval=10s \
 entrypoint/run
diff --git a/scripts/start-app-2.sh b/scripts/start-app-2.sh
index c5f9a72..d895f26 100644
--- a/scripts/start-app-2.sh
+++ b/scripts/start-app-2.sh
@@ -17,4 +17,7 @@ sbt \
 -Dmyapp.readmodel.rdbms.tenants.tenant-b.db.url='jdbc:mysql://127.0.0.2:3306/myapp-tenant-b' \
 -Dmyapp.readmodel.rdbms.tenants.tenant-b.db.user='dbuser_b' \
 -Dmyapp.readmodel.rdbms.tenants.tenant-b.db.password='dbpass@b' \
+-Dlerna.akka.entityreplication.raft.compaction.log-size-threshold=20 \
+-Dlerna.akka.entityreplication.raft.compaction.preserve-log-size=3 \
+-Dlerna.akka.entityreplication.raft.compaction.log-size-check-interval=10s \
 entrypoint/run
diff --git a/scripts/start-app-3.sh b/scripts/start-app-3.sh
index bc7262e..6a152af 100644
--- a/scripts/start-app-3.sh
+++ b/scripts/start-app-3.sh
@@ -17,4 +17,7 @@ sbt \
 -Dmyapp.readmodel.rdbms.tenants.tenant-b.db.url='jdbc:mysql://127.0.0.2:3306/myapp-tenant-b' \
 -Dmyapp.readmodel.rdbms.tenants.tenant-b.db.user='dbuser_b' \
 -Dmyapp.readmodel.rdbms.tenants.tenant-b.db.password='dbpass@b' \
+-Dlerna.akka.entityreplication.raft.compaction.log-size-threshold=10 \
+-Dlerna.akka.entityreplication.raft.compaction.preserve-log-size=3 \
+-Dlerna.akka.entityreplication.raft.compaction.log-size-check-interval=10s \
 entrypoint/run

3. Run lerna-sample-account-app

4. Run the following script to make HTTP requests

The script does not stop on its own.
Stop it manually after about 30 to 60 seconds.

#!/usr/bin/env bash
set -e
while :
do
  curl \
      --silent \
      --show-error \
      --request 'POST' \
      --header 'X-Tenant-Id: tenant-a' \
      --url "http://127.0.0.1:9001/accounts/$(date '+%s')/deposit?transactionId=$(date '+%s')&amount=100"
  sleep 0.5s
done

Some of the nodes (apps) log dead letters like the ones below, and keep logging them even after the script has been stopped:

BankAccountBehavior(1680662388) is passivating.
BankAccountBehavior(1680662432) is passivating.
2023-04-05 12:08:44.184 INFO    akka.actor.LocalActorRef        -       -       -       Message [lerna.akka.entityreplication.ReplicationRegion$Passivate] to Actor[akka://MyAppSystem/system/sharding/raft-shard-BankAccount-tenant-a-replica-group-3/74/74#-934970431] was unhandled. [1004] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.

2023-04-05 12:08:44.185 INFO    akka.actor.LocalActorRef        -       -       -       Message [lerna.akka.entityreplication.ReplicationRegion$Passivate] to Actor[akka://MyAppSystem/system/sharding/raft-shard-BankAccount-tenant-a-replica-group-3/74/74#-934970431] was unhandled. [1005] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.

Possible causes

An entity on a follower is running even after the corresponding entity on the leader was passivated. This could happen in the following scenarios:

  • A follower's Raft log compaction starts an entity.
  • A follower's event replay (of Akka Persistence) starts an entity.

Possible solutions

  • RaftActor passivates entities after Raft log compaction and after event replay (of Akka Persistence).
  • Non-leaders (followers and candidates) handle ReplicationRegion$Passivate so that the sending entity is eventually passivated.
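
The second option can be illustrated with a small, dependency-free model. This is only a sketch under stated assumptions, not the lerna-akka-entityreplication implementation: the names `PassivateSketch`, `ShardModel`, `handlePassivate`, and `unhandledCount` are hypothetical, and the model assumes that an idle entity's receive timeout keeps firing and sends Passivate again each time. It shows why a follower that ignores Passivate produces an unbounded stream of unhandled messages, while a follower that handles it stops the entity on the first one.

```scala
// Simplified model only. Every name in this object is hypothetical and does
// not correspond to the real lerna-akka-entityreplication API.
object PassivateSketch {
  sealed trait Role
  case object Leader    extends Role
  case object Follower  extends Role
  case object Candidate extends Role

  sealed trait Outcome
  case object Stopped   extends Outcome // the entity was stopped
  case object Unhandled extends Outcome // the message would be logged as unhandled

  final class ShardModel(val role: Role, nonLeadersHandlePassivate: Boolean) {
    private var running = Set.empty[String]

    // Models compaction or event replay starting an entity, whatever the role.
    def startEntity(id: String): Unit = running += id

    def isRunning(id: String): Boolean = running.contains(id)

    def handlePassivate(id: String): Outcome = role match {
      case Leader =>
        running -= id
        Stopped
      case Follower | Candidate if nonLeadersHandlePassivate =>
        running -= id
        Stopped
      case _ =>
        // The entity keeps running, so (by assumption) its next receive
        // timeout sends Passivate again.
        Unhandled
    }
  }

  // Counts the Passivate messages left unhandled over `timeouts` consecutive
  // receive timeouts of a single entity.
  def unhandledCount(shard: ShardModel, id: String, timeouts: Int): Int =
    (1 to timeouts).count { _ =>
      shard.isRunning(id) && shard.handlePassivate(id) == Unhandled
    }

  def main(args: Array[String]): Unit = {
    val reported = new ShardModel(Follower, nonLeadersHandlePassivate = false)
    reported.startEntity("74")
    println(unhandledCount(reported, "74", 5)) // prints 5

    val proposed = new ShardModel(Follower, nonLeadersHandlePassivate = true)
    proposed.startEntity("74")
    println(unhandledCount(proposed, "74", 5)) // prints 0
  }
}
```

In the model, the first option (passivating right after compaction or replay) would correspond to removing the entity from the running set immediately after `startEntity`, so that no receive timeout fires on the follower at all.
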

xirc added this to the v2.3.0 milestone May 9, 2023
xirc added the bug (Something isn't working) label May 9, 2023