sim-lib: continue simulation on send payment errors #104

okjodom · 2023-09-15T23:29:09Z

Instead of stopping the simulation on the first send payment RPC error, we handle the error gracefully and report a default send payment result. Since the send payment RPC might fail before dispatching a payment (e.g. in the CLN case), we allow activity result reporting to skip payment tracking if there's no payment hash.

Sample results with failed keysends. The simulation continues until stopped:

source,destination,hash,amount_msat,dispatch_time,htlc_count,payment_outcome
023d45dc57ed99e6d793e2fed8dcf79d2f6ad2f32ec31b48102de445e245e08336,03b73874dde2c0af24619f03b6fc5d36adbf00ad6d25ce6479d4f5b60f648c8cb6,83d340ab8a7f7b383294dd06a8fc02bcbff6c77c73debaa8e590aad550216f98,1000,1695160575397,1,Success
02c0c854f511082f9d174f9b0e51ca14fc299c25de9d65dddadebe9ac1fd529758,03b73874dde2c0af24619f03b6fc5d36adbf00ad6d25ce6479d4f5b60f648c8cb6,Unknown,10000000000,1695160576400,0,Unknown
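
A minimal sketch of how a failed send could end up as the second row above, assuming the payment hash is optional and the default result is zero HTLCs with an unknown outcome. The type and function names (`Payment`, `PaymentResult`, `needs_tracking`, `csv_row`) are illustrative assumptions and not necessarily what sim-lib uses:

```rust
/// Outcome column of the results CSV (only the variants seen in the sample).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum PaymentOutcome {
    Success,
    Unknown,
}

/// Hypothetical per-payment result; the default mirrors the failed row above.
#[derive(Debug, Clone, PartialEq)]
pub struct PaymentResult {
    pub htlc_count: usize,
    pub payment_outcome: PaymentOutcome,
}

impl Default for PaymentResult {
    fn default() -> Self {
        PaymentResult {
            htlc_count: 0,
            payment_outcome: PaymentOutcome::Unknown,
        }
    }
}

/// Hypothetical payment record; `hash` is `None` when the RPC failed
/// before a payment was dispatched.
#[derive(Debug, Clone, PartialEq)]
pub struct Payment {
    pub source: String,
    pub destination: String,
    pub hash: Option<String>,
    pub amount_msat: u64,
    pub dispatch_time_ms: u64,
}

/// Only payments with a hash can be tracked on the node.
pub fn needs_tracking(payment: &Payment) -> bool {
    payment.hash.is_some()
}

/// Render one CSV row in the column order of the sample header.
pub fn csv_row(payment: &Payment, result: &PaymentResult) -> String {
    let hash = payment.hash.as_deref().unwrap_or("Unknown");
    format!(
        "{},{},{},{},{},{},{:?}",
        payment.source,
        payment.destination,
        hash,
        payment.amount_msat,
        payment.dispatch_time_ms,
        result.htlc_count,
        result.payment_outcome
    )
}
```

With this shape, a send error produces a `Payment` with `hash: None`, tracking is skipped, and `PaymentResult::default()` is written straight to the results.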

TODO:

Continue simulation on send payment errors
Formalize terminal error handling
Parse CLN send_payment rpc errors to determine terminal failures
Parse LND send_payment rpc errors to determine terminal failures

fixes #96

carlaKC

Thanks for picking this up! Sorry that I gave you rebase conflicts :')

sim-lib/src/cln.rs

carlaKC

Approach looks good!

I do think that we need to take a slightly more subtle approach to handling errors (right now we'll just create failed events for any send_payment failure). In my mind there are two types of errors:

Terminal errors: the lightning node has shut down -> we can't recover and should also shut down.
Payment errors: the API has returned an error, but it's actually just a pathfinding error (eg: no route) -> we can recover from this, and should record the failure and try again.

Right now, if one of the lightning nodes shuts down we'll just continue to track failed payments when we should really just shut down. To address #96, we need to examine the error from CLN and figure out whether it's terminal or not. So basically all of this code as-is, with some more precise error handling in the CLN impl.
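
To make the two categories concrete, here is a rough sketch of classifying a send error and choosing what to do next. Everything in it is an assumption for illustration: the `SendError` and `Action` enums are hypothetical, and the matched substrings are placeholders, not actual CLN or LND error messages (parsing those is listed as a TODO on this PR):

```rust
/// Hypothetical split of send_payment failures into the two categories.
#[derive(Debug, Clone, PartialEq)]
pub enum SendError {
    /// The node is gone; the simulation cannot recover.
    Terminal(String),
    /// The node is healthy but this payment failed (e.g. no route).
    Payment(String),
}

/// What the simulation should do after an error.
#[derive(Debug, PartialEq)]
pub enum Action {
    RecordFailureAndContinue,
    Shutdown,
}

/// Classify a raw RPC error string.
/// NOTE: the substrings below are placeholders for illustration only.
pub fn classify(raw: &str) -> SendError {
    let lower = raw.to_lowercase();
    if lower.contains("connection refused") || lower.contains("shutting down") {
        SendError::Terminal(raw.to_string())
    } else {
        SendError::Payment(raw.to_string())
    }
}

/// Terminal errors stop the run; payment errors are recorded and retried later.
pub fn next_action(err: &SendError) -> Action {
    match err {
        SendError::Terminal(_) => Action::Shutdown,
        SendError::Payment(_) => Action::RecordFailureAndContinue,
    }
}
```

Treating anything unrecognised as a payment error keeps the simulation running by default; the opposite default would also be defensible.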

sim-lib/src/lib.rs

okjodom · 2023-09-21T14:52:27Z

I do think that we need to take a slightly more subtle approach to handling errors (right now we'll just create failed events for any send_payment failure). In my mind there are two types of errors:

Terminal errors: the lightning node has shut down -> we can't recover and should also shut down.

Payment errors: the API has returned an error, but it's actually just a pathfinding error (eg: no route) -> we can recover from this, and should record the failure and try again.

@carlaKC I agree with your sentiment above, but I have a slightly different proposal regarding the interpretation of terminal errors.

RE: Terminal errors: the lightning node has shut down -> we can't recover and should also shut down.

Instead of shutting down the entire simulation, I propose we stop simulating the activity or activity set affected by nodes we have seen terminal errors on. This shuts down the affected parts of the simulation but keeps the still-good parts running.

Right now, if one of the lightning nodes shuts down we'll just continue to track failed payments when we should really just shut down. To address #96, we need to examine the error from CLN and figure out whether it's terminal or not. So basically all of this code as-is, with some more precise error handling in the CLN impl.

So building on this PR, my proposal is that we stop simulating activity related to the CLN node if we observe a terminal failure on the node.

Oh, and we terminate the simulation if there are no more activities to simulate.
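
A small sketch of that proposal, under the assumption that an activity is "affected" when either its source or its destination has hit a terminal error (the thread does not pin this down). `Activity`, `prune_activities` and `should_terminate` are hypothetical names:

```rust
use std::collections::HashSet;

/// Hypothetical description of one simulated payment flow.
#[derive(Debug, Clone, PartialEq)]
pub struct Activity {
    pub source: String,
    pub destination: String,
    pub amount_msat: u64,
}

/// Drop every activity that touches a node with a terminal error.
pub fn prune_activities(activities: Vec<Activity>, failed: &HashSet<String>) -> Vec<Activity> {
    activities
        .into_iter()
        .filter(|a| !failed.contains(&a.source) && !failed.contains(&a.destination))
        .collect()
}

/// The simulation ends once nothing is left to simulate.
pub fn should_terminate(remaining: &[Activity]) -> bool {
    remaining.is_empty()
}
```

Each time a node reports a terminal error it is added to `failed`, the activity list is pruned, and the run stops only when `should_terminate` returns true.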

sim-lib/src/lib.rs

carlaKC

Just read through the shutdown strategy commit!

sim-lib/src/lib.rs

carlaKC · 2023-09-26T13:24:45Z

f89506a looking nice 👌 thanks for going through a few iterations to get it there!

If you want to go ahead and fill in the parsing I think we can do this all in one PR?

okjodom · 2023-09-26T13:36:57Z

f89506a looking nice 👌 thanks for going through a few iterations to get it there!

If you want to go ahead and fill in the parsing I think we can do this all in one PR?

I was planning the parsing as follow-ups when I get a chance to build on this.

Starting w cln because that's the most offending

carlaKC · 2023-09-26T13:54:31Z

I was planning the parsing as follow-ups when I get a chance to build on this.

SGTM, no strong feelings.

Starting w cln because that's the most offending

GLHF :')

sr-gi

Looks pretty good overall. Check some comments inline.

f5e9eaa and 5bae876 should be squashable with f89506a

sim-lib/src/cln.rs

sim-lib/src/lib.rs

sim-lib/src/cln.rs

carlaKC

tACK against the error originally described in #96

 ERROR [sim_lib] Error while sending payment 0327b7826a232ba4c7032f659ae0b65528f9b9f3191c54cbea423a207cff150029 -> 0346b6eeeaa8f13a64e4bff924a2cb9db94e01b3cc1819970c1f3e8cb4eabb3f5f

Simulation output:

0327b7826a232ba4c7032f659ae0b65528f9b9f3191c54cbea423a207cff150029,0346b6eeeaa8f13a64e4bff924a2cb9db94e01b3cc1819970c1f3e8cb4eabb3f5f,2000,Unknown,1696256242579,0,Unknown

LGTM pending @sr-gi's comments. Nothing blocking in my review except for shutting down if we can't send a payment result in produce_simulation_results (I'm also happy to pick up in followup, as shutdown hygiene needs some general work IMO).

sim-lib/src/lib.rs

carlaKC · 2023-10-02T14:01:19Z

sim-lib/src/lib.rs

+                                if results.clone().send((payment, result)).await.is_err() {
+                                    log::debug!("Could not send payment result");
+                                }


This is pre-existing, but I think we should break if we can't send into the results channel? Because something has exited downstream.

Note to future selves: We don't currently do this for trackpayment but I don't think that's actually correct. I want to do a run-through of all our shutdown logic because I think using channels has made it overly complex, so we can deal with trackpayment separately.

made the same note

This is pre-existing, but I think we should break if we can't send into the results channel? Because something has exited downstream.

I think we should only break out of these listeners if shutdown is called. If something exits with finality, it should trigger shutdown, which leads to break(s), which can lead to channel send failures (because we bias to shutdowns?), but we log these send failures like we do here.
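
The two positions in this thread can be compared with a small sketch. It uses `std::sync::mpsc` instead of the async channel in the diff so that it stays self-contained, and `forward_results`, `SendStats` and the `break_on_send_error` flag are hypothetical. In both kinds of channel, a send fails once the receiving side has been dropped:

```rust
use std::sync::mpsc::Sender;

/// Counts of what happened while forwarding results.
#[derive(Debug, PartialEq)]
pub struct SendStats {
    pub delivered: usize,
    pub failed: usize,
}

/// Forward items into the results channel.
/// `break_on_send_error = true` stops at the first failed send;
/// `false` logs the failure and keeps going, leaving the exit to a
/// separate shutdown signal (not modelled here).
pub fn forward_results(results: &Sender<u64>, items: &[u64], break_on_send_error: bool) -> SendStats {
    let mut stats = SendStats { delivered: 0, failed: 0 };
    for item in items {
        if results.send(*item).is_err() {
            // The receiver has been dropped, so something downstream has exited.
            eprintln!("Could not send payment result");
            stats.failed += 1;
            if break_on_send_error {
                break;
            }
            continue;
        }
        stats.delivered += 1;
    }
    stats
}
```

With a dropped receiver, the first policy stops after one failed send while the second attempts (and logs) every remaining item until shutdown is signalled some other way.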

sim-lib/src/lib.rs

sim-lib/src/cln.rs

sr-gi

A few comments, mostly nits, LGTM otherwise

sim-lib/src/lib.rs

sr-gi · 2023-10-02T20:11:53Z

sim-lib/src/lib.rs

                                set.spawn(track_payment_result(
-                                    source_node,results.clone(),simulation_output, shutdown.clone(),
+                                    source_node,results.clone(),payment, shutdown.clone(),


Not sure why fmt is not catching this, but there are some spaces missing here.

fmt is unreliable in select :(

hm big facepalm

carlaKC

tACK fb981a2 🎉

sr-gi · 2023-10-03T19:38:25Z

ACK fb981a2

okjodom force-pushed the continue-sim branch from 1766ffd to bacc835 Compare September 19, 2023 21:58

okjodom requested a review from carlaKC September 19, 2023 22:00

okjodom marked this pull request as ready for review September 19, 2023 22:00

carlaKC reviewed Sep 20, 2023

okjodom force-pushed the continue-sim branch from bacc835 to 0615f4b Compare September 21, 2023 01:26

carlaKC reviewed Sep 21, 2023

carlaKC reviewed Sep 21, 2023

okjodom force-pushed the continue-sim branch from 0615f4b to 988f908 Compare September 21, 2023 15:28

okjodom commented Sep 22, 2023

carlaKC reviewed Sep 25, 2023

okjodom force-pushed the continue-sim branch 2 times, most recently from e2b66c8 to f89506a Compare September 25, 2023 22:02

okjodom marked this pull request as draft September 26, 2023 18:38

okjodom force-pushed the continue-sim branch 2 times, most recently from d433227 to f5e9eaa Compare September 28, 2023 19:48

okjodom marked this pull request as ready for review September 28, 2023 19:49

sr-gi requested changes Sep 29, 2023

carlaKC reviewed Oct 2, 2023

okjodom force-pushed the continue-sim branch 2 times, most recently from 6993085 to 37652cf Compare October 2, 2023 18:57

sr-gi reviewed Oct 2, 2023

okjodom force-pushed the continue-sim branch from 37652cf to 4e7b761 Compare October 2, 2023 21:12

okjodom requested review from carlaKC and sr-gi October 3, 2023 13:41

okjodom force-pushed the continue-sim branch from 4e7b761 to fb981a2 Compare October 3, 2023 19:26

carlaKC approved these changes Oct 3, 2023

sr-gi approved these changes Oct 3, 2023

carlaKC merged commit bc0f8ed into bitcoin-dev-project:main Oct 4, 2023

carlaKC mentioned this pull request Oct 4, 2023

Random Activity Generator #113

Merged
