UNL Member Operational Diagnostics #213
mtrippled started this conversation in Ideas (pre standard proposal)
This discussion is intended to help resolve a recurring issue in XRP Ledger operation: intermittent instability of UNL nodes. If the instability affects enough nodes to cross quorum thresholds, the network is delayed in processing transactions. Generally, the most effective way to diagnose this type of issue is to analyze the debug log.
However, there are several things that make this difficult, including:
I want to offer a simple solution that should be implementable relatively quickly, as follows:
a. Have a distinct log level or log entry type that always logs. This could be a new log level, above FATAL, that cannot be disabled. Or it could be some other mechanism internal to rippled that accomplishes the same thing.
b. Place in this log level essential information for initial diagnostics of a stability issue pertaining to consensus.
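To illustrate points (a) and (b), here is a minimal sketch using Python's standard `logging` module. It is not rippled's C++ logging code. The `DIAG` level name, the `AlwaysDiagLogger` class and the `make_logger` helper are all hypothetical names I made up for illustration. The idea is just a level ranked above the most severe built-in one, plus an override so that raising the threshold does not silence it.

```python
import io
import logging

# Hypothetical level ranked above the most severe built-in level
# (CRITICAL here, standing in for FATAL in the proposal).
DIAG = logging.CRITICAL + 10
logging.addLevelName(DIAG, "DIAG")


class AlwaysDiagLogger(logging.Logger):
    """Logger whose DIAG entries ignore the configured threshold."""

    def isEnabledFor(self, level):
        # DIAG (and anything above it) is always enabled, regardless of
        # setLevel() or logging.disable(); other levels behave normally.
        if level >= DIAG:
            return True
        return super().isEnabledFor(level)

    def diag(self, msg, *args, **kwargs):
        self._log(DIAG, msg, args, **kwargs)


def make_logger(stream):
    """Build a standalone logger writing 'LEVEL message' lines to stream."""
    logger = AlwaysDiagLogger("consensus")
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    out = io.StringIO()
    log = make_logger(out)
    log.setLevel(logging.CRITICAL + 100)  # operator tries to silence everything
    log.error("ordinary entry, filtered out")
    log.diag("essential consensus diagnostic, still written")
    print(out.getvalue(), end="")
```

The design choice here is to bypass the threshold check only for the one diagnostic level, so existing log configuration keeps working for everything else.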
Here's how I picture this being implemented. Namely, there are several steps in consensus that are required to work consistently, with each node in close to lock step. The information does not need to be extremely verbose, but should include the following (all having timestamps and some measuring wall clock durations):
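As a rough sketch of what a timestamped entry with a measured duration could look like, here is a small Python helper. Everything in it is an assumption for illustration: the `timed_phase` name, the output format, and the `example_phase` label are hypothetical and do not correspond to actual rippled consensus steps or log lines.

```python
import time
from contextlib import contextmanager
from datetime import datetime, timezone


@contextmanager
def timed_phase(name, emit=print, clock=time.monotonic):
    """Emit one line with a UTC start timestamp and the elapsed time."""
    # Timestamp for correlating entries across nodes.
    started_at = datetime.now(timezone.utc).isoformat(timespec="milliseconds")
    # Monotonic clock for the duration, so system clock adjustments
    # during the phase do not distort the measurement.
    t0 = clock()
    try:
        yield
    finally:
        elapsed_ms = (clock() - t0) * 1000.0
        emit(f"{started_at} DIAG phase={name} duration_ms={elapsed_ms:.1f}")


if __name__ == "__main__":
    with timed_phase("example_phase"):  # hypothetical phase label
        time.sleep(0.05)
```

Keeping each entry to a single line with a fixed `key=value` shape would make it easy to compare the same step across several operators' logs.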
This proposal is not attempting to do the following:
Those are big efforts. Instead, this proposal aims to make it relatively easy for us to help each other out with diagnostics that provide a starting point for troubleshooting, rather than flying blind with speculation. I think the rippled part can be implemented relatively quickly, within a couple of days, and then folded into the next release. Node operators would then need no special configuration. The only cost is a slight increase in log verbosity.