-
Notifications
You must be signed in to change notification settings - Fork 782
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Critical dependency logging #4988
Critical dependency logging #4988
Conversation
…itical-dependency-logging
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this needs some changes to correctly obtain at least the gossipsub messages, there are also some open questions about other deps
match *meta.level() { | ||
tracing_core::Level::INFO => { | ||
let _ = file_writer.write(visitor.message); | ||
} | ||
tracing_core::Level::WARN => { | ||
let _ = file_writer.write(visitor.message); | ||
} | ||
tracing_core::Level::ERROR => { | ||
let _ = file_writer.write(visitor.message); | ||
} | ||
_ => {} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are you filtering out debug messages?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no real reason, I'll add debug messages
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't think you should be filtering here by level, since that can be controlled with the RUST_LOG
flag
|
||
lazy_static! { | ||
pub static ref TRACING_LOGGING_DEPENDENCIES: Vec<String> = | ||
vec!["libp2p".to_string(), "discv5".to_string()]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note that none of the libp2p
dependencies has a target/module path named libp2p. I hacked my changes to get metrics on debug to showcase some of the names we will actually get
dep_debug_total{target="libp2p_core"} 11
dep_debug_total{target="libp2p_dns"} 8
dep_debug_total{target="libp2p_gossipsub"} 273
dep_debug_total{target="libp2p_mplex"} 31
dep_debug_total{target="libp2p_quic"} 2
dep_debug_total{target="libp2p_swarm"} 15
dep_debug_total{target="libp2p_tcp"} 13
we probably need to decide which of these we want and what to do with enabled logs of deps not on the list @AgeManning
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm now capturing libp2p_gossipsub, but can make changes to include some, or all, of the libp2p dependencies based on what we decide
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think just gossipsub is fine initially. We can maybe add some others in later if things get really spicy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome! This looks great! Thanks a lot!
Had a few comments, some are just questions :)
let _ = NonBlockingFileWriter::clear_file(&path); | ||
} | ||
|
||
if let Err(e) = writeln!(file, "{}", message) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it might be useful to tag on the time here.
Typically when things go wrong, they break at some specific time. It would be very useful to have a timestamp so we can cross-reference with metrics/lighthouse logs.
} | ||
|
||
fn clear_file(path: &PathBuf) -> std::io::Result<()> { | ||
File::create(path)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible for us to maybe just nuke half the file?
Just worried about the case that we get a bug, but we have just recently removed all our logs and have no history.
lighthouse/src/main.rs
Outdated
if matches.is_present("env_log") { | ||
Builder::from_env(Env::default()).init(); | ||
eprintln!("The -l flag is DEPRECATED. Dependency logging will be on by default."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can probably just remove this conditional altogether.
I dont think we ever use logging to terminal outside of slog, so probably not worth having this exception here. I'm not sure anyone is really using this flag anyway, except for the devs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok talked to paul and he tells me you discussed this. We are then NOT removing the flag, simply the warning. Leaving here for the record
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done in c222b36
@@ -214,6 +224,56 @@ impl TimeLatch { | |||
} | |||
} | |||
|
|||
pub fn create_tracing_layer(logfile_max_size: u64, base_tracing_log_path: PathBuf) { | |||
let filter_layer = match tracing_subscriber::EnvFilter::try_from_default_env() | |||
.or_else(|_| tracing_subscriber::EnvFilter::try_new("warn")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we use debug by default?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this one's for me, will change to debug
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done in c222b36
} | ||
}; | ||
|
||
for message in receiver { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I'm missing something here.
Shouldn't we have a loop here somewhere. It seems like this thread just reads all the messages, writes them, then ends?
It also looks like there is one of these threads per dependency/file. I was expecting an await, or perhaps this tasks continually gets created, but I only saw one initialisation. I'm likely missing something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we have a loop here somewhere. It seems like this thread just reads all the messages, writes them, then ends?
As far as i understand it, the mpsc channel creates a sender and receiver. The receiver blocks until a message is available (as long as sender still exists).
https://doc.rust-lang.org/std/sync/mpsc/fn.channel.html
So that for message in receiver
loop blocks until a message is received and should never end as long as the sender exists.
It also looks like there is one of these threads per dependency/file. I was expecting an await, or perhaps this tasks continually gets created, but I only saw one initialisation. I'm likely missing something.
There is one thread per dependency, the stream initialization per dependency happens here
Might be a bit overkill to spawn a new stream per dependency, I can change it to one stream for all dependencies if that makes more sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah yeah all good. Sorry have been in tokio async land too long and forgot about std channels :p
common/logging/src/lib.rs
Outdated
{ | ||
Ok(filter) => filter, | ||
Err(e) => { | ||
eprintln!("Failed to initialize dependency tracing {e}"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think at this point we don't have a useable slogger, so we are forced into prtinln! right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep that's the reason
@divagant-martian has suggested some traits that might make this a bit nicer. There is this trait: https://docs.rs/tracing-subscriber/0.3.18/tracing_subscriber/fmt/writer/trait.MakeWriter.html which might help with logging fancy logs and timestamps etc. There is this one: https://docs.rs/tracing-appender/latest/tracing_appender/non_blocking/struct.NonBlocking.html And this one: https://docs.rs/tracing-appender/latest/tracing_appender/rolling/struct.RollingFileAppender.html Which might help with writing to a file with in a rolling fashion so we can maintain the size rather than clearing half periodically, etc. |
those suggestions seem to have simplified some things. I think this should be ready for another review |
discv5 logs
libp2p logs
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. This looks much cleaner.
We are doing some releases at the moment. So will get this guy in once we're accepting more code.
Issue Addressed
#4954
Proposed Changes
This PR proposes adding logs for the following dependencies
Each dependency has its own separate log file. Using the tracing appender library a non blocking file stream is created for each dependency. On each tracing event a timestamp, log level and log message are written to the respective log file. The files are rotated daily
Example discv5 logs
Example libp2p-gossipsub logs
Additional Info
This builds on top of changes made in #4979. It may be a bit difficult to review until that PR is merged
Most of my changes can be found in the following files
lighthouse/src/main.rs
lighthouse/common/logging/src/lib.rs
lighthouse/common/logging/src/tracing_logging_layer.rs