Add option to send `stdout` as node/operator output #388

haixuanTao · 2023-12-01T18:34:56Z

This PR makes it possible for nodes to receive the logs of other nodes.

For example, logs can be shown within the plotting window, as demonstrated in the included example.

The aim is to simplify debugging when using hot-reloading features, and possibly to support an auto-debugger further in the future.

phil-opp

I like the idea that you can subscribe to log outputs of other operators/nodes, but I'm not sure if the special op/logs output is the best approach for this. How about we introduce a node-level capture_logs switch to enable the sending of the DoraEvent::Logs messages? This way, it would be more obvious that some dora-provided functionality is used.

For subscribers we could make the logs available through a new built-in input stream, e.g. named dora/logs/<node_id>. This would make it clear that this is some data provided by dora, not a normal node/operator output.

What do you think? I can implement the above if you like.

binaries/daemon/src/spawn.rs

examples/python-operator-dataflow/dataflow.yml

haixuanTao · 2023-12-11T10:46:15Z

Commenting on dora/logs/<node_id>, I think that this might create some hidden connections between nodes that can later be hard to manage. I would prefer if there were only one way of creating links between nodes. Maybe we can name the output: captured_logs or stdout?

And sure, go ahead if you want to do the implementation.

phil-opp · 2023-12-12T20:30:36Z

> Commenting on dora/logs/<node_id>, I think that this might create some hidden connections between nodes that can later be hard to manage. I would prefer if there were only one way of creating links between nodes.

Good point! How about we introduce a capture_logs_to: output_name config key, which allows users to specify the output that they want to use for logs? This way, we have all outputs still listed under the outputs key without a "magic" output name.

Apart from the syntax that we want to use, there is a second challenge. Right now we capture the output per node, not per operator. So the operator logs would also contain the outputs of other operators on that node.

Ideally we would split the output properly, but this is difficult since stdout is set per process. Maybe we can make the dora runtime print special separator messages before and after calling into an operator, which the daemon could then use to split the output by operator... I'll look into it.
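A minimal sketch of the separator idea, not dora's implementation: the marker strings, `call_operator`, and `split_by_operator` are all hypothetical names chosen for illustration. The runtime side wraps each operator call in marker lines, and the daemon side groups the captured lines by operator.

```python
import io
from collections import defaultdict

# Hypothetical marker format; the thread does not specify one.
BEGIN_PREFIX = "<<dora-op-begin:"
BEGIN_SUFFIX = ">>"
END = "<<dora-op-end>>"


def call_operator(operator_id, func, out):
    """Runtime side: print separator lines around one operator call."""
    print(f"{BEGIN_PREFIX}{operator_id}{BEGIN_SUFFIX}", file=out)
    try:
        func(out)
    finally:
        # Always close the section, even if the operator raises.
        print(END, file=out)


def split_by_operator(captured):
    """Daemon side: group captured stdout lines by operator id.

    Lines printed outside any operator call are stored under the key None.
    """
    per_operator = defaultdict(list)
    current = None
    for line in captured.splitlines():
        if line.startswith(BEGIN_PREFIX) and line.endswith(BEGIN_SUFFIX):
            current = line[len(BEGIN_PREFIX):-len(BEGIN_SUFFIX)]
        elif line == END:
            current = None
        else:
            per_operator[current].append(line)
    return dict(per_operator)


if __name__ == "__main__":
    buf = io.StringIO()
    call_operator("plot", lambda out: print("drawing frame", file=out), buf)
    call_operator("detect", lambda out: print("found 2 boxes", file=out), buf)
    print(split_by_operator(buf.getvalue()))
```

One limitation of this approach is that an operator printing a line that looks like a marker would confuse the split, and operators running concurrently in one process would interleave their sections.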

haixuanTao · 2023-12-13T09:39:45Z

Okay! I think this could also be useful for OpenTelemetry logs in the future, to tag logs at the operator level as well.

Shall we maybe merge a first version at the custom node level first?

We can then prioritize the operator level in the future?

phil-opp · 2023-12-13T11:49:29Z

> Shall we maybe merge a first version at the custom node level first?

Sure, I'm fine with that.

phil-opp · 2023-12-13T12:19:59Z

> How about we introduce a capture_logs_to: output_name config key, which allows users to specify the output that they want to use for logs? This way, we have all outputs still listed under the outputs key without a "magic" output name.

I opened a draft PR at #392 to add a new send_stdout_as: <output_name> to custom nodes.
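To illustrate the general mechanism behind a `send_stdout_as` key, here is a rough sketch under stated assumptions: `forward_stdout` is a hypothetical helper, not part of dora, and it simply pairs every stdout line of a child process with the configured output name. How the dora daemon actually captures and forwards output is not shown in this thread.

```python
import subprocess
import sys


def forward_stdout(command, send_stdout_as):
    """Run `command` and yield (output_name, line) for each stdout line.

    Hypothetical helper: a real daemon would publish each line as a
    message on the named output instead of yielding it.
    """
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, text=True)
    assert proc.stdout is not None
    for line in proc.stdout:
        yield send_stdout_as, line.rstrip("\n")
    proc.wait()


if __name__ == "__main__":
    # Stand-in for a custom node: a child Python process that prints twice.
    child = [sys.executable, "-c", "print('hello'); print('world')"]
    for output_name, line in forward_stdout(child, "stdout"):
        print(f"{output_name}: {line}")
```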

I decided on stdout instead of log in the name because I would like to implement a custom logger for dora at some point. For Rust code, this would mean that we set a custom log logger, which gives us more context than a plain println (e.g. proper start/end for multi-line messages, log level, source location, etc.). For Python, we could provide a dora.log function that should be used instead of calling print. This function would also accept an optional log level (info, warn, err, etc.). Using this approach, we could easily assign log messages to operators, even in runtimes with multiple operators. What do you think, @haixuanTao?
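As a sketch of what such a Python logging function might look like (purely hypothetical: the signature, the level names, and the JSON record layout below are assumptions, not an existing dora API), each call could emit one structured record, so multi-line messages stay in a single record and carry their level and operator id.

```python
import json
import sys
import time

# Assumed level names; the thread only mentions "info, warn, err, etc."
LEVELS = ("trace", "debug", "info", "warn", "error")


def log(message, level="info", operator_id=None, stream=None):
    """Write one JSON record per call and return it.

    JSON encoding escapes newlines, so a multi-line message still
    occupies exactly one line of the stream.
    """
    if level not in LEVELS:
        raise ValueError(f"unknown log level: {level!r}")
    record = {
        "ts": time.time(),
        "level": level,
        "operator": operator_id,
        "message": message,
    }
    out = sys.stdout if stream is None else stream
    out.write(json.dumps(record) + "\n")
    return record


if __name__ == "__main__":
    log("line one\nline two", level="warn", operator_id="plot")
```

Because every record names its operator, a consumer could filter messages per operator without relying on process-wide stdout.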

haixuanTao · 2023-12-13T13:10:25Z

Sure! I think we can separate stdout and log.

FYI, for Python, we can do something along the lines of tracing_for_pyo3_logging: https://docs.rs/tracing-for-pyo3-logging/latest/src/tracing_for_pyo3_logging/lib.rs.html#1-63

phil-opp · 2023-12-13T14:55:10Z

> FYI, for Python, we can do something along the lines of tracing_for_pyo3_logging: https://docs.rs/tracing-for-pyo3-logging/latest/src/tracing_for_pyo3_logging/lib.rs.html#1-63

Ah, that's nice!

haixuanTao · 2023-12-13T15:22:52Z

Relevant: https://docs.rs/pyo3-opentelemetry/0.2.0/src/pyo3_opentelemetry/lib.rs.html#16-186

haixuanTao · 2023-12-18T11:21:17Z

Maybe we can also create a features configuration key for nodes, similar to how Cargo.toml has feature flags for crates, and stdout or logs could be one of them. In the future, we could extend it with more configuration flags.

haixuanTao · 2023-12-19T13:30:52Z

Or it could be an environment variable set for the specific node.

haixuanTao · 2024-02-28T11:38:34Z

So, I think that we can merge this. I have added the possibility to send_stdout_as for operators and added warnings that we do not split stdout between operators.
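Since stdout is not split between operators, a later commit in this PR makes send_stdout_as fail if a runtime node has more than one entry. A minimal sketch of that kind of check, assuming operators are plain dicts with an `id` and an optional `send_stdout_as` field (the function name, the dict layout, and the `operator_id/output` return format are all hypothetical):

```python
def resolve_send_stdout_as(operators):
    """Return the single stdout output of a runtime node, or None.

    Raises ValueError if more than one operator sets `send_stdout_as`,
    because the captured stdout cannot be attributed to one operator.
    """
    entries = [
        (op["id"], op["send_stdout_as"])
        for op in operators
        if op.get("send_stdout_as")
    ]
    if len(entries) > 1:
        ids = ", ".join(op_id for op_id, _ in entries)
        raise ValueError(
            f"send_stdout_as is set on more than one operator ({ids}); "
            "stdout is captured per process and cannot be split"
        )
    if not entries:
        return None
    op_id, output = entries[0]
    # Assumed naming scheme for the resulting output.
    return f"{op_id}/{output}"
```

Failing early at configuration time avoids silently mixing the output of several operators into one stream.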

phil-opp

There are some unused import warnings that we should fix, otherwise this looks good to me. Thanks!

libraries/core/src/descriptor/mod.rs

haixuanTao self-assigned this Dec 5, 2023

haixuanTao marked this pull request as ready for review December 5, 2023 16:14

phil-opp reviewed Dec 8, 2023

binaries/daemon/src/spawn.rs (outdated, resolved)

binaries/daemon/src/spawn.rs (outdated, resolved)

examples/python-operator-dataflow/dataflow.yml (outdated, resolved)

phil-opp mentioned this pull request Dec 13, 2023

Add new send_stdout_as key for capturing stdout of custom nodes #392

Merged

haixuanTao force-pushed the output-logs branch from b93edb1 to e85ce44 Compare February 12, 2024 19:43

haixuanTao force-pushed the output-logs branch from d2ac2f6 to 9b3b7af Compare February 28, 2024 11:33

haixuanTao and others added 5 commits February 28, 2024 12:36

copy_array_into_sample do not need to return a result

629a218

Adding log event

91bd7da

Adding log example

61bdd4b

Add new send_stdout_as key for capturing stdout of custom nodes

daa694a

Add possibility to send stdout for operators and add warnings when there is multiple operators

aa81da0

haixuanTao force-pushed the output-logs branch from 9b3b7af to aa81da0 Compare February 28, 2024 11:37

Replace logs with stdout

1e659c6

phil-opp changed the title ~~Output logs~~ Add option to send stdout as node/operator output Feb 28, 2024

phil-opp approved these changes Feb 28, 2024

libraries/core/src/descriptor/mod.rs (outdated, resolved)

haixuanTao added 2 commits February 29, 2024 14:21

Removing unused imports

12af6a1

Make send_stdout_as fail if there is more than one entry for a runtime node

b32a7e4

haixuanTao enabled auto-merge February 29, 2024 13:47

haixuanTao merged commit 2615b04 into main Feb 29, 2024
17 checks passed

haixuanTao deleted the output-logs branch February 29, 2024 13:57
