Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run dynamic node #517

Merged
merged 11 commits into from
Jun 13, 2024
Merged

Run dynamic node #517

merged 11 commits into from
Jun 13, 2024

Conversation

haixuanTao
Copy link
Collaborator

@haixuanTao haixuanTao commented May 29, 2024

Description

This PR makes it possible to run process as a dynamic process to be able to debug, run tests as well as run python file interactively.

Getting started

In the YAML graph

nodes:
  - id: webcam
    source: ./webcam.py
    inputs:
      tick:
        source: dora/timer/millis/50
        queue_size: 1000
    outputs:
      - image

  - id: object_detection
    source: ./object_detection.py
    inputs:
      image: webcam/image
    outputs:
      - bbox

  - id: plot
    source: dynamic
    inputs:
      image: webcam/image
      bbox: object_detection/bbox

For python

# Simply add a node_id when you initialize a node

node = Node("plot")

For rust

        DoraNode::init_from_node_id(None, NodeId::from("rust-sink-detached".to_string()), None)?;

To run

For python

dora start examples/python-dataflow/dataflow_detached.yml --name ci-python-detached
python examples/python-dataflow/plot_detached.py
sleep 5
dora stop --name ci-python-test  --grace-duration 5s

For rust

dora start examples/rust-dataflow/dataflow_detached.yml --name ci-rust-detached
cargo run -p rust-dataflow-example-sink-detached
sleep 5
dora stop --name ci-rust-detached  --grace-duration 5s

@phil-opp
Copy link
Collaborator

So the idea of this is that nodes are started manually and then matched by name, am I understanding this correctly? So kind of a hybrid between the current declarative apporach and an "interactive dataflow" feature, where nodes can connect to the daemon dynamically, without being specified in a dataflow YAML file.

I'm not sure if detached is the best name for this. To me, the term "detach" is associated with detached threads, which keep running in the background without a join handle. So at first, I thought that detaching a node keeps it running even after the rest of the dataflow stops. Perhaps manual_start (or similar) is a better name for that flag?

Another things that I don't understand yet: Are the build and source keys considered at all for such a detached/manually started node? To me, it would make sense to disallow these keys for manually started nodes to clearly indicate that the matching needs to be done by name.

@haixuanTao haixuanTao force-pushed the detached-python-process branch from 48397cf to f050496 Compare May 29, 2024 12:32
@haixuanTao
Copy link
Collaborator Author

Yes, basically, you can spawn a new node without using dora start and the node will be able to connect to the running dataflow by querying the coordinator for the DORA_NODE_CONFIG.

Additional change enables to not spawn the node as well as not close its output when stopping so that we can spawn multiple time a "detached" node.

In my opinion, this does not break the declarative paradigm as the input and outputs are still specified within the dataflow, just not the spawning.

The buildand source keys are not used. The idea is that by adding detached you can quickly debug the dataflow. You can also safely remove detachedand everything still work.

I think we can add ẁarnings that the source and build key are not going to be used.

@phil-opp
Copy link
Collaborator

The buildand source keys are not used. The idea is that by adding detached you can quickly debug the dataflow. You can also safely remove detachedand everything still work.

I think we can add ẁarnings that the source and build key are not going to be used.

I'm still in favor of throwing an error. Commenting out the build and source keys is quick and simple and it makes it more obvious what's happening. Silently ignoring the build key is a huge footgun because it subtly changes the behavior of dora build, so you might end up with a long debug debug session where you never apply the changes that you made to a detached node (if you forget to build it manually after dora build). Ignoring source is less dangerous IMO, but I can imagine that it still leads to confusion, e.g. if somebody accidentally commits a dataflow with both source and detached set, maybe separated by lots of inputs and outputs.

A warning would be better than nothing, but it is easy to miss.

@phil-opp
Copy link
Collaborator

not close its output when stopping so that we can spawn multiple time a "detached" node.

I don't think that this is a good idea because it's a subtle behavior change in a feature intended for debugging. Consider the following situation:

  • I want to debug why my dataflow is not exiting properly after a source node exits.
  • I discover the detached feature and decide to use it for debugging.
  • I make some node in my dataflow a detached node and try to get the dataflow to finish automatically.
  • Now it never finishes anymore because outputs of of detached nodes are never closed.

@phil-opp
Copy link
Collaborator

phil-opp commented May 29, 2024

Side notes, not directly related to this PR:

  • I think it would be a good idea to introduce "dynamic nodes" that can connect to a dataflow at runtime. For example, you could spawn a logger node that prints out some specific messages (sent by some node specified in the dataflow).
  • We could also add "global topics" as a form of eternal output streams. Multiple nodes could publish to a topic and dynamic nodes could publish to them too. Topics could be mapped as inputs.
    • This gives us new ways to control dataflows at runtime. For example, a CLI could send a control command to a control topic, which is then handled by one of the traditional nodes specified in the YAML file.

@haixuanTao @heyong4725 What do you think about this?

@haixuanTao
Copy link
Collaborator Author

So, to be more explicit, I have added a filter on the daemon to only wait for nodes that are not detached. So using:

  • dora stop
  • ctrl + c on the cli

Will gracefully stop the dataflow.

I understand that this could potentially introduce additional edge cases, but it seems to me that being able to start and stop a specific node be more useful than the worries about not being able to debug a downstream node that might not gracefully stop from InputClosed situation.

@haixuanTao
Copy link
Collaborator Author

Side notes, not directly related to this PR:

* I think it would be a good idea to introduce "dynamic nodes" that can connect to a dataflow at runtime. For example, you could spawn a logger node that prints out some specific messages (sent by some node specified in the dataflow).

* We could also add "global topics" as a form of eternal output streams. Multiple nodes could publish to a topic and dynamic nodes could publish to them too. Topics could be mapped as inputs.
  
  * This gives us new ways to control dataflows at runtime. For example, a CLI could send a control command to a `control` topic, which is then handled by one of the traditional nodes specified in the YAML file.

@haixuanTao @heyong4725 What do you think about this?

Pinging back on this, I hope that this PR is a stepping stone on having more dynamic node indeed to have more freedom to send message to the dataflow.

@haixuanTao haixuanTao force-pushed the detached-python-process branch 4 times, most recently from 15c8286 to 42865e3 Compare May 30, 2024 15:41
@haixuanTao
Copy link
Collaborator Author

What about using source <=> path use another special keyword like: interactive so that source is always required and we don't have to make it optional, and we don't need to introduce another key.

Just to be clear, this PR don't touch at the build script. So anything calling dora build will be build just as before, so I don't understand how this will be a footgun. Only spawning the process is the logic changed.

Back on the topic of not closing input. I think that it is more intuitive that the node are not stopped because a node is interactive than, that node are stopped after closing the interactive node.

@haixuanTao haixuanTao changed the title Run Detached node Run interactive node May 31, 2024
@haixuanTao
Copy link
Collaborator Author

It would look something like:

nodes:
  - id: webcam
    source: ./webcam.py
    inputs:
      tick:
        source: dora/timer/millis/50
        queue_size: 1000
    outputs:
      - image

  - id: object_detection
    source: ./object_detection.py
    inputs:
      image: webcam/image
    outputs:
      - bbox

  - id: plot
    source: interactive
    inputs:
      image: webcam/image
      bbox: object_detection/bbox

@phil-opp
Copy link
Collaborator

What about using source <=> path use another special keyword like: interactive so that source is always required and we don't have to make it optional, and we don't need to introduce another key.

Good idea, I like it!

Just to be clear, this PR don't touch at the build script. So anything calling dora build will be build just as before, so I don't understand how this will be a footgun. Only spawning the process is the logic changed.

Ah, then disregard my concern. I misinterpreted your "The buildand source keys are not used." comment above.

binaries/coordinator/src/lib.rs Outdated Show resolved Hide resolved
binaries/coordinator/src/lib.rs Outdated Show resolved Hide resolved
binaries/coordinator/src/lib.rs Outdated Show resolved Hide resolved
libraries/config.yaml Outdated Show resolved Hide resolved
libraries/core/Cargo.toml Outdated Show resolved Hide resolved
apis/rust/node/Cargo.toml Outdated Show resolved Hide resolved
apis/rust/node/src/node/mod.rs Outdated Show resolved Hide resolved
apis/rust/node/src/node/mod.rs Outdated Show resolved Hide resolved
apis/rust/node/src/node/mod.rs Outdated Show resolved Hide resolved
@haixuanTao haixuanTao changed the title Run interactive node Run dynamic node Jun 5, 2024
@haixuanTao haixuanTao force-pushed the detached-python-process branch from 42865e3 to 3c21fd6 Compare June 8, 2024 14:56
@haixuanTao haixuanTao force-pushed the detached-python-process branch from 3c21fd6 to 3ad402c Compare June 8, 2024 14:58
@haixuanTao haixuanTao force-pushed the detached-python-process branch from 1dd8f15 to d180aa7 Compare June 8, 2024 16:55
@haixuanTao haixuanTao force-pushed the detached-python-process branch from c15356b to fe8722b Compare June 9, 2024 08:05
apis/python/node/src/lib.rs Show resolved Hide resolved
apis/rust/node/src/daemon_connection/tcp.rs Show resolved Hide resolved
apis/rust/node/src/node/mod.rs Show resolved Hide resolved
apis/rust/node/src/node/mod.rs Outdated Show resolved Hide resolved
apis/rust/node/src/node/mod.rs Outdated Show resolved Hide resolved
binaries/cli/src/main.rs Outdated Show resolved Hide resolved
binaries/cli/src/main.rs Outdated Show resolved Hide resolved
libraries/core/src/daemon_messages.rs Show resolved Hide resolved
@haixuanTao
Copy link
Collaborator Author

haixuanTao commented Jun 12, 2024

FYI @phil-opp , I made a new init_flexible within Rust which is the default implementation for Python.

I have removed an assertion on node_id to make it less complicated and always use DORA_NODE_CONFIG if it exist as it is very likely coming from the daemon.

Would love to have a last round of review to make sure we're on the same page.

Copy link
Collaborator

@phil-opp phil-opp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, looks ready for merging!

@haixuanTao haixuanTao merged commit f34cbe2 into main Jun 13, 2024
18 checks passed
@haixuanTao haixuanTao deleted the detached-python-process branch June 13, 2024 08:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants