Add support for distributed deployments with multiple daemons #256
Conversation
Ensures that we also set them when using the daemon's `run_dataflow` function for examples.
Instead of re-connecting for each message.
Allow passing runtime path via coordinator
I think there are a lot of good ideas in this PR, so thanks Philipp! I think we may be lacking a file-system management piece of software to manage files, and changes to them, between machines. I was looking at how cargo implements and manages this. But this is a big feature that needs its own PR, and I think we can assume in this PR that the user has made the appropriate changes within their filesystem. I will test it further later today.
Yeah, we still need some kind of deploy functionality to get the nodes and operators from the CLI machine to the target machines. We already support URL sources for nodes and operators,
Agreed. There are currently only two options to distribute the node/operator binaries across machines:
I think the second option is already quite convenient for "finished" nodes and operators, but it's cumbersome during development. We should try to make the edit->compile->deploy cycle easier for distributed dataflows too. Maybe we could make the CLI send the executables via TCP as part of the spawn command? The receiving daemon could store them on the file system and then run them. This would make things much easier for nodes/operators written in Rust, as you could just develop as usual on the CLI machine.
…plates, and docs
Removes the separate `dora-runtime` binary. The runtime can now be started by passing `--run-dora-runtime` to `dora-daemon`. This change makes setup and deployment easier since it removes one executable that needs to be copied across machines.
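Assuming the flag behaves as described above, starting the runtime through the daemon would look roughly like this (a sketch; the exact invocation may differ):

```sh
# Before: a separate dora-runtime executable had to be copied to every machine.
# After this change, the daemon binary can take over that role via a flag:
dora-daemon --run-dora-runtime
```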
Integrate `dora-runtime` into `dora-daemon`
So, I retested this branch together with the implementation of the log branch and it is OK for me. An example of the YAML description with local and remote machines would be appreciated before merging.
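For illustration, such a dataflow description might look roughly like the following. This is only a sketch: the node IDs, paths, and machine names are made up, and the exact nesting of the `deploy` key (introduced later in this PR as `deploy.machine`) is an assumption.

```yaml
nodes:
  - id: camera                  # hypothetical node, meant to run on the machine registered as "robot"
    custom:
      source: /home/dora/bin/camera-node   # path must be valid on that machine
      outputs:
        - image
    deploy:
      machine: robot

  - id: object-detection        # hypothetical node, meant to run on the machine registered as "workstation"
    custom:
      source: /home/dora/bin/detection-node
      inputs:
        image: camera/image
      outputs:
        - bbox
    deploy:
      machine: workstation
```

With such a description, the control plane would dispatch each node's spawn command to the daemon whose machine ID matches the node's `deploy.machine` value, so "local" vs. "remote" is only relative to where the CLI happens to run.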
"Local" and "remote" seem a little confusing to me in the context of distributed deployment, i.e., with respect to which machine is local and/or remote? I think we should have an end-to-end dataflow graph describing the distributed deployment at the control plane (CLI / coordinator layer), specifying the pub/sub communication middleware provider (e.g., Zenoh, DDS, or SOME/IP) with its corresponding configuration.
As discussed in today's meeting:
We still need to pass the path through a new `working_dir` field as we haven't figured out deployment of compiled nodes/operators yet.
The path should also be valid on the receiving node, which might run in a different directory.
…g coordinator

The coordinator is our control plane and should not be involved in data plane operations. This way, the dataflow can continue even if the coordinator fails.
I implemented the points discussed in the latest meeting and merged in the latest changes:

- Adds a `deploy.machine` key to the dataflow YAML format.
- Adds a `--machine-id` argument, defaulting to the empty string. Multiple daemons can be connected to a coordinator, as long as they have different machine IDs.
- Adds a `--port` argument to change the port that the coordinator listens on. Defaults to 53290.
- Adds a `--coordinator-addr` argument to set the IP and port number of the coordinator. Defaults to `127.0.0.1:53290`.
- Forwards `InputClosed` events between machines.
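Putting the new flags together, a two-machine setup might be started roughly as follows. This is a sketch based only on the flags listed above: the binary name `dora-coordinator`, the assumption that `--coordinator-addr` is a daemon flag, and all machine IDs and IP addresses are made up.

```sh
# On the machine that should host the control plane
# (listens on port 53290 unless --port is given):
dora-coordinator --port 53290

# On each worker machine, start a daemon with a unique machine ID
# and point it at the coordinator's address:
dora-daemon --machine-id robot --coordinator-addr 192.168.1.10:53290
dora-daemon --machine-id workstation --coordinator-addr 192.168.1.10:53290
```

The machine IDs used here are the values that a dataflow's `deploy.machine` entries would refer to.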