Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Use async/await instead of manual polling of NetworkWorker #13219

Merged
merged 43 commits into from
Feb 20, 2023

Conversation

dmitry-markin
Copy link
Contributor

@dmitry-markin dmitry-markin commented Jan 23, 2023

Resolves #6919. Re-does #5704.

I also tried rewriting sc-network-test to use async/await instead of polling, the first attempt is here: 4b5d851. It turned out that doing it without slowing down the tests is less trivial than I expected, so I think we should do it in a separate PR.

cumulus companion: paritytech/cumulus#2127

@dmitry-markin dmitry-markin added A0-please_review Pull request needs code review. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders. D3-trivial 🧸 PR contains trivial changes in a runtime directory that do not require an audit labels Jan 23, 2023
Copy link
Contributor

@melekes melekes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

client/network/src/service.rs Outdated Show resolved Hide resolved
@koute
Copy link
Contributor

koute commented Feb 6, 2023

It'd be interesting to see the impact of this on the CPU usage; the original comment here said that there's a high volume of messages here, so this might end up being less efficient overall. If so we can probably do some further refactoring to make it better.

client/network/src/service.rs Outdated Show resolved Hide resolved
client/network/src/service.rs Outdated Show resolved Hide resolved
Copy link
Member

@bkchr bkchr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get why we need next_action. We should just rename it to async fc run. When this run function returns, it means that the worker is finished. I don't think we need any next_action?

client/network/src/service.rs Outdated Show resolved Hide resolved
client/network/src/service.rs Outdated Show resolved Hide resolved
@altonen
Copy link
Contributor

altonen commented Feb 6, 2023

I don't get why we need next_action. We should just rename it to async fc run. When this run function returns, it means that the worker is finished. I don't think we need any next_action?

next_action() is probably useful in tests where we want to advance the network by one event and then observe effects. There could be another function async fn run() which would just call next_event() in a loop.

@dmitry-markin
Copy link
Contributor Author

dmitry-markin commented Feb 6, 2023

I don't get why we need next_action. We should just rename it to async fc run. When this run function returns, it means that the worker is finished. I don't think we need any next_action?

@bkchr: should be resolved now.

@dmitry-markin dmitry-markin requested a review from bkchr February 6, 2023 12:43
@dmitry-markin
Copy link
Contributor Author

dmitry-markin commented Feb 6, 2023

It'd be interesting to see the impact of this on the CPU usage; the original comment here said that there's a high volume of messages here, so this might end up being less efficient overall. If so we can probably do some further refactoring to make it better.

@koute What would be the proper profiling procedure in this case? Is running the node on a VPS and just measuring the CPU load enough (after sync?), or something more tricky is needed? How to account for varying conditions across the runs (peer count, transactions per block, etc.)?

@dmitry-markin
Copy link
Contributor Author

The PR is ready for review again. @bkchr @altonen could you review the changes since faec2e9?

Copy link
Member

@bkchr bkchr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some things still need to be done, but looks good!

client/network/test/src/lib.rs Show resolved Hide resolved
client/network/src/service/tests/mod.rs Outdated Show resolved Hide resolved
client/network/src/service/tests/chain_sync.rs Outdated Show resolved Hide resolved
client/network/src/service/tests/chain_sync.rs Outdated Show resolved Hide resolved
client/network/src/service/tests/chain_sync.rs Outdated Show resolved Hide resolved
Comment on lines +1373 to +1379
let external_addresses =
self.network_service.external_addresses().map(|r| &r.addr).cloned().collect();
*self.external_addresses.lock() = external_addresses;

let listen_addresses =
self.network_service.listeners().map(ToOwned::to_owned).collect();
*self.listen_addresses.lock() = listen_addresses;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't think that the listen addresses are updated ever. You pass them via cli, how should the listen address change? Or are these the addresses other nodes have reported to us?

Nevertheless, there we should create an issue to get rid off these mutexes.

Copy link
Contributor Author

@dmitry-markin dmitry-markin Feb 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It turned out that they do change during runtime — if we listen on a wildcard addresses like 0.0.0.0 and ::, Swarm reports all actual endpoints we are listening on (enumerating all interfaces), and if we enable an interface / assign an ip address (like creating a dummy interface) it pops up in the list of addresses reported by RPC.

@dmitry-markin
Copy link
Contributor Author

bot merge

@paritytech-processbot paritytech-processbot bot merged commit 76208c8 into master Feb 20, 2023
@paritytech-processbot paritytech-processbot bot deleted the dm-network-service-async-await branch February 20, 2023 12:08
ltfschoen pushed a commit to ltfschoen/substrate that referenced this pull request Feb 22, 2023
…ech#13219)

* Convert `NetworkWorker::poll()` into async `next_action()`

* Use `NetworkWorker::next_action` instead of `poll` in `sc-network-test`

* Revert "Use `NetworkWorker::next_action` instead of `poll` in `sc-network-test`"

This reverts commit 4b5d851.

* Fix `sc-network-test` to poll `NetworkWorker::next_action`

* Fix `sc_network::service` tests to poll `NetworkWorker::next_action`

* Fix docs

* kick CI

* Factor out `next_worker_message()` & `next_swarm_event()`

* Error handling: replace `futures::pending!()` with `expect()`

* Simplify stream polling in `select!`

* Replace `NetworkWorker::next_action()` with `run()`

* Apply suggestions from code review

Co-authored-by: Bastian Köcher <[email protected]>

* minor: comment

* Apply suggestions from code review

Co-authored-by: Bastian Köcher <[email protected]>

* Print debug log when network future is shut down

* Evaluate `NetworkWorker::run()` future once before the loop

* Fix client code to match new `NetworkService` interfaces

* Make clippy happy

* Apply suggestions from code review

Co-authored-by: Bastian Köcher <[email protected]>

* Apply suggestions from code review

Co-authored-by: Bastian Köcher <[email protected]>

* Revert "Apply suggestions from code review"

This reverts commit 9fa646d.

* Make `NetworkWorker::run()` consume `self`

* Terminate system RPC future if RPC rx stream has terminated.

* Rewrite with let-else

* Fix comments

* Get `best_seen_block` and call `on_block_finalized` via `ChainSync` instead of `NetworkService`

* rustfmt

* make clippy happy

* Tests: schedule wake if `next_action()` returned true

* minor: comment

* minor: fix `NetworkWorker` rustdoc

* minor: amend the rustdoc

* Fix bug that caused `on_demand_beefy_justification_sync` test to hang

* rustfmt

* Apply review suggestions

---------

Co-authored-by: Bastian Köcher <[email protected]>
ark0f pushed a commit to gear-tech/substrate that referenced this pull request Feb 27, 2023
…ech#13219)

* Convert `NetworkWorker::poll()` into async `next_action()`

* Use `NetworkWorker::next_action` instead of `poll` in `sc-network-test`

* Revert "Use `NetworkWorker::next_action` instead of `poll` in `sc-network-test`"

This reverts commit 4b5d851.

* Fix `sc-network-test` to poll `NetworkWorker::next_action`

* Fix `sc_network::service` tests to poll `NetworkWorker::next_action`

* Fix docs

* kick CI

* Factor out `next_worker_message()` & `next_swarm_event()`

* Error handling: replace `futures::pending!()` with `expect()`

* Simplify stream polling in `select!`

* Replace `NetworkWorker::next_action()` with `run()`

* Apply suggestions from code review

Co-authored-by: Bastian Köcher <[email protected]>

* minor: comment

* Apply suggestions from code review

Co-authored-by: Bastian Köcher <[email protected]>

* Print debug log when network future is shut down

* Evaluate `NetworkWorker::run()` future once before the loop

* Fix client code to match new `NetworkService` interfaces

* Make clippy happy

* Apply suggestions from code review

Co-authored-by: Bastian Köcher <[email protected]>

* Apply suggestions from code review

Co-authored-by: Bastian Köcher <[email protected]>

* Revert "Apply suggestions from code review"

This reverts commit 9fa646d.

* Make `NetworkWorker::run()` consume `self`

* Terminate system RPC future if RPC rx stream has terminated.

* Rewrite with let-else

* Fix comments

* Get `best_seen_block` and call `on_block_finalized` via `ChainSync` instead of `NetworkService`

* rustfmt

* make clippy happy

* Tests: schedule wake if `next_action()` returned true

* minor: comment

* minor: fix `NetworkWorker` rustdoc

* minor: amend the rustdoc

* Fix bug that caused `on_demand_beefy_justification_sync` test to hang

* rustfmt

* Apply review suggestions

---------

Co-authored-by: Bastian Köcher <[email protected]>
Ank4n pushed a commit that referenced this pull request Feb 28, 2023
* Convert `NetworkWorker::poll()` into async `next_action()`

* Use `NetworkWorker::next_action` instead of `poll` in `sc-network-test`

* Revert "Use `NetworkWorker::next_action` instead of `poll` in `sc-network-test`"

This reverts commit 4b5d851.

* Fix `sc-network-test` to poll `NetworkWorker::next_action`

* Fix `sc_network::service` tests to poll `NetworkWorker::next_action`

* Fix docs

* kick CI

* Factor out `next_worker_message()` & `next_swarm_event()`

* Error handling: replace `futures::pending!()` with `expect()`

* Simplify stream polling in `select!`

* Replace `NetworkWorker::next_action()` with `run()`

* Apply suggestions from code review

Co-authored-by: Bastian Köcher <[email protected]>

* minor: comment

* Apply suggestions from code review

Co-authored-by: Bastian Köcher <[email protected]>

* Print debug log when network future is shut down

* Evaluate `NetworkWorker::run()` future once before the loop

* Fix client code to match new `NetworkService` interfaces

* Make clippy happy

* Apply suggestions from code review

Co-authored-by: Bastian Köcher <[email protected]>

* Apply suggestions from code review

Co-authored-by: Bastian Köcher <[email protected]>

* Revert "Apply suggestions from code review"

This reverts commit 9fa646d.

* Make `NetworkWorker::run()` consume `self`

* Terminate system RPC future if RPC rx stream has terminated.

* Rewrite with let-else

* Fix comments

* Get `best_seen_block` and call `on_block_finalized` via `ChainSync` instead of `NetworkService`

* rustfmt

* make clippy happy

* Tests: schedule wake if `next_action()` returned true

* minor: comment

* minor: fix `NetworkWorker` rustdoc

* minor: amend the rustdoc

* Fix bug that caused `on_demand_beefy_justification_sync` test to hang

* rustfmt

* Apply review suggestions

---------

Co-authored-by: Bastian Köcher <[email protected]>
ukint-vs pushed a commit to gear-tech/substrate that referenced this pull request Apr 10, 2023
…ech#13219)

* Convert `NetworkWorker::poll()` into async `next_action()`

* Use `NetworkWorker::next_action` instead of `poll` in `sc-network-test`

* Revert "Use `NetworkWorker::next_action` instead of `poll` in `sc-network-test`"

This reverts commit 4b5d851.

* Fix `sc-network-test` to poll `NetworkWorker::next_action`

* Fix `sc_network::service` tests to poll `NetworkWorker::next_action`

* Fix docs

* kick CI

* Factor out `next_worker_message()` & `next_swarm_event()`

* Error handling: replace `futures::pending!()` with `expect()`

* Simplify stream polling in `select!`

* Replace `NetworkWorker::next_action()` with `run()`

* Apply suggestions from code review

Co-authored-by: Bastian Köcher <[email protected]>

* minor: comment

* Apply suggestions from code review

Co-authored-by: Bastian Köcher <[email protected]>

* Print debug log when network future is shut down

* Evaluate `NetworkWorker::run()` future once before the loop

* Fix client code to match new `NetworkService` interfaces

* Make clippy happy

* Apply suggestions from code review

Co-authored-by: Bastian Köcher <[email protected]>

* Apply suggestions from code review

Co-authored-by: Bastian Köcher <[email protected]>

* Revert "Apply suggestions from code review"

This reverts commit 9fa646d.

* Make `NetworkWorker::run()` consume `self`

* Terminate system RPC future if RPC rx stream has terminated.

* Rewrite with let-else

* Fix comments

* Get `best_seen_block` and call `on_block_finalized` via `ChainSync` instead of `NetworkService`

* rustfmt

* make clippy happy

* Tests: schedule wake if `next_action()` returned true

* minor: comment

* minor: fix `NetworkWorker` rustdoc

* minor: amend the rustdoc

* Fix bug that caused `on_demand_beefy_justification_sync` test to hang

* rustfmt

* Apply review suggestions

---------

Co-authored-by: Bastian Köcher <[email protected]>
nathanwhit pushed a commit to nathanwhit/substrate that referenced this pull request Jul 19, 2023
…ech#13219)

* Convert `NetworkWorker::poll()` into async `next_action()`

* Use `NetworkWorker::next_action` instead of `poll` in `sc-network-test`

* Revert "Use `NetworkWorker::next_action` instead of `poll` in `sc-network-test`"

This reverts commit 4b5d851.

* Fix `sc-network-test` to poll `NetworkWorker::next_action`

* Fix `sc_network::service` tests to poll `NetworkWorker::next_action`

* Fix docs

* kick CI

* Factor out `next_worker_message()` & `next_swarm_event()`

* Error handling: replace `futures::pending!()` with `expect()`

* Simplify stream polling in `select!`

* Replace `NetworkWorker::next_action()` with `run()`

* Apply suggestions from code review

Co-authored-by: Bastian Köcher <[email protected]>

* minor: comment

* Apply suggestions from code review

Co-authored-by: Bastian Köcher <[email protected]>

* Print debug log when network future is shut down

* Evaluate `NetworkWorker::run()` future once before the loop

* Fix client code to match new `NetworkService` interfaces

* Make clippy happy

* Apply suggestions from code review

Co-authored-by: Bastian Köcher <[email protected]>

* Apply suggestions from code review

Co-authored-by: Bastian Köcher <[email protected]>

* Revert "Apply suggestions from code review"

This reverts commit 9fa646d.

* Make `NetworkWorker::run()` consume `self`

* Terminate system RPC future if RPC rx stream has terminated.

* Rewrite with let-else

* Fix comments

* Get `best_seen_block` and call `on_block_finalized` via `ChainSync` instead of `NetworkService`

* rustfmt

* make clippy happy

* Tests: schedule wake if `next_action()` returned true

* minor: comment

* minor: fix `NetworkWorker` rustdoc

* minor: amend the rustdoc

* Fix bug that caused `on_demand_beefy_justification_sync` test to hang

* rustfmt

* Apply review suggestions

---------

Co-authored-by: Bastian Köcher <[email protected]>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A0-please_review Pull request needs code review. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders. D3-trivial 🧸 PR contains trivial changes in a runtime directory that do not require an audit
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Refactor sc_network::NetworkService::poll to use async/await
5 participants