Upgrading actix-web #291

Open
antiblue opened this issue Sep 20, 2023 · 14 comments

@antiblue

Hi all, I started working on upgrading the Actix-Web parts, but I stumbled over an issue with Tokio.

OPCUASession.connect calls get_server_endpoints_from_url<T>, and at the end of that function an instance of Session is dropped.
This causes a panic, because Tokio wants to enter a blocking region:

thread 'actix-rt|system:0|arbiter:3' panicked at 'Cannot drop a runtime in a context where blocking is not allowed. This happens when a runtime is dropped from within an asynchronous context.', ...tokio-1.32.0\src\runtime\blocking\shutdown.rs:51:21

I have found running-actix-web-using-tokiomain in Actix-Web's documentation, stating that block_in_place will not work. Unfortunately the backtrace reveals exactly that:

  13: core::ptr::drop_in_place<opcua::client::session::session::Session>
             at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be\library\core\src\ptr\mod.rs:497
  14: opcua::client::client::Client::get_server_endpoints_from_url<ref$<str$> >
             at ...\opcua\lib\src\client\client.rs:493

Any suggestions on how to fix this?

(The code has been forked here)

@AiyionPrime
Contributor

AiyionPrime commented Jul 1, 2024

@antiblue I'm struggling with the same problem. Was just drafting a different example structure in #355, when I hit this error as well. Did you make any progress with this?
I'd really like to see the example having the same actix version as the rest of the code.

@antiblue
Author

antiblue commented Jul 1, 2024

No, sorry.

@AiyionPrime
Contributor

After reading through the other PRs and recent commits, maybe @einarmo might be a good choice to ask for directions about this problem.

@einarmo
Contributor

einarmo commented Jul 2, 2024

You shouldn't be having trouble with a runtime being dropped in the client now, since the client no longer contains a runtime. Is this hitting somewhere else?

@AiyionPrime
Contributor

AiyionPrime commented Jul 2, 2024

I encountered the problem when I drafted an update of samples/web-client to the actix version in the lib crate. I'll provide a PR tomorrow; today's kind of full; sorry.

@vlnzrv

vlnzrv commented Jul 16, 2024

Hi @AiyionPrime! Any updates about this PR with the actix version upgrade?

@AiyionPrime
Contributor

Hey @vlnzrv, sorry the topic has spread over various PRs and issues now.

I talked to the maintainer @locka99 a week ago, and he made clear that he was going to look into how to enable external support of this library.
But he also mentioned that his time is currently quite limited, so he can't spend as much of it on this project.

That, and the number of open PRs (about 18), has led me to stop opening PRs for the moment, until we find a way to reduce the backlog (and hopefully merge some) again.

My goal is to help this project, not drown it in work.
That said, I had already told @einarmo this in a different context and was not aware that others were waiting on this as well.

@vlnzrv

vlnzrv commented Jul 17, 2024

Hey @AiyionPrime, thanks for the update and your contributions! Some context on why I was looking for a way to update it: in May, Rust nightly broke for old versions of the time crate: rust-lang/rust#125319. According to that thread it's not going to be fixed, so the only way around it is to upgrade the time crate. The opcua crate depends on the time crate via actix-web, so this is a blocker for an update. In a couple of weeks Rust stable will stop working with these old versions of the time crate, so it'll be an annoying issue.

@AiyionPrime
Contributor

AiyionPrime commented Jul 17, 2024

@vlnzrv Yeah I saw that, too.
It was not meant as "won't do" :)
I opened a draft PR #360 to discuss the shortcomings of my implementation.

One definitely is the non-graceful shutdown on Ctrl+C, which results in the runtime getting terminated in an async context. I think I have an idea for that though.

My code is not tested at all; I've got an interview to set up now, but will get back to this later in the evening.
Hope this helps anyway.

@AiyionPrime
Contributor

@vlnzrv 1.80.0 is happening next Thursday, isn't it?

@AiyionPrime
Contributor

Furthermore it's only the sample which is running the ancient version of time, if I'm not mistaken?
So temporarily dropping it would be a bad but available fallback, I think.

@vlnzrv

vlnzrv commented Jul 17, 2024

Yes, it's only a sample, because the dependency in the lib itself was updated. In the worst case the sample could just be commented out in Cargo.toml until it's fixed. It won't break unit / integration tests.

@vlnzrv

vlnzrv commented Jul 17, 2024

1.80.0 is happening next Thursday, isn't it?

Correct: https://releases.rs/docs/1.80.0/

@AiyionPrime
Contributor

Alright. The PR I drafted to address the beta issue is #361. And the draft to verify the CI addition works is #362.

@vlnzrv Furthermore for your downstream repo you should be fine performing the same addition to the Cargo.toml in case this does not get merged in time.

This was referenced Aug 15, 2024
einarmo added a commit to einarmo/opcua that referenced this issue Aug 15, 2024
# What and Why

This is a rewrite of the server from scratch, with the primary goal of taking the server implementation from a limited, mostly embedded server, to a fully fledged, general purpose server SDK. The old way of using the server does still _more or less_ exist, see samples for the closest current approximation, but the server framework has changed drastically internally, and the new design opens the door for making far more complex and powerful OPC-UA servers.

## Goals

Currently my PC uses about 1% CPU in release mode running the demo server, which updates 1000 variables once a second. This isn't bad, but I want this SDK to be able to handle servers with _millions_ of nodes. In practice this means several things:

 - It must be possible to store nodes externally, in some database or other system.
 - Monitored items must be notification based. There is always going to be sampling somewhere, but in practice large OPC-UA servers are _always_ push based; the polling is usually deferred to underlying systems, which may or may not be OPC-UA based at all.
 - It must be possible to store different sets of nodes in different ways. If anyone wanting to write a more complex server needs to reimplement diagnostics and the core namespace, the SDK isn't particularly useful. We could hard code that, but it seems better to create an abstraction.

## High level changes

First of all, there are some fundamental structural changes to better handle multiple clients and ensure compliance with the OPC-UA standard. Each TCP connection now runs in a tokio `task`, and most requests will actually spawn a `task` themselves. This is reasonably similar to how frameworks like `axum` handle web requests.

Subscriptions and sessions are now stored centrally, which allows us to implement `TransferSubscriptions` and properly handle subscriptions outliving their session as they are supposed to in OPC-UA. I think technically you can run multiple sessions on a single connection now, though I have no way to test this at the moment.

The web server is gone. It could have remained, but I think it deserves a rethink. It would be better (IMO), and deal with issues such as locka99#291, if we integrate with the `metrics` library, and optionally export some other metrics using some other generic interface. In general I think OPC-UA is plenty complicated enough without extending it with tangentially related features, though again this might be related to the shift I'm trying to create here from a specialized embedded server SDK, to a generic OPC-UA SDK.

Events are greatly changed, and quite unfinished. I believe a solid event implementation requires not just more thought, but a proper derive macro to make implementing them tolerable. The old approach relied on storing events as nodes, which works, and has some advantages, but it's not particularly efficient, and it required setting a number of actually superfluous values, e.g. the display name of an event, which is a value that cannot be accessed, as I understand it. The new approach is just storing them as structs, `dyn Event`.

## Node managers

The largest change is in how most services work. The server now contains a list of `NodeManager`s, an idea stolen from the .NET reference SDK, though I've gone further than they do there. Each node manager implements services for a collection of nodes, typically the nodes from one or more namespaces. When a request arrives we give each node manager the request items that belong to it, so when we call `Read`, for example, a node manager will get the `ReadValueId`s where the `NodeManager` method `owns_node` returns `true`.
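The routing described above can be sketched roughly as follows. This is a minimal illustration, not the SDK's code: only the method name `owns_node` comes from this description, while the simplified `NodeId`, the `NamespaceManager` type, and the `partition_by_owner` helper are hypothetical names invented for the sketch. It also assumes the first manager that claims an item wins.

```rust
/// Hypothetical, simplified node identifier: (namespace index, name).
#[derive(Debug, Clone, PartialEq)]
struct NodeId {
    namespace: u16,
    name: String,
}

/// Hypothetical, trimmed-down trait. Only `owns_node` is named in the text;
/// the real trait would also carry the service methods.
trait NodeManager {
    fn owns_node(&self, id: &NodeId) -> bool;
}

/// Hypothetical manager that owns every node in a single namespace.
struct NamespaceManager {
    namespace: u16,
}

impl NodeManager for NamespaceManager {
    fn owns_node(&self, id: &NodeId) -> bool {
        id.namespace == self.namespace
    }
}

/// Hand each request item to the first manager that claims it.
/// Returns one bucket per manager (same order as `managers`),
/// plus the items that no manager owns.
fn partition_by_owner(
    managers: &[Box<dyn NodeManager>],
    items: Vec<NodeId>,
) -> (Vec<Vec<NodeId>>, Vec<NodeId>) {
    let mut owned: Vec<Vec<NodeId>> = managers.iter().map(|_| Vec::new()).collect();
    let mut unowned = Vec::new();
    for item in items {
        match managers.iter().position(|m| m.owns_node(&item)) {
            Some(idx) => owned[idx].push(item),
            None => unowned.push(item),
        }
    }
    (owned, unowned)
}
```

In this sketch, unowned items are returned separately so the caller can answer them with an error status instead of silently dropping them.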

There are some exceptions: notably, the `view` services can often involve requests that cross node-manager boundaries. Even with this, the idea is that this complexity is hidden from the user.

Implementing a node manager from scratch is challenging; see `node_manager/memory/diagnostics.rs` for an example of a node manager with very limited scope (but one where the visible nodes are dynamic!).

To make it easier for users to develop their own servers, we provide them with a few partially implemented node managers that can be extended:

 - The `InMemoryNodeManager` deals with all non-value attributes, as well as `Browse`, and provides some convenient methods for setting values in the address space. Node managers based on this use the old `AddressSpace`. Each such node manager contains something implementing `InMemoryNodeManagerImpl`, which is a much more reasonable task to implement. See `tests/utils/node_manager.rs` for a very complete example, or `node_manager/memory/core.rs` for a more realistic example (this node manager implements the core namespace, which may also be interesting).
 - The `SimpleNodeManager` is an approximation of the old way to use the SDK. Nodes are added externally, and you can provide getters, setters, and method callbacks. These are no longer part of the address space.

More node managers can absolutely be added if we find good abstractions, but these are solid enough to let us implement what we need for the time being.
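To illustrate the getter/setter callback idea behind `SimpleNodeManager`, here is a minimal sketch. It does not use the SDK's actual types or signatures: `Value`, `CallbackRegistry`, `add_getter`, `add_setter`, `read`, and `write` are all hypothetical names, and node ids are plain strings for brevity.

```rust
use std::collections::HashMap;

/// Hypothetical value type; a real OPC-UA variant is much richer.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Text(String),
}

type Getter = Box<dyn Fn() -> Value + Send + Sync>;
type Setter = Box<dyn Fn(Value) -> Result<(), String> + Send + Sync>;

/// Hypothetical registry mapping node ids to externally supplied callbacks.
#[derive(Default)]
struct CallbackRegistry {
    getters: HashMap<String, Getter>,
    setters: HashMap<String, Setter>,
}

impl CallbackRegistry {
    fn add_getter(&mut self, node: &str, f: impl Fn() -> Value + Send + Sync + 'static) {
        self.getters.insert(node.to_string(), Box::new(f));
    }

    fn add_setter(
        &mut self,
        node: &str,
        f: impl Fn(Value) -> Result<(), String> + Send + Sync + 'static,
    ) {
        self.setters.insert(node.to_string(), Box::new(f));
    }

    /// Serve a read by calling the getter, if one is registered.
    fn read(&self, node: &str) -> Option<Value> {
        self.getters.get(node).map(|getter| getter())
    }

    /// Serve a write; nodes without a setter are treated as read-only.
    fn write(&self, node: &str, value: Value) -> Result<(), String> {
        match self.setters.get(node) {
            Some(setter) => setter(value),
            None => Err(format!("node {} is not writable", node)),
        }
    }
}
```

The point of the sketch is that the values live outside the address space: the registry only stores closures, and each read or write is forwarded to whatever system the closure wraps.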

# Lost features

Some features are lost, some forever, others until we get around to reimplementing them. I could have held off on this PR until they were all ready, but it's already large enough.

 - Diagnostics are almost entirely gone, though there is a sort of framework for them. In practice the whole thing needs a rethink, and it's an isolated enough task that it made sense to leave out for now.
 - Setting sampling interval to `-1` no longer works. I wanted to make everything work, but in typical OPC-UA fashion some features are just incredibly hard to properly implement in a performant way. I'm open to suggestions for implementing this in a good way, but it's such a niche feature that I felt it was fine to leave it out for now.
 - The web server, as mentioned above.
 - Auditing. Same reason as diagnostics, but also because events are so cumbersome at the moment.

# General improvements

Integration tests are moved into the library as cargo integration tests, and they are quite nice. I can run `cargo test` in about 30 seconds, most of which is spent on some expensive crypto methods. There is a test harness that allows you to spin up a server using port `0`, meaning that you get dynamically assigned a port, which means we can run tests in parallel arbitrarily.
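The port `0` trick can be shown with the standard library alone. The sketch below is not the actual test harness: `bind_ephemeral`, `spawn_echo_server`, and `round_trip` are hypothetical helpers, and a one-shot echo server stands in for an OPC-UA server.

```rust
use std::io::{Read, Write};
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::thread;

/// Bind to port 0 so the OS picks a free port, then report the real address.
fn bind_ephemeral() -> std::io::Result<(TcpListener, SocketAddr)> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    Ok((listener, addr))
}

/// Hypothetical stand-in for a test server: accepts one connection,
/// echoes one message back, then exits.
fn spawn_echo_server() -> std::io::Result<(SocketAddr, thread::JoinHandle<()>)> {
    let (listener, addr) = bind_ephemeral()?;
    let handle = thread::spawn(move || {
        if let Ok((mut stream, _)) = listener.accept() {
            let mut buf = [0u8; 64];
            if let Ok(n) = stream.read(&mut buf) {
                let _ = stream.write_all(&buf[..n]);
            }
        }
    });
    Ok((addr, handle))
}

/// Connect to `addr`, send `msg`, and return what the server echoes back.
fn round_trip(addr: SocketAddr, msg: &[u8]) -> std::io::Result<Vec<u8>> {
    let mut stream = TcpStream::connect(addr)?;
    stream.write_all(msg)?;
    let mut buf = vec![0u8; msg.len()];
    stream.read_exact(&mut buf)?;
    Ok(buf)
}
```

Because each server learns its own address from `local_addr()`, no test needs a hard-coded port, so tests running at the same time do not compete for one.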

This almost certainly fixes locka99#359, locka99#358, locka99#324, locka99#291, and locka99#281, and probably more.

# Future work

See `todo.md`. The loose ends mentioned in this PR description need to be tied up, and there is a whole lot of other stuff in that file that would be nice to do.