Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

graceful shutdown: terminate on error #1218

Merged
merged 3 commits into from
Oct 19, 2023

Conversation

niklasad1
Copy link
Member

@niklasad1 niklasad1 commented Oct 18, 2023

This fixes a soketto::Receiver drop issue where two streams were created and this PR uses the same stream for both receiving messages and shutdown.

In addition, this PR reverts the behavior to keep the socket alive despite errors and the server will only perform the graceful shutdown if a shutdown was requested.

master:

2023-10-19T08:46:52.456649Z  INFO jsonrpsee_server::transport::ws: server is closed because: Ok(Stopped)
// This is not a message that is sent by the client i.e, read as empty......
2023-10-19T08:46:52.456664Z TRACE soketto::connection: 5627ceca: recv: (Continue (fin 0) (rsv 000) (mask (0 0)) (len 0))
2023-10-19T08:46:52.456671Z DEBUG soketto::connection: 5627ceca: continue frame while not processing message fragments
2023-10-19T08:46:52.456678Z  WARN jsonrpsee_server::transport::ws: Graceful shutdown got WebSocket error: unexpected opcode: Continue but polling until the connection closed or all pending calls has been executed
// Websocket Close handshake .... 

This branch:

2023-10-19T08:52:19.060822Z  INFO jsonrpsee_server::transport::ws: server is closed because: Ok(Stopped)
2023-10-19T08:52:19.162134Z TRACE method_call{method="sleep_20s"}: jsonrpsee_core::tracing: send="{\"jsonrpc\":\"2.0\",\"id\":10,\"method\":\"sleep_20s\"}"
2023-10-19T08:52:19.162269Z TRACE jsonrpsee_client_transport::ws: send: {"jsonrpc":"2.0","id":10,"method":"sleep_20s"}
2023-10-19T08:52:19.162301Z TRACE soketto::connection: 2425d827: send: (Text (fin 1) (rsv 000) (mask (1 1e4e0b66)) (len 46))
2023-10-19T08:52:19.162341Z TRACE soketto::connection: 2425d827: Sender flushing connection
2023-10-19T08:52:19.162486Z TRACE soketto: read 2 bytes
2023-10-19T08:52:19.162511Z TRACE soketto: read 4 bytes
2023-10-19T08:52:19.162524Z TRACE soketto::connection: 33584b71: recv: (Text (fin 1) (rsv 000) (mask (1 1e4e0b66)) (len 46))
// Websocket Close handshake happend but no empty message is read above....

On errors we just terminate the connection for simplicity as it did before.

@niklasad1 niklasad1 requested a review from a team as a code owner October 18, 2023 11:59
server/src/transport/ws.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

@jsdw jsdw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where was the cancel safety issue? Offhand I can't see anything problematic in the old code so I'm probably missing something!

But the code looks good anyways!

@niklasad1
Copy link
Member Author

niklasad1 commented Oct 18, 2023

Not sure whether it's a cancel-safety issue but something weird is happening once we create a new stream on every receive operation in try_recv and finally in graceful_shutdown

master:

2023-10-19T08:46:52.456649Z  INFO jsonrpsee_server::transport::ws: server is closed because: Ok(Stopped)
// This is not a message that is sent by the client i.e, read as empty......
2023-10-19T08:46:52.456664Z TRACE soketto::connection: 5627ceca: recv: (Continue (fin 0) (rsv 000) (mask (0 0)) (len 0))
2023-10-19T08:46:52.456671Z DEBUG soketto::connection: 5627ceca: continue frame while not processing message fragments
2023-10-19T08:46:52.456678Z  WARN jsonrpsee_server::transport::ws: Graceful shutdown got WebSocket error: unexpected opcode: Continue but polling until the connection closed or all pending calls has been executed
// Websocket Close handshake .... 

This branch:

2023-10-19T08:52:19.060822Z  INFO jsonrpsee_server::transport::ws: server is closed because: Ok(Stopped)
2023-10-19T08:52:19.162134Z TRACE method_call{method="sleep_20s"}: jsonrpsee_core::tracing: send="{\"jsonrpc\":\"2.0\",\"id\":10,\"method\":\"sleep_20s\"}"
2023-10-19T08:52:19.162269Z TRACE jsonrpsee_client_transport::ws: send: {"jsonrpc":"2.0","id":10,"method":"sleep_20s"}
2023-10-19T08:52:19.162301Z TRACE soketto::connection: 2425d827: send: (Text (fin 1) (rsv 000) (mask (1 1e4e0b66)) (len 46))
2023-10-19T08:52:19.162341Z TRACE soketto::connection: 2425d827: Sender flushing connection
2023-10-19T08:52:19.162486Z TRACE soketto: read 2 bytes
2023-10-19T08:52:19.162511Z TRACE soketto: read 4 bytes
2023-10-19T08:52:19.162524Z TRACE soketto::connection: 33584b71: recv: (Text (fin 1) (rsv 000) (mask (1 1e4e0b66)) (len 46))
// Websocket Close handshake happend but no empty message is read above....

If you search the logs you can find that an empty WebSocket message is read by soketto which is a red-flag i.e,
recv: (Continue (fin 0) (rsv 000) (mask (0 0)) (len 0))

@jsdw
Copy link
Collaborator

jsdw commented Oct 19, 2023

Aah I can see what might have been an issue now yeah; receiver.receive() isn't cancel safe, but we create a stream over it in try_recv and then can there are a few cases that can lead to the receiver.receive() future being dropped midway through, and then later in eg graceful_shutdown we start receiving again but maybe from a weird/broken state because some of the data has lready been puleld into the receiver.receive() future which was dropped.

So yeah keeping the same stream for everything makes a lot of sense, so that we aren't throwing away any data as long as the stream is reused!

@niklasad1 niklasad1 merged commit 10cf1b5 into master Oct 19, 2023
@niklasad1 niklasad1 deleted the na-graceful-shutdown-terminate-on-err branch October 19, 2023 10:01
niklasad1 added a commit that referenced this pull request Oct 24, 2023
* graceful shutdown: terminate on error

* fix nits

* adress grumbles
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants