recv_reset resets closed streams with queued EOS frames #247
Conversation
Thanks for digging into this. The idea of the fix looks good to me.

As discussed on Slack, I believe that instead of transitioning to `Closed(Scheduled)`, we would want to transition to some other state. This is because, ideally, we would not send out a RST_STREAM.
tests/stream_states.rs (outdated):

```rust
.idle_ms(1)
// Send the RST_STREAM frame which causes the client to panic.
.send_frame(frames::reset(1).cancel())
.recv_frame(frames::reset(1).cancel())
```
I believe that, ideally, we would avoid sending a RST_STREAM. While the peer should be able to tolerate this, it is not necessary, as a RST_STREAM in either direction results in the stream transitioning to closed.
Yeah, I think I misunderstood a sentence in the spec saying that we should *allow* this as saying that we should *do* this. Had I known that earlier, this PR would have been finished much more quickly. My bad!

Corrected this behaviour in 861b109.
LGTM, just one request to remove commented out code.
@seanmonstar thoughts?
src/proto/streams/state.rs (outdated):

```diff
@@ -317,6 +324,14 @@ impl State {
     }
 }

 // /// Returns true if the stream was closed by sending an EOS.
 // pub fn is_eos(&self) -> bool {
```
Oops, I meant to mention this in the first review; would it be possible to remove this commented-out code?
Oh, yeah, missed that when committing. Sorry.
\o/
```diff
@@ -590,7 +590,7 @@ impl Recv {
     /// Handle remote sending an explicit RST_STREAM.
     pub fn recv_reset(&mut self, frame: frame::Reset, stream: &mut Stream) {
         // Notify the stream
-        stream.state.recv_reset(frame.reason());
+        stream.state.recv_reset(frame.reason(), stream.is_pending_send);
```
I'd love to eventually make the `state` private. It reads weird that we need to pass in extra information about the stream itself. (Not your fault, I'm just pining for a better future some day :D).
As far as I can tell, there doesn't appear to be any harm in forcing the transition for all `Closed(Cause::EndStream)` states on receipt of RST_STREAM, regardless of whether or not the stream has pending sends; but @carllerche suggested that we only transition to `Closed(Cause::Proto)` if the stream is queued.
I prefer to err on the side of being conservative and only make changes that we know are needed.

If there are no queued send frames, then the stream state is actually closed and the received reset should be ignored (I think this is already done).
I don't think I was clear. I get why we're using the queued information, that's fine! We don't want to trigger a reset error for a stream the user has already closed locally.

I was more concerned that it's weird that we pass information about the stream as an argument to the `state` property. I'd want to one day change the methods on `state` to be methods on `stream` instead.

In other words, ignore me. I'm just sleep deprived. Move along, nothing to see here!
```rust
// Idling for a moment here is necessary to ensure that the client
// enqueues its TRAILERS frame *before* we send the RST_STREAM frame
// which causes the panic.
.idle_ms(1)
```
Is this racy at all? Maybe we need a way to tell the mocks to wait on a future, to allow for synchronization?
It could be potentially racy; changing the mocks to allow waiting on a future seems like a better approach long-term.
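To illustrate the idea (this is a generic sketch of synchronizing on a signal instead of sleeping, not h2's actual mock API; the channel-based hook here is an assumption about what such a mock extension might look like):

```rust
use std::sync::mpsc;
use std::thread;

fn main() {
    // Instead of idling for a fixed 1ms (which can race), the test
    // harness could block until the client signals that it has
    // enqueued its TRAILERS frame.
    let (enqueued_tx, enqueued_rx) = mpsc::channel();

    let client = thread::spawn(move || {
        // ... client work: enqueue the TRAILERS frame here ...
        enqueued_tx.send(()).expect("test harness hung up");
    });

    // Deterministic: wait for the signal rather than sleeping.
    enqueued_rx.recv().expect("client never enqueued TRAILERS");
    // ... now it is safe to send the RST_STREAM frame ...

    client.join().unwrap();
}
```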
```rust
// If the stream has a queued EOS frame, transition to peer
// reset.
trace!("recv_reset: reason={:?}; queued=true", reason);
self.inner = Closed(Cause::Proto(reason));
```
So this fixes it because, while the stream is still queued, its state has changed, so the queue later skips this frame, right?
That's correct.
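To make the transition concrete, here is a minimal, self-contained sketch of the state logic being discussed. The `Reason`, `Cause`, `Inner`, and `State` types below are simplified stand-ins for h2's internal types, not the actual implementation:

```rust
// Simplified stand-ins for h2's internal stream-state types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Reason {
    Cancel,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Cause {
    EndStream,
    Proto(Reason),
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Inner {
    HalfClosedRemote,
    Closed(Cause),
}

struct State {
    inner: Inner,
}

impl State {
    /// Handle a received RST_STREAM. `queued` reports whether the
    /// stream still has frames (e.g. a queued EOS) waiting to be sent.
    fn recv_reset(&mut self, reason: Reason, queued: bool) {
        match self.inner {
            // Closed locally by enqueuing an EOS frame that has not
            // gone out yet: treat the stream as peer-reset so the
            // send queue will later skip (and drain) its frames.
            Inner::Closed(Cause::EndStream) if queued => {
                self.inner = Inner::Closed(Cause::Proto(reason));
            }
            // Fully closed with nothing queued: ignore the reset.
            Inner::Closed(_) => {}
            // Any other state: RST_STREAM closes the stream.
            _ => {
                self.inner = Inner::Closed(Cause::Proto(reason));
            }
        }
    }
}

fn main() {
    // EOS enqueued but not yet sent: reset transitions to Proto.
    let mut s = State { inner: Inner::Closed(Cause::EndStream) };
    s.recv_reset(Reason::Cancel, true);
    assert_eq!(s.inner, Inner::Closed(Cause::Proto(Reason::Cancel)));

    // EOS already flushed: the reset is ignored.
    let mut s = State { inner: Inner::Closed(Cause::EndStream) };
    s.recv_reset(Reason::Cancel, false);
    assert_eq!(s.inner, Inner::Closed(Cause::EndStream));

    // An open-ish state: RST_STREAM closes the stream.
    let mut s = State { inner: Inner::HalfClosedRemote };
    s.recv_reset(Reason::Cancel, false);
    assert_eq!(s.inner, Inner::Closed(Cause::Proto(Reason::Cancel)));
}
```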
Fixes #246. Fixes linkerd/linkerd2#607.

An issue currently exists where `h2` will panic when a stream transitions from `HalfClosed(remote)` to `Closed(Cause::EndStream)` by enqueuing an EOS frame, and then receives a `RST_STREAM` frame before the queued EOS frame is sent.

In this case, the `expect` in `prioritize::pop_frame` (https://github.com/carllerche/h2/blob/f8baeb7211985834acaf7ecc70548aafd5d9ef6d/src/proto/streams/prioritize.rs#L693-L694) is called on a frame which does not have a scheduled reset, because the stream is in the `Closed(Cause::EndStream)` state. However, the guard which skips this code when the stream is peer-reset (https://github.com/carllerche/h2/blob/f8baeb7211985834acaf7ecc70548aafd5d9ef6d/src/proto/streams/prioritize.rs#L584-L586) doesn't cause us to skip in this case, again because the stream is in the `Closed(Cause::EndStream)` state.

I've modified `streams::State::recv_reset` to take a boolean parameter indicating whether the stream receiving a reset currently has a frame queued to send. If the stream is in the `Closed(Cause::EndStream)` state, rather than doing nothing, we check whether there is a queued frame. If so, the stream transitions to `Closed(Cause::Proto)` with the reason from the received reset.

This way, when `prioritize::pop_frame` sees this stream, it will be treated like any other reset stream, draining the queue and transitioning to reset. We no longer hit the `expect` call that was causing panics in this case.

I've also added a unit test that reproduces the crash against master.