
recv_reset resets closed streams with queued EOS frames #247

Merged · 7 commits · Mar 28, 2018

Conversation

Collaborator

@hawkw hawkw commented Mar 27, 2018

Fixes #246. Fixes linkerd/linkerd2#607.

An issue currently exists where h2 will panic when a stream transitions from HalfClosed(remote) to Closed(Cause::EndStream) by enqueuing an EOS frame, and then receives a RST_STREAM frame before the queued EOS frame is sent.

In this case, the expect in prioritize::pop_frame https://github.com/carllerche/h2/blob/f8baeb7211985834acaf7ecc70548aafd5d9ef6d/src/proto/streams/prioritize.rs#L693-L694 is reached for a stream which does not have a scheduled reset, because the stream is in the Closed(Cause::EndStream) state. However, the guard which skips this code when the stream was reset by the peer https://github.com/carllerche/h2/blob/f8baeb7211985834acaf7ecc70548aafd5d9ef6d/src/proto/streams/prioritize.rs#L584-L586 doesn't cause us to skip in this case, for the same reason: the stream is in the Closed(Cause::EndStream) state.


I've modified streams::State::recv_reset to take a boolean parameter indicating whether the stream receiving the reset currently has a frame queued to send. If the stream is in the Closed(Cause::EndStream) state, rather than doing nothing, we check whether there is a queued frame; if so, the stream transitions to Closed(Cause::Proto), carrying the reason from the received reset.

This way, when prioritize::pop_frame sees this stream, it will be treated like any other reset stream, draining the queue and transitioning to reset. We no longer hit the expect call that was causing panics in this case.
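
For illustration, here is a minimal, self-contained sketch of that transition. The enum shapes and names below are simplified stand-ins for the real types in src/proto/streams/state.rs, not the actual h2 code:

```rust
// Simplified stand-ins for the types in src/proto/streams/state.rs.
#[derive(Debug, Clone, Copy)]
struct Reason(u32); // stand-in for frame::Reason

enum Cause {
    EndStream,
    Proto(Reason),
}

enum Inner {
    HalfClosedRemote,
    Closed(Cause),
}

struct State {
    inner: Inner,
}

impl State {
    /// `queued` indicates whether the stream still has a frame (here, the
    /// EOS frame) waiting in the send queue.
    fn recv_reset(&mut self, reason: Reason, queued: bool) {
        match self.inner {
            // The stream closed itself by enqueuing an EOS frame that is
            // still queued: treat the incoming RST_STREAM as a peer reset
            // so the queued frame is drained rather than sent.
            Inner::Closed(Cause::EndStream) if queued => {
                self.inner = Inner::Closed(Cause::Proto(reason));
            }
            // A fully closed stream with nothing queued ignores the reset.
            Inner::Closed(..) => {}
            // Any other state transitions to peer reset, as before.
            _ => {
                self.inner = Inner::Closed(Cause::Proto(reason));
            }
        }
    }
}

fn main() {
    let mut state = State { inner: Inner::Closed(Cause::EndStream) };
    state.recv_reset(Reason(0x8), true); // 0x8 = CANCEL
    match state.inner {
        Inner::Closed(Cause::Proto(_)) => println!("treated as a peer reset"),
        _ => unreachable!(),
    }
}
```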

I've also added a unit test that reproduces the crash against master.

@hawkw hawkw added the bug label Mar 27, 2018
@hawkw hawkw self-assigned this Mar 27, 2018
Collaborator

@carllerche carllerche left a comment


Thanks for digging into this. The idea of the fix looks good to me.

As discussed on Slack, I believe that instead of transitioning to Closed(Scheduled), we would want to transition to some other state. This is because, ideally, we would not send out a RST_STREAM.

.idle_ms(1)
// Send the RST_STREAM frame which causes the client to panic.
.send_frame(frames::reset(1).cancel())
.recv_frame(frames::reset(1).cancel())
Collaborator


I believe that, ideally, we would avoid sending a RST_STREAM. While the peer should be able to tolerate this, it is not necessary as a RST_STREAM in either direction results in the stream transitioning to closed.

Collaborator Author


Yeah, I think I misread a sentence in the spec saying that we should allow this as saying that we should do this. Had I known that earlier, this PR would have been finished much quicker. My bad!

Corrected this behaviour in 861b109.

@hawkw hawkw changed the title recv_reset reschedules closed streams with queued EOS frames to reset recv_reset resets closed streams with queued EOS frames Mar 27, 2018
Collaborator

@carllerche carllerche left a comment


LGTM, just one request to remove commented out code.

@seanmonstar thoughts?

@@ -317,6 +324,14 @@ impl State {
}
}

// /// Returns true if the stream was closed by sending an EOS.
// pub fn is_eos(&self) -> bool {
Collaborator


Oops, I meant to mention this in the first review, would it be possible to remove this commented out code?

Collaborator Author


Oh, yeah, missed that when committing. Sorry.

@hawkw hawkw dismissed carllerche’s stale review March 27, 2018 21:35

removed commented out code

Member

@seanmonstar seanmonstar left a comment


\o/

@@ -590,7 +590,7 @@ impl Recv {
/// Handle remote sending an explicit RST_STREAM.
pub fn recv_reset(&mut self, frame: frame::Reset, stream: &mut Stream) {
// Notify the stream
stream.state.recv_reset(frame.reason());
stream.state.recv_reset(frame.reason(), stream.is_pending_send);
Member


I'd love to eventually make the state private. It reads weird that we need to pass in extra information about the stream itself. (Not your fault, I'm just pining for a better future some day :D).

Collaborator Author


As far as I can tell, there doesn't appear to be any harm in forcing the transition for all Closed(Cause::EndStream) states on receipt of RST_STREAM, regardless of whether or not the stream has a pending send; but @carllerche suggested that we only transition to Closed(Cause::Proto) if the stream is queued.

Collaborator


I prefer to stay on the conservative side and only make changes that we know are needed.

If there are no queued send frames, then the stream state is actually closed and the received reset should be ignored. (I think this is already done.)

Member


I don't think I was clear. I get why we're using the queued information, that's fine! We don't want to trigger a reset error for a stream the user has already closed locally.

I was more concerned that it's weird that we pass information about the stream as an argument into a method on its state property. I'd want to one day change the methods on the state to be methods on the stream instead.

In other words, ignore me. I'm just sleep deprived. Move along, nothing to see here!

// Idling for a moment here is necessary to ensure that the client
// enqueues its TRAILERS frame *before* we send the RST_STREAM frame
// which causes the panic.
.idle_ms(1)
Member


Is this racy at all? Maybe we need a way to tell the mocks to wait on a future, to allow for synchronization?

Collaborator Author


It could potentially be racy; changing the mocks to allow waiting on a future seems like a better approach long-term.
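
For the record, here is a purely hypothetical sketch of what that could look like, using a futures 0.1 oneshot channel as the synchronization point; `wait_for` is an assumed combinator that does not exist in the current mock harness:

```rust
extern crate futures; // futures 0.1, as used by h2 at the time

use futures::sync::oneshot;

fn main() {
    // The client task would complete this channel once its TRAILERS frame
    // has been enqueued...
    let (trailers_queued_tx, trailers_queued_rx) = oneshot::channel::<()>();
    trailers_queued_tx.send(()).unwrap();

    // ...and the mock peer would wait on the receiver before sending
    // RST_STREAM, instead of sleeping for a fixed millisecond:
    //
    //     srv.wait_for(trailers_queued_rx)              // hypothetical API
    //        .send_frame(frames::reset(1).cancel());
    let _ = trailers_queued_rx;
}
```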

// If the stream has a queued EOS frame, transition to peer
// reset.
trace!("recv_reset: reason={:?}; queued=true", reason);
self.inner = Closed(Cause::Proto(reason));
Member


So this fixes it because, while the stream is still queued, its state has changed, so the queue later skips this frame, right?

Collaborator Author


That's correct.
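
For anyone following along, here is a rough, self-contained stand-in for that path; the real logic lives in prioritize::pop_frame, and the types below are simplified placeholders:

```rust
use std::collections::VecDeque;

struct Frame; // placeholder for a queued frame

struct State {
    peer_reset: bool,
}

impl State {
    fn is_peer_reset(&self) -> bool {
        self.peer_reset
    }
}

struct Stream {
    state: State,
    pending_send: VecDeque<Frame>,
}

// Simplified stand-in for prioritize::pop_frame.
fn pop_frame(stream: &mut Stream) -> Option<Frame> {
    loop {
        let frame = stream.pending_send.pop_front()?;
        // Because recv_reset now moves Closed(Cause::EndStream) streams
        // with queued frames into Closed(Cause::Proto), this guard fires
        // and the queued EOS frame is dropped instead of sent; the old
        // `expect` on a scheduled reset reason is never reached.
        if stream.state.is_peer_reset() {
            continue;
        }
        return Some(frame);
    }
}

fn main() {
    let mut stream = Stream {
        state: State { peer_reset: true },
        pending_send: VecDeque::from(vec![Frame]),
    };
    // The queued EOS frame is drained, not returned for sending.
    assert!(pop_frame(&mut stream).is_none());
}
```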
