Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Misc bug fixes #273

Merged
merged 17 commits into from
May 9, 2018
Merged

Misc bug fixes #273

merged 17 commits into from
May 9, 2018

Conversation

carllerche
Copy link
Collaborator

@carllerche carllerche commented May 6, 2018

In an effort to test #261, I added new assertions checking that stream counts is zero when the connection finalizes and that stream state gets released (the slab is empty on drop).

Adding these assertions uncovered a number of bugs. This PR fixes the discovered bugs.

I would recommend reviewing each commit individually as they are pretty independent.

When the HTTP/2.0 connection finalizes, this implies all associates
streams must have been completed (either successfully or forcefully).
This implies that the internal stream count has returned to zero.

This patch adds a debug level assertion performing this check.
This patch fixes a bug where stream counts were not decremented for
streams that closed due to a reset frame sent by the local peer that got
stored in the prioritization queue.
The transition function did not correctly handle a stream going from
"reserved" to counted. This modifies the function to detect a count
increment in order to correctly decrement if needed.
Previously, the connection's current number of active streams is
decremented based on a number of conditions. This logic is fairly error
prone.

This patch adds a new field to the stream state: `is_counted`. When a
stream results in the number of active streams to be incremented, the
stream state tracks this as a bool. Then, during a state transition that
could result in the number of active streams to be decremented, this
bool can be checked.
A GO_AWAY frame is sent before the connection closes.
This patch adds a debug level assertion checking that state associated
with streams is successfully released. This is done by ensuring that no
stream state remains when the connection inner state is dropped. This
does not happen until all handles to the connection drops, which should
result in all streams being dropped.

This also pulls in a fix for one bug that is exposed with this change.
The fix is first proposed as part of #261.
Being stored in a queue prevents stream state from being released. Once
the stream is popped from the queue, `counts.transition` must be called
to release the stream state if the queue was the final reference to the
stream.
This is yet another queue that can hold on to the stream ref count.
If a connection handle is dropped early, it can result in stream handles
hanging. This patch clears queues and notifies open streams when the
connection handle is dropped.
`Streams` can be cloned to get multiple handles to the inner state.
@carllerche
Copy link
Collaborator Author

Still looks like there are bugs!

B: IntoBuf,
{
fn drop(&mut self) {
self.streams.recv_eof();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is dropped because the thread panicked, this will trigger an abort (recv_eof unwraps a lock).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, we can have recv_eof() return Err instead of unwrapping the poisoned mutex.

assert!(self.can_inc_num_recv_streams());
assert!(!stream.is_counted);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This question applies for most of the new asserts: do we want to be adding new asserts everywhere, or just debug_assert? Is it better for this to crash in production?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it depends on the severity of the assertion failure.

If this fails, then we could get into situations where subtractions wrap around. I think this should be guarded again?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any further thoughts on this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, so of course, crashing in production is a major bummer, as it destroys all in progress requests... In this specific case, it seems like if this assert were to fail, then the subtraction would eventually underflow and crash anyways. So perhaps this is fine...

(It'd probably be neat if we could just encode this in the types somehow instead, but 🤷)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah.... ideally it would only invalidate the connection.

That said, we have a whole bunch of existing assert! statements. Maybe we should punt this as an open question: "how to better deal w/ internal errors".

@carllerche carllerche merged commit cf62b78 into master May 9, 2018
@carllerche carllerche deleted the misc-fixes branch May 21, 2018 19:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants