Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

occasional h2spec failures on jenkins #4855

Closed
lachlan-roberts opened this issue May 8, 2020 · 13 comments · Fixed by #4946 or #4950
Closed

occasional h2spec failures on jenkins #4855

lachlan-roberts opened this issue May 8, 2020 · 13 comments · Fixed by #4946 or #4950
Labels

Comments

@lachlan-roberts
Copy link
Contributor

Jetty version
jetty-10.0.x

Java version
14.0.1, vendor: AdoptOpenJDK

OS type/version
"linux", version: "4.15.0-99-generic", arch: "amd64", family: "unix"

Description

Occasional h2spec failures on jenkins, I've seen this a few times now but it usually disappears if you re-run the build.

https://jenkins.webtide.net/blue/organizations/jenkins/jetty.project/detail/jetty-9.4.x-4789-ShutdownThread/1/pipeline

Hypertext Transfer Protocol Version 2 (HTTP/2)
  5. Streams and Multiplexing
    5.1. Stream States
      × 11: closed: Sends a DATA frame
        -> The endpoint MUST treat this as a connection error of type STREAM_CLOSED.
           Expected: GOAWAY Frame (Error Code: STREAM_CLOSED)
                     RST_STREAM Frame (Error Code: STREAM_CLOSED)
                     Connection closed
             Actual: Timeout

Finished in 2.3936 seconds
145 tests, 144 passed, 0 skipped, 1 failed
@lachlan-roberts
Copy link
Contributor Author

got another failure for this on the 9.4.x build
https://jenkins.webtide.net/blue/organizations/jenkins/jetty.project/detail/jetty-9.4.x/1531/pipeline

Hypertext Transfer Protocol Version 2 (HTTP/2)
 8. HTTP Message Exchanges
   8.1. HTTP Request/Response Exchange
     8.1.2. HTTP Header Fields
       8.1.2.6. Malformed Requests and Responses
         × 2: Sends a HEADERS frame with the "content-length" header field which does not equal the sum of the multiple DATA frames payload length
           -> The endpoint MUST treat this as a stream error of type PROTOCOL_ERROR.
              Expected: GOAWAY Frame (Error Code: PROTOCOL_ERROR)
                        RST_STREAM Frame (Error Code: PROTOCOL_ERROR)
                        Connection closed
                Actual: DATA Frame (length:436, flags:0x01, stream_id:1)

@lorban
Copy link
Contributor

lorban commented May 28, 2020

For reference, the second failure reported by @lachlan-roberts is also appearing as:

Failed test cases:
	[8.1.2.6 - Sends a HEADERS frame with the "content-length" header field which does not equal the sum of the multiple DATA frames payload length] failed
 Expected: GOAWAY Frame (Error Code: PROTOCOL_ERROR)
 Actual: RST_STREAM Frame (Error Code: PROTOCOL_ERROR)

Both failures do correspond to this test: https://github.com/summerwind/h2spec/blob/v2.2.0/http2/8_1_2_6_malformed_requests_and_responses.go#L70

@sbordet
Copy link
Contributor

sbordet commented May 28, 2020

@lorban can you please check if there is a new version of h2spec and possibly update it?

I was never able to fail h2spec locally, no idea why it fails on Jenkins.

@olamy
Copy link
Member

olamy commented May 28, 2020

2.4.0 is available (https://github.com/summerwind/h2spec/releases) but this downloaded and build within the plugin here https://github.com/madgnome/h2spec-maven-plugin.
This plugin is not active maybe we should have our own fork?

@sbordet
Copy link
Contributor

sbordet commented May 28, 2020

@olamy last time I put up a PR and it was merged and a release done. Perhaps that would be enough, rather than forking.

@sbordet
Copy link
Contributor

sbordet commented May 28, 2020

@olamy cancel that, my issue is still open :|

@sbordet
Copy link
Contributor

sbordet commented May 28, 2020

can we run it without a maven plugin? just exec plugin maybe, running h2spec directly?

@olamy
Copy link
Member

olamy commented May 28, 2020

the problem is to store the specs binaries... we don't really want to download them for each build or store them in our jetty.project git repo.

@olamy
Copy link
Member

olamy commented May 29, 2020

branch jetty-9.4.x_h2s_spec_2.4.0 use fork of the plugin https://github.com/jetty-project/h2spec-maven-plugin. updated with spec 2.4.0.

@lorban
Copy link
Contributor

lorban commented Jun 5, 2020

I've modified h2spec to make it report the client's address/port of each failing test: summerwind/h2spec#114

sbordet added a commit that referenced this issue Jun 5, 2020
Code cleanups.

Signed-off-by: Simone Bordet <[email protected]>
sbordet added a commit that referenced this issue Jun 5, 2020
In case of bad usage of the API, we don't want to close()
the stream but just fail the callback, because the stream
may be performing actions triggered by a legit API usage.

Signed-off-by: Simone Bordet <[email protected]>
sbordet added a commit that referenced this issue Jun 5, 2020
In case of a call to `AsyncListener.onError()`, applications may decide to call
AsyncContext.complete() and that would be a correct usage of the Servlet API.

This case was not well handled and was wrongly producing a WARN log with an
`IllegalStateException`.

Signed-off-by: Simone Bordet <[email protected]>
sbordet added a commit that referenced this issue Jun 5, 2020
Completely rewritten `HttpTransportOverHTTP2.TransportCallback`.
The rewrite handles correctly asynchronous failures that now are executed
sequentially (and not concurrently) with writes.
If a write is in progress, the failure will just change the state and at the
end of the write a check on the state will determine what actions to take.

A session failure is now handled in HTTP2Session by first failing all the
streams - which notifies the Stream.Listeners - and then failing the session
- which notifies the Session.Listener.
The stream failures are executed concurrently by dispatching each one to a
different thread; this means that the stream failure callbacks are executed
concurrently (likely sending RST_STREAM frames).
The session failure callback is completed only when all the stream failure
callbacks have completed, to ensure that a GOAWAY frame is processed after
all the RST_STREAM frames.

Signed-off-by: Simone Bordet <[email protected]>
@sbordet sbordet linked a pull request Jun 5, 2020 that will close this issue
@sbordet
Copy link
Contributor

sbordet commented Jun 5, 2020

The main issue that was causing h2spec failures was the following.

The client (h2spec) was sending an invalid frame after legit frames that triggered an HTTP response.

While the response frame were being generated, the invalid frame was read concurrently and eventually call HttpTransportOverHTTP2.TransportCallback.failed().

There, the TransportCallback had a nested HttpChannel.SendCallback because a write was initiated but it was actually in progress of generating the frames.

One such frame is the DATA frame for the response content, which had a reference to HttpOutput._aggregate buffer.

The frame generation would look at the _aggregate buffer, determine that there are N bytes to write and produce a frame header with body length N.

Concurrently, failing the TransportCallback fails the nested HttpChannel.SendCallback, which eventually clears the _aggregate buffer.

When the DATA frame is written on the network, the frame header says that N bytes will follow, but since the _aggregate buffer has been cleared, those N bytes are not written.

After that, the server may write some other frame such as a RST_STREAM or GOAWAY frame.

The client sees a DATA frame header saying that N bytes of content will come, but instead it sees 0 bytes of content and a smaller number of bytes - the RST_STREAM or GOAWAY frames written by the server - that are interpreted as content but are too few and causes the client to fail with an early EOF.

The solution is that the TransportCallback must run the failed() logic sequentially with respect to the send() operation: either before or after, but not concurrently with the frame generation to avoid the concurrent access to the HttpOutput._aggregate buffer.

sbordet added a commit that referenced this issue Jun 5, 2020
Fixed notification of HTTP2Session.abort(), that must fail all the streams
before failing the session.

Signed-off-by: Simone Bordet <[email protected]>
sbordet added a commit that referenced this issue Jun 8, 2020
Added javadocs after review.

Signed-off-by: Simone Bordet <[email protected]>
sbordet added a commit that referenced this issue Jun 8, 2020
More comments after review.

Signed-off-by: Simone Bordet <[email protected]>
sbordet added a commit that referenced this issue Jun 8, 2020
Updates after review.

Signed-off-by: Simone Bordet <[email protected]>
@lorban
Copy link
Contributor

lorban commented Jun 8, 2020

There's another race condition (6.9.2. Initial Flow-Control Window Size) that seems to be in the h2spec tool itself this time: summerwind/h2spec#115

sbordet added a commit that referenced this issue Jun 8, 2020
Updates after review.

Signed-off-by: Simone Bordet <[email protected]>
@gregw gregw linked a pull request Jun 9, 2020 that will close this issue
gregw pushed a commit that referenced this issue Jun 9, 2020
@gregw
Copy link
Contributor

gregw commented Jun 9, 2020

Merged PR didn't fix the issue, it just disabled the test

@gregw gregw reopened this Jun 9, 2020
sbordet added a commit that referenced this issue Jun 9, 2020
Fixed reset() to remember the failure.

Signed-off-by: Simone Bordet <[email protected]>
@sbordet sbordet closed this as completed in 56bda1b Jun 9, 2020
sbordet added a commit that referenced this issue Jun 26, 2020
Partially reverted the changes introduced in #4855, because they
were working only when sends were synchronous.

Introduced ByteBufferPool.remove(ByteBuffer) to fix the issue.
Now when a concurrent failure happens while frames are being
generated or sent, the buffer is discarded instead of being
recycled, therefore resolving the buffer corruption.

Signed-off-by: Simone Bordet <[email protected]>
sbordet added a commit that referenced this issue Jul 3, 2020
Fixes #4967 - Possible buffer corruption in HTTP/2 session failures

Partially reverted the changes introduced in #4855, because they
were working only when sends were synchronous.

Introduced ByteBufferPool.remove(ByteBuffer) to fix the issue.
Now when a concurrent failure happens while frames are being
generated or sent, the buffer is discarded instead of being
recycled, therefore resolving the buffer corruption.

Signed-off-by: Simone Bordet <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants