Validate improper TLS RSA certificates early-on #3496
network/peer/tls_config_test.go
Outdated
		expectedErr: errors.New("no certificates sent by peer"),
	},
	{
		description: "No TLS certs given",
Description is the same as the previous test case.
Were this change deployed, would it be possible for the RSA certs of existing validators to fail this new validation? If so, what would be the impact?
No. All I did was move the code that used to be called after the TLS handshake so that it runs during the handshake, when we parse the client's TLS key but before we use that key.
network/peer/tls_config_test.go
Outdated
		input: func() tls.ConnectionState {
			return tls.ConnectionState{}
		},
		expectedErr: errors.New("no certificates sent by peer"),
Any particular reason you don't want to use the error object you've already allocated (ErrNoCertsSent)?
ha, I rushed and forgot that, thanks
network/peer/upgrader_test.go
Outdated
	"github.com/ava-labs/avalanchego/staking"
)

// 8192RSA.pem is used here because it's too expensive
I don't have a strong feeling here, but maybe consider naming the file 8192RSA_test.pem so it's clear that it's used for testing purposes only?
I would say by putting a key in a file, implicitly it shouldn't be used in anything but tests ;-)
I changed it accordingly, though.
	go func() {
		defer wg.Done()
		conn, err := listener.Accept()
		require.NoError(t, err)
I suspect that this would generate a race error if an error is returned, since the Go testing framework doesn't allow failing a test from an inner goroutine.
To "solve" this issue, I'd suggest using an error channel instead of the WaitGroup and checking its result on the main goroutine.
Why? Because of the WaitGroup, all memory writes made by the goroutine are synchronized with the main goroutine. I tested it with the data race detector, and no data races were observed when the assertion failed.
I don't think this causes a race condition, but the way that t.FailNow() works is by calling panic, which is normally caught by the testing harness when running a Test or in t.Run(...), but here that panic will escape the goroutine and cause the test to crash.
I doubt this is the only place in avalanchego where this could happen... so 🤷 on if we should fix it.
Should we consider fully validating the peer certificate prior to the handshake rather than just RSA keys?
We currently require:
1. TLS certificates are no larger than 2048 bytes.
2. A public key is provided.
3. RSA modulus is either 2048 or 4096 bits.
4. RSA modulus is not even.
5. RSA exponent is 65537.
6. ECDSA keys are on the P-256 curve.
7. Ed25519 keys are disallowed (will be allowed later).
8. DSA keys are disallowed.
This PR introduces checks 2-5, but not 1 or 6-8.
Essentially, should we be trying to guarantee that x509 parsing -> custom verification can never produce a cert that isn't parsable by custom parsing?
	switch rsaKey := pk.(type) {
	case *rsa.PublicKey:
		return staking.ValidateRSAPublicKeyIsWellFormed(rsaKey)
This is definitely me just being overly defensive, but should we attempt to handle the case that rsaKey is nil here?
Test would look like:
	{
		description: "nil RSA key",
		input: func() tls.ConnectionState {
			key, err := rsa.GenerateKey(rand.Reader, 2048)
			require.NoError(t, err)
			x509CertWithNilPK := makeRSACertAndKey(t, key)
			x509CertWithNilPK.cert.PublicKey = (*rsa.PublicKey)(nil)
			return tls.ConnectionState{PeerCertificates: []*x509.Certificate{&x509CertWithNilPK.cert}}
		},
		expectedErr: peer.ErrEmptyPublicKey,
	},
Added test + check in ValidateRSAPublicKeyIsWellFormed.
		description: "Valid TLS cert",
		input: func() tls.ConnectionState {
			key, err := rsa.GenerateKey(rand.Reader, 2048)
			require.NoError(t, err)
I think these functions are referencing the wrong t. These would call operations on the TestValidateRSACertificate t rather than the inner t.Run(...) t.
Yes, you're right. I moved them because of code review comments and didn't notice it.
	conn, err := tls.Dial("tcp", listener.Addr().String(), &clientConfig)
	require.NoError(t, err)

	require.NoError(t, conn.Handshake())
I'm a bit surprised that this passes. Is this just because the client signature verification is performed last?
In TLS 1.3 we first negotiate the encryption and do the authentication last. The client is the last party to send its authentication, and the implementation doesn't wait for an acknowledgement from the server, I guess for performance reasons.
I "optimistically" pushed a follow-up PR to address 6+7
In this PR I'm just addressing low-hanging fruit that makes sense to handle early on, to reject nonsensical, exotic configurations that shouldn't be used, such as an avalanchego node not using the RSA verification exponent that is built into Go, or using an exceedingly large modulus of 8K bits. Of course, the set of certificates that are valid under x509 is much larger than what we need to accept.
This commit makes validation of TLS certificates with either too big RSA keys, or the wrong exponent, fail as soon as the remote node presents its TLS certificate, in contrast to after the TLS handshake. Signed-off-by: Yacov Manevich <[email protected]>
Why this should be merged
This commit makes validation of TLS certificates with either too big RSA keys, or the wrong exponent, fail as soon as the remote node presents its TLS certificate, in contrast to after the TLS handshake.
How this works
Activates the VerifyConnection callback in the tls.Config, which makes the TLS handshake code in the Go TLS stack invoke the function on the TLS certificate chain presented by the remote node.
How this was tested
Added a unit test for the new code.
Added unit tests for existing code, because why not?
Need to be documented in RELEASES.md?
No.