Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-47172][CORE] Add support for AES-GCM for RPC encryption #46515

Closed
wants to merge 8 commits into from

Conversation

sweisdb
Copy link
Contributor

@sweisdb sweisdb commented May 9, 2024

What changes were proposed in this pull request?

This change adds AES-GCM as an optional AES cipher mode for RPC encryption. The current default is using AES-CTR without any authentication. That would allow someone on the network to easily modify RPC contents on the wire and impact Spark behavior. See SPARK-47172 for more details.

Why are the changes needed?

The current default is using AES-CTR without any authentication. That would allow someone on the network to easily modify RPC contents on the wire and impact Spark behavior.

Does this PR introduce any user-facing change?

Yes, it adds an additional configuration flag is reflected in the documentation.

How was this patch tested?

Existing unit tests are all ensured to pass. New unit tests are written to explicitly test GCM support and to verify that modifying ciphertext content will cause an exception and fail.

build/sbt "network-common/test:testOnly"
build/sbt "network-common/test:testOnly org.apache.spark.network.crypto.AuthIntegrationSuite"
build/sbt "network-common/test:testOnly org.apache.spark.network.crypto.AuthEngineSuite"

Was this patch authored or co-authored using generative AI tooling?

Nope.

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-47172] Add support for AES-GCM for RPC encryption [SPARK-47172][CORE] Add support for AES-GCM for RPC encryption May 9, 2024
@dongjoon-hyun
Copy link
Member

cc @mridulm

@mridulm
Copy link
Contributor

mridulm commented May 14, 2024

Took a quick pass through it, sorry for the delay.

+CC @JoshRosen as well.

Copy link
Contributor

@mridulm mridulm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a quick pass, not yet looked through tests and doc.
Thanks for working on this @sweisdb !

Copy link
Contributor

@mridulm mridulm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took a pass through it.
Can you also address pending comments please ? Thanks

Copy link
Contributor

@mridulm mridulm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few comments, it is looking pretty close to being done.

+CC @Ngone51, @JoshRosen as well - I would really like more eyes on this given it is a backport candidate for 3.x !

Cipher mode to use. Defaults "AES/CTR/NoPadding" for backward compatibility, which is not authenticated.
Recommended to use "AES/GCM/NoPadding", which is an authenticated encryption mode.
</td>
<td>4.0.0</td>
Copy link
Member

@Ngone51 Ngone51 Jun 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mridulm Can we still add the new feature since 4.0.0-preview1 has been passed?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was a preview release, I am sure we will have additional new features landing and existing behavior evolving before 4.0 RC gets cut :-)

Note that this specific PR will also get backported to 3.5 and 3.4 - given the security concerns it is mitigating.
So would love to hear your thoughts @Ngone51 - particularly with the intent to derisk the change.

If you need more context on why we need to backport this to 3.5/3.4, @sweisdb could provide you more context via email.

Copy link
Contributor

@mridulm mridulm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, thanks for adding support for this @sweisdb !

I will keep this open for a few days in case there are additional review comments.
+CC @dongjoon-hyun as well, given the plans to backport it to 3.5 and 3.4

@mridulm
Copy link
Contributor

mridulm commented Jun 18, 2024

@Ngone51, did you get a chance to review this ? (not sure if you synced with @sweisdb offline :-) )
Given this has to go into released branches, I want to make sure we have more eyes on it.

return transferredThisCall;
}
}
boolean lastSegment = getReadableBytes() == 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to track the segment read, shouldn't it be bytesRead == aesGcmHkdfStreaming.getPlaintextSegmentSize()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that would almost be correct, except sometimes the bytesRead may be segment aligned. You may be on the last segment and bytesRead could be less than or equal to a full segment.

I would use readableBytes() being empty as the clear indication that we just read the last segment worth of data.

Copy link
Member

@Ngone51 Ngone51 Jun 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I misunderstood it. It is last the one segment rather than the previous one segment. Then it looks good to me.

Copy link
Member

@Ngone51 Ngone51 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with two minor comments.

decrypterInit = true;
ciphertextRead += aesGcmHkdfStreaming.getHeaderLength();
if (expectedLength == ciphertextRead) {
// If the expected length is just the header, the ciphertext is 0 length.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for my curiosity: does this really happen in the real world?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't tell you whether there will never be a 0-length message sent in Spark. We probably want to be able to support that case if it ever happens upstream rather than throw an error here, right?

nettyBufReadableBytes,
ciphertextBuffer.remaining());
int expectedRemaining = (int) (expectedLength - ciphertextRead);
int bytesToRead = Integer.min(readableBytes, expectedRemaining);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Correct me if I'm wrong. It looks like it is an invalid case if readableBytes > expectedRemaining here. So we can actually fail eariler before reading the buffer.

@mridulm
Copy link
Contributor

mridulm commented Jun 21, 2024

@Ngone51 , can you merge it to master/branch-3.5 and branch-3.4 once you are done with the reviews ?
I might be afk, and dont want to block on this PR getting committed. Thanks !

@Ngone51
Copy link
Member

Ngone51 commented Jun 21, 2024

@mridulm Sure, will do.

@Ngone51 Ngone51 closed this in f0563ef Jun 21, 2024
@Ngone51
Copy link
Member

Ngone51 commented Jun 21, 2024

Thanks, merged to master!

There are conflicts with branch-3.5/branch-3.4. @sweisdb Could you open the separate PRs for 3.5/3.4? Thanks!

sweisdb added a commit to sweisdb/spark that referenced this pull request Jun 21, 2024
This change adds AES-GCM as an optional AES cipher mode for RPC encryption. The current default is using AES-CTR without any authentication. That would allow someone on the network to easily modify RPC contents on the wire and impact Spark behavior. See [SPARK-47172](https://issues.apache.org/jira/browse/SPARK-47172) for more details.

The current default is using AES-CTR without any authentication. That would allow someone on the network to easily modify RPC contents on the wire and impact Spark behavior.

Yes, it adds an additional configuration flag is reflected in the documentation.

Existing unit tests are all ensured to pass. New unit tests are written to explicitly test GCM support and to verify that modifying ciphertext content will cause an exception and fail.

`build/sbt "network-common/test:testOnly"`
`build/sbt "network-common/test:testOnly org.apache.spark.network.crypto.AuthIntegrationSuite"`
`build/sbt "network-common/test:testOnly org.apache.spark.network.crypto.AuthEngineSuite"`

Nope.

Closes apache#46515 from sweisdb/SPARK-47172.

Authored-by: Steve Weis <[email protected]>
Signed-off-by: Yi Wu <[email protected]>
@sweisdb
Copy link
Contributor Author

sweisdb commented Jun 21, 2024

@Ngone51 I created two backport pull requests here:
#47060
#47061

sweisdb added a commit to sweisdb/spark that referenced this pull request Jun 25, 2024
This change adds AES-GCM as an optional AES cipher mode for RPC encryption. The current default is using AES-CTR without any authentication. That would allow someone on the network to easily modify RPC contents on the wire and impact Spark behavior. See [SPARK-47172](https://issues.apache.org/jira/browse/SPARK-47172) for more details.

The current default is using AES-CTR without any authentication. That would allow someone on the network to easily modify RPC contents on the wire and impact Spark behavior.

Yes, it adds an additional configuration flag is reflected in the documentation.

Existing unit tests are all ensured to pass. New unit tests are written to explicitly test GCM support and to verify that modifying ciphertext content will cause an exception and fail.

`build/sbt "network-common/test:testOnly"`
`build/sbt "network-common/test:testOnly org.apache.spark.network.crypto.AuthIntegrationSuite"`
`build/sbt "network-common/test:testOnly org.apache.spark.network.crypto.AuthEngineSuite"`

Nope.

Closes apache#46515 from sweisdb/SPARK-47172.

Authored-by: Steve Weis <[email protected]>
Signed-off-by: Yi Wu <[email protected]>
* This method is for testing purposes only.
*/
@VisibleForTesting
public String getKeyId() throws GeneralSecurityException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the method comments, this is a method for testing purposes only. Why is it defined as public?

attilapiros pushed a commit to attilapiros/spark that referenced this pull request Oct 4, 2024
### What changes were proposed in this pull request?

This change adds AES-GCM as an optional AES cipher mode for RPC encryption. The current default is using AES-CTR without any authentication. That would allow someone on the network to easily modify RPC contents on the wire and impact Spark behavior. See [SPARK-47172](https://issues.apache.org/jira/browse/SPARK-47172) for more details.

### Why are the changes needed?

The current default is using AES-CTR without any authentication. That would allow someone on the network to easily modify RPC contents on the wire and impact Spark behavior.

### Does this PR introduce _any_ user-facing change?

Yes, it adds an additional configuration flag is reflected in the documentation.

### How was this patch tested?
Existing unit tests are all ensured to pass. New unit tests are written to explicitly test GCM support and to verify that modifying ciphertext content will cause an exception and fail.

`build/sbt "network-common/test:testOnly"`
`build/sbt "network-common/test:testOnly org.apache.spark.network.crypto.AuthIntegrationSuite"`
`build/sbt "network-common/test:testOnly org.apache.spark.network.crypto.AuthEngineSuite"`

### Was this patch authored or co-authored using generative AI tooling?
Nope.

Closes apache#46515 from sweisdb/SPARK-47172.

Authored-by: Steve Weis <[email protected]>
Signed-off-by: Yi Wu <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants