HDDS-11227. Use server default key provider to encrypt/decrypt keys from multiple OMs. #7081
Conversation
Is this PR self-contained? I.e., it looks like there should be extra code for the client to call getServerDefaults() and determine the key provider path to be used?
Also, if you intend to implement Hadoop's getServerDefaults() FileSystem API, you'd need to use Hadoop's FsServerDefaults class rather than creating a new one, or at least extend Hadoop's FsServerDefaults base class.
The latest change makes sense. But you'll have to move getServerDefaults() from OzoneFileSystem/RootedOzoneFileSystem to BasicOzoneFileSystem/BasicRootedOzoneFileSystem. Unfortunately we have this mysterious multi-level architecture to support the Hadoop 2/3 runtimes. As a rule of thumb, if a FileSystem API has existed for a long time (such as getServerDefaults()), it most likely exists in Hadoop 2 as well, so BasicOzoneFileSystem/BasicRootedOzoneFileSystem is the right place to add it.
@@ -67,6 +68,11 @@ public RootedOzoneFileSystem() {
    forceRecovery = Strings.isNullOrEmpty(force) ? false : Boolean.parseBoolean(force);
  }

  @Override
move to BasicRootedOzoneFileSystem
@@ -65,6 +66,11 @@ public OzoneFileSystem() {
    forceRecovery = Strings.isNullOrEmpty(force) ? false : Boolean.parseBoolean(force);
  }

  @Override
move to BasicOzoneFileSystem
The client calls getKeyProviderUri(), which internally calls the getServerDefaults() method to get the URI from the OzoneManager. This is implemented similarly in HDFS: https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java#L3180-L3183
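The lookup order described above can be sketched as follows. This is a minimal illustration, not the actual Ozone client code: the class name, the string parameters standing in for the OM RPC, and the local configuration value are all hypothetical; the point is only the preference order (server-provided URI first, local key provider as fallback).

```java
import java.net.URI;

/**
 * Sketch of the key-provider lookup order: prefer the URI the server
 * reports via getServerDefaults(), fall back to the client's local
 * configuration. Names and signatures are illustrative only.
 */
public class KeyProviderUriResolver {

  /**
   * @param serverDefaultsUri URI reported by the server's getServerDefaults()
   *                          (may be null or empty when the server has none)
   * @param localConfUri      client-side configured URI (may be null)
   * @return the URI the client should use, or null if neither side has one
   */
  public static URI resolveKeyProviderUri(String serverDefaultsUri,
                                          String localConfUri) {
    // The server-provided URI wins, so one client can talk to multiple
    // clusters, each with its own KMS.
    if (serverDefaultsUri != null && !serverDefaultsUri.isEmpty()) {
      return URI.create(serverDefaultsUri);
    }
    // Fall back to the local key provider configuration.
    return localConfUri == null ? null : URI.create(localConfUri);
  }

  public static void main(String[] args) {
    // the server-provided URI wins over the local one
    System.out.println(resolveKeyProviderUri(
        "kms://http@om-host:9600/kms", "kms://http@local:9600/kms"));
  }
}
```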
@fapifta could you please take a look as well?
Code looks good. A little more test coverage would be even better.
@@ -2591,11 +2603,22 @@ public KeyProvider call() throws Exception {
    }
  }

  @Override
  public OzoneFsServerDefaults getServerDefaults() throws IOException {
Would be great to have a test to verify this, making sure that a second call to this API within the expiration time does not invoke ozoneManagerClient.getServerDefaults().
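The caching behavior the reviewer wants verified can be tested without sleeping by injecting a clock. The sketch below is not the actual Ozone implementation; the class, the `Supplier` standing in for the ozoneManagerClient RPC, and the clock wiring are all illustrative, but it shows the shape of such a test: count backend invocations, advance a fake clock, and assert the second call within the validity period is a cache hit.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/**
 * Sketch of a time-bounded cache around a getServerDefaults()-style RPC,
 * with an injectable clock so expiry can be tested deterministically.
 * Illustrative only; not the actual Ozone client code.
 */
public class CachedServerDefaults {
  private final Supplier<String> rpc;      // stand-in for the OM RPC
  private final LongSupplier clockMillis;  // injectable clock for tests
  private final long validityMillis;

  private String cached;
  private long fetchedAtMillis;

  public CachedServerDefaults(Supplier<String> rpc, LongSupplier clockMillis,
                              long validityMillis) {
    this.rpc = rpc;
    this.clockMillis = clockMillis;
    this.validityMillis = validityMillis;
  }

  public synchronized String getServerDefaults() {
    long now = clockMillis.getAsLong();
    if (cached == null || now - fetchedAtMillis > validityMillis) {
      cached = rpc.get();  // only invoked when the cache is empty or stale
      fetchedAtMillis = now;
    }
    return cached;
  }

  public static void main(String[] args) {
    AtomicInteger rpcCalls = new AtomicInteger();
    long[] now = {0L};
    CachedServerDefaults c = new CachedServerDefaults(
        () -> "defaults#" + rpcCalls.incrementAndGet(),
        () -> now[0],
        TimeUnit.HOURS.toMillis(1));
    c.getServerDefaults();                       // first call hits the "RPC"
    now[0] += TimeUnit.MINUTES.toMillis(30);
    c.getServerDefaults();                       // within validity: cache hit
    System.out.println("rpc calls: " + rpcCalls.get());  // prints 1
    now[0] += TimeUnit.HOURS.toMillis(2);
    c.getServerDefaults();                       // expired: refetched
    System.out.println("rpc calls: " + rpcCalls.get());  // prints 2
  }
}
```

In a real JUnit test the fake clock would replace `Time.monotonicNow()` (or whatever clock the client uses), and the invocation counter would be a mock of ozoneManagerClient.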
@@ -1107,6 +1108,11 @@ public short getDefaultReplication() {
    return adapter.getDefaultReplication();
  }

  @Override
  public OzoneFsServerDefaults getServerDefaults() throws IOException {
Need tests at the file system level for ofs and o3fs.
PR merged. Thank you, @SaketaChalamchala. Please proceed with the regression test in the follow-up task.
<property>
  <name>ozone.client.server-defaults.validity.period.ms</name>
  <tag>OZONE, CLIENT, SECURITY</tag>
  <value>3600000</value>
nit: we should add the default unit, otherwise there will be an unnecessary log like the one below.
24/09/18 15:43:05 INFO Configuration.deprecation: No unit for ozone.client.server-defaults.validity.period.ms(3600000) assuming MILLISECONDS
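To illustrate why the log appears: Hadoop's duration parsing accepts values with a unit suffix (e.g. `3600000ms` or `1h`) and warns when the suffix is missing, as for the bare `3600000` above. The toy parser below is not Hadoop's code (Hadoop's real entry point is `Configuration.getTimeDuration()`); it is a simplified sketch of that suffix handling.

```java
import java.util.concurrent.TimeUnit;

/**
 * Simplified sketch of unit-suffix handling in duration config values.
 * Mimics the behavior that produces the "No unit ... assuming MILLISECONDS"
 * log; not Hadoop's actual implementation.
 */
public class DurationParser {

  /** Parses a duration value into milliseconds, honoring a unit suffix. */
  public static long parseMillis(String value) {
    String v = value.trim();
    if (v.endsWith("ms")) {            // explicit milliseconds
      return Long.parseLong(v.substring(0, v.length() - 2));
    } else if (v.endsWith("s")) {      // seconds
      return TimeUnit.SECONDS.toMillis(Long.parseLong(v.substring(0, v.length() - 1)));
    } else if (v.endsWith("m")) {      // minutes
      return TimeUnit.MINUTES.toMillis(Long.parseLong(v.substring(0, v.length() - 1)));
    } else if (v.endsWith("h")) {      // hours
      return TimeUnit.HOURS.toMillis(Long.parseLong(v.substring(0, v.length() - 1)));
    }
    // No unit suffix: this is the case where Hadoop logs
    // "No unit for <key>(<value>) assuming MILLISECONDS".
    System.out.println("No unit for value " + v + ", assuming MILLISECONDS");
    return Long.parseLong(v);
  }

  public static void main(String[] args) {
    System.out.println(parseMillis("3600000ms"));  // explicit unit, no warning
    System.out.println(parseMillis("3600000"));    // bare number, warns
  }
}
```

Setting the value to `3600000ms` (or `1h`) keeps the parsed duration identical while silencing the log.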
And .ms should be omitted from the property name.
What changes were proposed in this pull request?
Right now, we cannot use a single client to read/write encrypted files to multiple Ozone clusters across trusted realms.
This is already resolved in HDFS: HDFS overrides the implementation of hadoop.fs.FileSystem.getServerDefaults(), where the Namenode returns its default keyProviderUri to the client. The client then uses the Namenode's keyProvider instead of its local keyProvider to correctly read/write encrypted files.
The proposed solution does the same in Ozone. The OzoneManager returns its server default configuration, which includes its keyProviderUri, to the client. The client then uses the OM's keyProviderUri, if available, to get encryption keys before defaulting to its local keyProvider. This enables the client to correctly encrypt/decrypt files from multiple Ozone clusters.
What is the link to the Apache JIRA?
https://issues.apache.org/jira/browse/HDDS-11227
How was this patch tested?
Manual test.
Before the fix.
After the fix.
Created a separate JIRA for a docker test: https://issues.apache.org/jira/browse/HDDS-11332