Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add additional checks for dataset importing #883

Merged
merged 18 commits into from
Dec 6, 2023

Conversation

nikpodsh
Copy link
Contributor

Feature or Bugfix

Feature

Detail

Added additional checks for dataset importing (see #614 )

Implemented checks that fail if:

  1. An importing dataset bucket is encrypted with AWS managed key, but KMS key provided
  2. An importing dataset bucket is encrypted with KMS key, but KMS key not provided
  3. An importing dataset bucket is encrypted with KMS key, but provided KMS key is wrong
  4. An imported dataset bucket is encrypted with AWS managed key, but user is trying to create a share requests for different account than dataset's account

From #614:

User can forget to grant the pivotRole access to modify the key policy -> All folder/bucket shares will fail when trying to update the key policy

The check for this is not implemented. Rationale for this:
There are two types of pivotRole: manual pivotRole (1) and cdk pivotRole (2).

  1. Manual pivotRole has kms:PutKeyPolicy for '*' resources in the environment. No action needed
  2. When dataset stack is deployed (for imported datasets), it triggers update of an environment stack. The environment stack updates policies for all imported keys and datasets. Eventually, it sets kms:PutKeyPolicy for the imported key that pivotRole can change the policy. Thus, the users can catch the error only if they try to create share before update of the stack is succeed

Relates

#614

Testing

Added unit test for checks in this PR. Added example of testing data.all aws clients, by mocking boto3 client. Usually, aws clients are mocked in the integration testing and this test can be used as an example how to mock boto3 instead.

Security

Please answer the questions below briefly where applicable, or write N/A. Based on
OWASP 10.

  • Are you introducing any new policies/roles/users? Yes
    • Adding s3:GetEncryptionConfiguration to the pivotRole to fetch the encryption configuration for env buckets. It's needed to implement additional checks. EncryptionConfiguration provides only information about encryption (encryption type and key) and doesn't allow pivotRole to read the content of the bucket.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@dlpzx
Copy link
Contributor

dlpzx commented Nov 30, 2023

We will pick up this task @nikpodsh

# Conflicts:
#	backend/dataall/modules/datasets/services/dataset_service.py
@dlpzx
Copy link
Contributor

dlpzx commented Dec 4, 2023

Testing locally:

  • Create Dataset
  • Import SSE-S3 Bucket and adding KMS Alias throws error
  • Import SSE-S3 Bucket without adding KMS Alias succeeds
  • Import KMS Bucket and without adding KMS Alias throws error An error occurred (RequiredParameter): KmsAlias is required
  • Import KMS Bucket adding KMS Alias that does not exist throws error An error occurred (AWSResourceNotFound) when calling IMPORT_DATASET operation: KMS key with alias=non-existing cannot be found - Please check if KMS Key Alias exists in account
  • Import KMS Bucket adding KMS Alias that exist succeeds ---> Error! An error occurred (InvalidInput): f'KmsAlias value imported-key must be Bucket imported-c-2-kms is encrypted with a KMS key different from the key with KmsAlias imported-key. Provide the correct KMS Alias as input parameter

@dlpzx
Copy link
Contributor

dlpzx commented Dec 4, 2023

Testing locally again:

  • Import SSE-S3 Bucket and adding KMS Alias throws error
  • Import SSE-S3 Bucket without adding KMS Alias succeeds
  • Import KMS Bucket and without adding KMS Alias throws error
  • Import KMS Bucket adding KMS Alias that does not exist throws error
  • Import KMS Bucket adding KMS Alias that exist succeeds
  • Dataset access button works for both datasets

@dlpzx dlpzx marked this pull request as ready for review December 4, 2023 09:50
@dlpzx
Copy link
Contributor

dlpzx commented Dec 4, 2023

Testing in AWS:

  • CICD pipeline succeeds
  • Pivot role without S3:GetBucketEncryption permissions throws error Data.all Environment Pivot Role does not have s3:GetEncryptionConfiguration Permission for xxxxx bucket: An error occurred (AccessDenied) when calling the GetBucketEncryption operation: Access Denied
  • Import SSE-S3 Bucket and adding KMS Alias throws error
  • Import SSE-S3 Bucket without adding KMS Alias succeeds
  • Import KMS Bucket and without adding KMS Alias throws error
  • Import KMS Bucket adding KMS Alias that does not exist throws error
  • Import KMS Bucket adding KMS Alias that exist succeeds
  • Dataset access button works for both datasets
  • Cross-account S3 Bucket sharing for SSE-S3 --> I created a bucket share and with the requester role I downloaded one file stored in the bucket @nikpodsh

@noah-paige
Copy link
Contributor

Tested a similar set of actions in an AWS Deployment with this PR:

  • CICD pipeline succeeds
  • Pivot role without S3:GetBucketEncryption permissions throws error Data.all Environment Pivot Role does not have s3:GetEncryptionConfiguration Permission for xxxxx bucket: An error occurred (AccessDenied) when calling the GetBucketEncryption operation: Access Denied
  • Import SSE-S3 Bucket and adding KMS Alias throws error
  • Import SSE-S3 Bucket without adding KMS Alias succeeds
  • Import KMS Bucket and without adding KMS Alias throws error
  • Import KMS Bucket adding KMS Alias that does not exist throws error
  • Import KMS Bucket adding KMS Alias but key policy restricts pivotRole permissions throws error
  • Import KMS Bucket adding KMS Alias that exist succeeds
  • Dataset access button works for both datasets

All looks good on my end

Copy link
Contributor

@noah-paige noah-paige left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm - approving

@dlpzx dlpzx merged commit e7c87df into data-dot-all:main Dec 6, 2023
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants