Skip to content

Commit

Permalink
Generic shares_base module and specific s3_datasets_shares module - p…
Browse files Browse the repository at this point in the history
…art 7 (share_object_service) (#1340)

### Feature or Bugfix
- Refactoring

### Detail
As explained in the design for #1123 and #1283 we are trying to
implement generic `datasets_base` and `shares_base` modules that can be
used by any type of datasets and by any type of shareable object in a
generic way.

The goal of this PR is to move the `share_object_service` to
`shares_base` and refactor any dependency to S3 in the service.

- Move file and fix imports of ShareObjectService
- Use DatasetsBase and DatasetsBaseRepository instead of the S3
equivalents
- ⚠️ Avoid Dashboard check logic in
`ShareObjectService.submit_share_object` see below
- ⚠️ Avoid SharePolicyService logic in
`ShareObjectService.create_share_object` see below
- Create ShareLogsService for logs
- Remove unused methods
- I also copied share_item_service to shares_base (it will be used in
next PR)


#### Avoid Dashboard check logic in
`ShareObjectService.submit_share_object`
Currently, whenever a share request is submitted, we check if the
REQUESTER environment has dashboards enabled and if there are shared
tables we verify that the Quicksight subscription is active.

Alternative: perform this check in the share processor of tables. It
solves the issue, but it gives a poorer user experience as it is
difficult to figure out for the requester why the share failed. This can
be solved holistically as requested in
#1168.

Decision: move the logic to the processor and make the table share fail.

#### Avoid SharePolicyService logic in
`ShareObjectService.create_share_object`

When a share request is first created, we perform a series of operations
to ensure that an IAM policy for the share requester principal IAM role
is created.

Alternative 1: move this logic inside the share processor. Not sure if
it is possible. It would be the ideal solution, but the
SharePolicyService throws errors in the create share object API if the
policy is not attached.

Alternative 2: implement interface to define share_policies (similar to
the dataset-actions that use share logic in
`backend/dataall/modules/s3_datasets_shares/services/dataset_sharing_service.py`).

Decision: we want to preserve the user experience of having the IAM
policy created before the share request is processed. Plus, it is not an
uncommon pattern that could get extended by other dataset types, for
example redshift sharing might need additional share policies. For this
reason this PR implements alternative 2


### Relates
- #1283 
- #1123 
- #955 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
  • Loading branch information
dlpzx authored Jun 19, 2024
1 parent 94b71c5 commit 3d2dfc7
Show file tree
Hide file tree
Showing 12 changed files with 467 additions and 145 deletions.
1 change: 0 additions & 1 deletion backend/dataall/base/aws/quicksight.py
Original file line number Diff line number Diff line change
Expand Up @@ -133,7 +133,6 @@ def check_quicksight_enterprise_subscription(AwsAccountId, region=None):

except client.exceptions.AccessDeniedException:
raise Exception('Access denied to Quicksight for selected role')
return False

@staticmethod
def create_quicksight_group(AwsAccountId, region, GroupName=DEFAULT_GROUP_NAME):
Expand Down
14 changes: 11 additions & 3 deletions backend/dataall/core/environment/services/managed_iam_policies.py
Original file line number Diff line number Diff line change
Expand Up @@ -26,21 +26,29 @@ def policy_type(self):
"""
Returns string and needs to be implemented in the ManagedPolicies inherited classes
"""
raise NotImplementedError
...

@abstractmethod
def generate_policy_name(self) -> str:
"""
Returns string and needs to be implemented in the ManagedPolicies inherited classes
"""
raise NotImplementedError
...

@abstractmethod
def generate_empty_policy(self) -> dict:
"""
Returns dict and needs to be implemented in the ManagedPolicies inherited classes
"""
raise NotImplementedError
...

@abstractmethod
def create_managed_policy_from_inline_and_delete_inline(self) -> str:
"""
Returns policy arn and needs to be implemented in the ManagedPolicies inherited classes
It is used for backwards compatibility. It should be deprecated and removed in future releases.
"""
...

def check_if_policy_exists(self) -> bool:
policy_name = self.generate_policy_name()
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -53,7 +53,7 @@ def list_all_active_share_objects(session) -> [ShareObject]: ## TODO: Already i
return session.query(ShareObject).filter(ShareObject.deleted.is_(None)).all()

@staticmethod
def find_share(session, dataset: S3Dataset, env, principal_id, group_uri) -> ShareObject:
def find_share(session, dataset: DatasetBase, env, principal_id, group_uri) -> ShareObject:
return (
session.query(ShareObject)
.filter(
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,8 @@
from typing import List
from warnings import warn
from datetime import datetime
from dataall.core.environment.services.environment_service import EnvironmentService
from dataall.base.aws.quicksight import QuicksightClient
from dataall.modules.shares_base.services.shares_enums import (
ShareItemHealthStatus,
ShareItemStatus,
Expand All @@ -13,7 +15,7 @@
from dataall.modules.shares_base.services.share_exceptions import PrincipalRoleNotFound
from dataall.modules.s3_datasets_shares.services.share_managers import LFShareManager
from dataall.modules.s3_datasets_shares.aws.ram_client import RamClient
from dataall.modules.s3_datasets_shares.services.share_object_service import ShareObjectService
from dataall.modules.shares_base.services.share_object_service import ShareObjectService
from dataall.modules.s3_datasets_shares.services.share_item_service import ShareItemService
from dataall.modules.s3_datasets_shares.db.share_object_repositories import ShareObjectRepository
from dataall.modules.shares_base.db.share_object_state_machines import ShareItemSM
Expand Down Expand Up @@ -78,6 +80,11 @@ def process_approved_shares(self) -> bool:
raise Exception(
'Source account details not initialized properly. Please check if the catalog account is properly onboarded on data.all'
)
env = EnvironmentService.get_environment_by_uri(self.session, self.share_data.share.environmentUri)
if EnvironmentService.get_boolean_env_param(self.session, env, 'dashboardsEnabled'):
QuicksightClient.check_quicksight_enterprise_subscription(
AwsAccountId=env.AwsAccountId, region=env.region
)
manager.initialize_clients()
manager.grant_pivot_role_all_database_permissions_to_source_database()
manager.check_if_exists_and_create_shared_database_in_target()
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@

from dataall.modules.shares_base.services.share_exceptions import PrincipalRoleNotFound
from dataall.modules.s3_datasets_shares.services.share_managers import S3AccessPointShareManager
from dataall.modules.s3_datasets_shares.services.share_object_service import ShareObjectService
from dataall.modules.shares_base.services.share_object_service import ShareObjectService
from dataall.modules.s3_datasets_shares.services.share_item_service import ShareItemService
from dataall.modules.shares_base.services.shares_enums import (
ShareItemHealthStatus,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@

from dataall.modules.shares_base.services.share_exceptions import PrincipalRoleNotFound
from dataall.modules.s3_datasets_shares.services.share_managers import S3BucketShareManager
from dataall.modules.s3_datasets_shares.services.share_object_service import ShareObjectService
from dataall.modules.shares_base.services.share_object_service import ShareObjectService
from dataall.modules.shares_base.services.shares_enums import (
ShareItemHealthStatus,
ShareItemStatus,
Expand Down
8 changes: 4 additions & 4 deletions backend/dataall/modules/shares_base/api/resolvers.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,8 +8,8 @@
from dataall.modules.shares_base.services.shares_enums import ShareObjectPermission
from dataall.modules.shares_base.db.share_object_models import ShareObjectItem, ShareObject
from dataall.modules.s3_datasets_shares.services.share_item_service import ShareItemService
from dataall.modules.s3_datasets_shares.services.share_object_service import ShareObjectService
from dataall.modules.s3_datasets_shares.aws.glue_client import GlueClient
from dataall.modules.shares_base.services.share_object_service import ShareObjectService
from dataall.modules.shares_base.services.share_logs_service import ShareLogsService
from dataall.modules.s3_datasets.db.dataset_repositories import DatasetRepository
from dataall.modules.s3_datasets.db.dataset_models import DatasetStorageLocation, DatasetTable, S3Dataset

Expand Down Expand Up @@ -132,7 +132,7 @@ def get_share_object(context, source, shareUri: str = None):


def get_share_logs(context, source, shareUri: str):
return ShareObjectService.get_share_logs(shareUri)
return ShareLogsService.get_share_logs(shareUri)


def resolve_user_role(context: Context, source: ShareObject, **kwargs):
Expand Down Expand Up @@ -168,7 +168,7 @@ def resolve_user_role(context: Context, source: ShareObject, **kwargs):


def resolve_can_view_logs(context: Context, source: ShareObject):
return ShareObjectService.check_view_log_permissions(context.username, context.groups, source.shareUri)
return ShareLogsService.check_view_log_permissions(context.username, context.groups, source.shareUri)


def resolve_dataset(context: Context, source: ShareObject, **kwargs):
Expand Down
Loading

0 comments on commit 3d2dfc7

Please sign in to comment.