Update modularization work with the latest main #661

Merged
54 commits
171a75c
Bump webpack from 5.75.0 to 5.76.1 in /frontend (#371)
dependabot[bot] Mar 15, 2023
4002963
Upgrade sqlalchemy 13.16 -> 1.3.24 and starlette 0.19.1 -> 0.25.0, ar…
dlpzx Mar 20, 2023
79f0e4c
Add dependency in dataset stack (#385)
dlpzx Mar 27, 2023
95c8619
feat: generate url with dynamically domain name for quicksight embede…
wolanlu Mar 28, 2023
f677371
fix: dev docker images base (#387)
AmrSaber Mar 28, 2023
e01e014
Added missing groupUri from get credentials (#391)
dlpzx Mar 30, 2023
9057116
388 race condition occurs when adding folder to shared items in share…
dlpzx Mar 30, 2023
d191b26
hotfix: Revert PR on custom url quicksight embedding sessions (#403)
dlpzx Apr 4, 2023
a17f12a
401 shared dbs worksheet list (#402)
noah-paige Apr 4, 2023
6460986
Fix sharing update (#404)
dlpzx Apr 4, 2023
219553f
V1.5.0 Features (#409)
dlpzx Apr 25, 2023
7ad099f
Bump flask from 2.0.3 to 2.3.2 in /backend (#439)
dependabot[bot] May 2, 2023
e4b3e73
Bump flask from 2.0.3 to 2.3.2 in /backend/dataall/cdkproxy (#438)
dependabot[bot] May 2, 2023
2a319ca
solve deployment bug #433 CloudFront logs does not enable ACL access …
akaitoua May 5, 2023
13a2fc0
Modify docker-compose yaml to read region and default region from env…
dlpzx May 10, 2023
e9ebb08
Bump pymdown-extensions from 8.1.1 to 10.0 in /documentation/userguid…
dependabot[bot] May 16, 2023
4ad8ce7
Bump starlette from 0.25.0 to 0.27.0 and upgrade fastapi (#460)
dlpzx May 17, 2023
ee4f34c
Fixes issue with existing cognito callbacks (#464)
gmuslia May 17, 2023
3b85ad2
Fix lambda/ECS IAM permissions for AOSS (#467)
kukushking May 22, 2023
3097a3a
465 - Update Aurora default Parameter Group to 'default.aurora-postgr…
rbernotas May 22, 2023
fd7da75
Bump requests from 2.27.1 to 2.31.0 in /backend (#469)
dependabot[bot] May 23, 2023
65ea17b
Bump requests from 2.27.1 to 2.31.0 in /backend/dataall/cdkproxy (#470)
dependabot[bot] May 23, 2023
b59cf9e
hotfix: Remove GitHub template option from data.all Pipelines (#472)
dlpzx May 23, 2023
3340610
fix: Upgrade aurora engine version to 11.16 (#471)
kimengu-david May 24, 2023
b441258
Updated CDK Version to fix issue with cdkproxy/ dataset stack creatio…
gmuslia May 24, 2023
d0ea832
update auth-at-edge semantic version to latest 2.1.5 (#480)
dlpzx May 25, 2023
e9c64d7
fix: Fix typo that destroys storage locations (#481)
dlpzx May 26, 2023
9fc84bf
Update CDK Version to v2.77.0 to fix issue with CDK Pipeline role (#484)
gmuslia May 30, 2023
fa45abd
fix: safe removal of consumption roles with open share requests (#485)
dlpzx Jun 1, 2023
6880237
fix: dynamic sql generation (#514)
chamcca Jun 13, 2023
9efb234
dependabot - upgradefast-xml-parser, aws-amplify, react-scripts, over…
dlpzx Jun 16, 2023
aa9d3df
dependabot: resolve nth-check in sub-dependencies (#525)
dlpzx Jun 19, 2023
dfbee81
Update import dataset documentation (#546)
marjet26 Jul 3, 2023
a3a6bde
feat: Limiting read-only access to root file systems in ECS (#523)
dbalintx Jul 4, 2023
261f086
Bump tough-cookie from 4.1.2 to 4.1.3 in /frontend (#558)
dependabot[bot] Jul 10, 2023
558e2bc
Resolve dataset share checks when deleting dataset (#554)
noah-paige Jul 10, 2023
49ffa72
optimized docker image size (#549)
srinivasreddych Jul 11, 2023
4c594c2
Added ec2:DescribePrefix permissions to CDKSynth (#566)
dlpzx Jul 12, 2023
45c5cfb
Bump semver from 5.7.1 to 5.7.2 in /frontend (#564)
dependabot[bot] Jul 12, 2023
84c555e
V1.6.0 features (#565)
dlpzx Jul 19, 2023
f3baf14
Fix wrong update of externalId for pivotRole (#591)
dlpzx Jul 25, 2023
476ecea
Fix cloudfront stack in case custom domain is given (#607)
dbalintx Aug 2, 2023
7597a92
first commit
dlpzx Aug 2, 2023
9bf015f
first commit
dlpzx Aug 2, 2023
9fb4f70
Add missing KMS keys for canaries (#619)
dlpzx Aug 3, 2023
c678e67
Allow restricted nacls backend VPC (#626)
noah-paige Aug 4, 2023
8900ebf
resolve unnecessary dependency in git_release role (#623)
dlpzx Aug 8, 2023
f0a932f
get prefix list ids for dbmigration for infra region (#624)
dlpzx Aug 8, 2023
f235c19
Handle External ID SSM v1.6.1> (#630)
noah-paige Aug 8, 2023
63137ac
Refine CDK Custom Exec Policy - Linking Envs (#648)
noah-paige Aug 10, 2023
a39fd43
Resolve Dataset Profiling Glue Job (#649)
noah-paige Aug 10, 2023
c189de4
Fix migration script for v1.2 upgrade (#651)
dlpzx Aug 14, 2023
165664b
Merge remote-tracking branch 'upstream/main' into merge-main-into-mod…
nikpodsh Aug 16, 2023
318c817
Manual fixing of merge conflicts
nikpodsh Aug 16, 2023
@@ -49,6 +49,13 @@ def on_create(event):
     except ClientError as e:
         pass
 
+    default_db_exists = False
+    try:
+        glue_client.get_database(Name="default")
+        default_db_exists = True
+    except ClientError as e:
+        pass
+
     if not exists:
         try:
             db_input = props.get('DatabaseInput').copy()
@@ -63,7 +70,7 @@ def on_create(event):
             raise Exception(f"Could not create Glue Database {props['DatabaseInput']['Name']} in aws://{AWS_ACCOUNT}/{AWS_REGION}, received {str(e)}")
 
     Entries = []
-    for i, role_arn in enumerate(props.get('DatabaseAdministrators')):
+    for i, role_arn in enumerate(props.get('DatabaseAdministrators', [])):
         Entries.append(
             {
                 'Id': str(uuid.uuid4()),
@@ -103,6 +110,20 @@ def on_create(event):
                 'PermissionsWithGrantOption': ['SELECT', 'ALTER', 'DESCRIBE'],
             }
         )
+        if default_db_exists:
+            Entries.append(
+                {
+                    'Id': str(uuid.uuid4()),
+                    'Principal': {'DataLakePrincipalIdentifier': role_arn},
+                    'Resource': {
+                        'Database': {
+                            'Name': 'default'
+                        }
+                    },
+                    'Permissions': ['Describe'.upper()],
+                }
+            )
+
     lf_client.batch_grant_permissions(CatalogId=props['CatalogId'], Entries=Entries)
     physical_id = props['DatabaseInput']['Imported'] + props['DatabaseInput']['Name']
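The grant logic in these hunks can be sketched in isolation. This is a minimal sketch, not the handler itself: `build_default_db_entries` is a hypothetical helper and the role ARN is a placeholder. It shows how defaulting `DatabaseAdministrators` to an empty list keeps the loop safe when the property is absent, and how a `DESCRIBE` entry on the `default` database is added only when that database exists.

```python
import uuid


def build_default_db_entries(props, default_db_exists):
    """Build Lake Formation grant entries for the `default` database.

    Hypothetical helper for illustration only; the real handler also
    builds entries for the dataset database itself.
    """
    entries = []
    # .get(..., []) avoids iterating over None when the property is missing
    for role_arn in props.get('DatabaseAdministrators', []):
        if default_db_exists:
            entries.append(
                {
                    'Id': str(uuid.uuid4()),
                    'Principal': {'DataLakePrincipalIdentifier': role_arn},
                    'Resource': {'Database': {'Name': 'default'}},
                    'Permissions': ['DESCRIBE'],
                }
            )
    return entries


if __name__ == '__main__':
    # Placeholder ARN, not taken from the PR
    props = {'DatabaseAdministrators': ['arn:aws:iam::111122223333:role/example-admin']}
    print(build_default_db_entries(props, default_db_exists=True))
    print(build_default_db_entries({}, default_db_exists=True))
```

In the handler, a list built this way would be passed as `Entries` to `lf_client.batch_grant_permissions`, as the last hunk shows.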
@@ -1,4 +1,5 @@
 import json
+import os
 import logging
 import pprint
 import sys
@@ -8,7 +9,6 @@
 from awsglue.transforms import *
 from awsglue.utils import getResolvedOptions
 from pyspark.context import SparkContext
-from pydeequ.profiles import *
 
 sc = SparkContext.getOrCreate()
 sc._jsc.hadoopConfiguration().set('fs.s3.canned.acl', 'BucketOwnerFullControl')
@@ -32,6 +32,7 @@
     'environmentBucket',
     'dataallRegion',
     'table',
+    "SPARK_VERSION"
 ]
 try:
     args = getResolvedOptions(sys.argv, list_args)
@@ -43,6 +44,10 @@
     list_args.remove('table')
     args = getResolvedOptions(sys.argv, list_args)
 
+os.environ["SPARK_VERSION"] = args.get("SPARK_VERSION", "3.1")
+
+from pydeequ.profiles import *
+
 logger.info('Parsed Retrieved parameters')
 
 logger.info('Parsed Args = %s', pprint.pformat(args))
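The reordering in this file suggests that `pydeequ` needs `SPARK_VERSION` in the environment at import time, so the variable has to be set before the import runs. Below is a minimal sketch of that ordering; `resolve_spark_version` and `configure_spark_version` are hypothetical helpers, and the `pydeequ` import is left as a comment so the block runs without Spark installed.

```python
import os


def resolve_spark_version(args, default='3.1'):
    """Return the Spark version from parsed job arguments, or a default."""
    return args.get('SPARK_VERSION', default)


def configure_spark_version(args):
    """Export SPARK_VERSION before any import that reads it at import time."""
    version = resolve_spark_version(args)
    os.environ['SPARK_VERSION'] = version
    return version


if __name__ == '__main__':
    # `args` stands in for the dict returned by getResolvedOptions
    configure_spark_version({'SPARK_VERSION': '3.1'})
    # Only after this point would the deferred import be safe, for example:
    # from pydeequ.profiles import *
    print(os.environ['SPARK_VERSION'])
```

Keeping the default in one place makes it easier to keep in step with the `--SPARK_VERSION` job argument set in `dataset_stack.py`.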
10 changes: 7 additions & 3 deletions backend/dataall/modules/datasets/cdk/dataset_stack.py
@@ -300,24 +300,26 @@ def __init__(self, scope, id, target_uri: str = None, **kwargs):
                     ]
                 ),
                 iam.PolicyStatement(
-                    sid="CreateLoggingGlueCrawler",
+                    sid="CreateLoggingGlue",
                     actions=[
                         'logs:CreateLogGroup',
                         'logs:CreateLogStream',
                     ],
                     effect=iam.Effect.ALLOW,
                     resources=[
                         f'arn:aws:logs:{dataset.region}:{dataset.AwsAccountId}:log-group:/aws-glue/crawlers*',
+                        f'arn:aws:logs:{dataset.region}:{dataset.AwsAccountId}:log-group:/aws-glue/jobs/*',
                     ],
                 ),
                 iam.PolicyStatement(
-                    sid="LoggingGlueCrawler",
+                    sid="LoggingGlue",
                     actions=[
                         'logs:PutLogEvents',
                     ],
                     effect=iam.Effect.ALLOW,
                     resources=[
                         f'arn:aws:logs:{dataset.region}:{dataset.AwsAccountId}:log-group:/aws-glue/crawlers:log-stream:{dataset.GlueCrawlerName}',
+                        f'arn:aws:logs:{dataset.region}:{dataset.AwsAccountId}:log-group:/aws-glue/jobs/*',
                     ],
                 ),
                 iam.PolicyStatement(
@@ -443,7 +445,8 @@ def __init__(self, scope, id, target_uri: str = None, **kwargs):
                     'CreateTableDefaultPermissions': [],
                     'Imported': 'IMPORTED-' if dataset.imported else 'CREATED-'
                 },
-                'DatabaseAdministrators': dataset_admins
+                'DatabaseAdministrators': dataset_admins,
+                'TriggerUpdate': True
             },
         )

@@ -484,6 +487,7 @@ def __init__(self, scope, id, target_uri: str = None, **kwargs):
             '--enable-metrics': 'true',
             '--enable-continuous-cloudwatch-log': 'true',
             '--enable-glue-datacatalog': 'true',
+            '--SPARK_VERSION': '3.1',
         }

         job = glue.CfnJob(
@@ -7,21 +7,61 @@
"""
from alembic import op
import sqlalchemy as sa
from sqlalchemy import orm, Column, String
from sqlalchemy.dialects import postgresql
from sqlalchemy.ext.declarative import declarative_base

# revision identifiers, used by Alembic.
revision = 'b1cdc0dc987a'
down_revision = '4392a0c9747f'
branch_labels = None
depends_on = None

Base = declarative_base()


class DataPipeline(Base):
__tablename__ = 'datapipeline'
DataPipelineUri = Column(
String, nullable=False, primary_key=True
)
devStrategy = Column(String, nullable=True)
devStages = Column(postgresql.ARRAY(String), nullable=True)


def upgrade():
# ### commands auto generated by Alembic - please adjust! ###
# Modify column types
print("Upgrade devStages and devStrategy column types. Updating nullable to True...")
op.add_column(
'datapipeline',
sa.Column('template', sa.String(), nullable=True)
)
op.alter_column(
'datapipeline',
'devStages',
existing_type=postgresql.ARRAY(sa.VARCHAR()),
nullable=True
)
op.alter_column(
'datapipeline',
'devStrategy',
existing_type=sa.VARCHAR(),
nullable=True
)
print("Backfilling values for devStages and devStrategy...")
# Backfill values
bind = op.get_bind()
session = orm.Session(bind=bind)
session.query(DataPipeline).filter(DataPipeline.devStrategy is None).update(
{DataPipeline.devStrategy: 'gitflowBlueprint'}, synchronize_session=False)

session.query(DataPipeline).filter(DataPipeline.devStages is None).update(
{DataPipeline.devStages: ['dev', 'test', 'prod']}, synchronize_session=False)
session.commit()

print("Backfilling values for devStages and devStrategy is done. Updating nullable to False...")
# Force nullable = False
op.alter_column(
'datapipeline',
'devStages',
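One detail in the backfill is worth checking. `DataPipeline.devStrategy is None` is evaluated by Python itself, because `is` cannot be overloaded, so it yields plain `False` rather than a SQL `IS NULL` expression. SQLAlchemy's `.is_(None)` (or `== None`) is the form that builds the SQL comparison. The sketch below illustrates the difference with toy stand-in classes; `ToyColumn` and `ToyExpression` are hypothetical and are not SQLAlchemy types.

```python
class ToyExpression:
    """Hypothetical stand-in for a SQL expression object."""

    def __init__(self, sql):
        self.sql = sql


class ToyColumn:
    """Hypothetical stand-in for an ORM column that overloads comparisons."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # `==` can be overloaded, so it can return an expression object
        if other is None:
            return ToyExpression(f'{self.name} IS NULL')
        return ToyExpression(f'{self.name} = {other!r}')

    def is_(self, other):
        # Explicit method, mirroring the idea behind SQLAlchemy's `.is_()`
        if other is None:
            return ToyExpression(f'{self.name} IS NULL')
        return ToyExpression(f'{self.name} IS {other!r}')


dev_strategy = ToyColumn('devStrategy')

# Evaluated immediately by Python: the column object is not None
identity_check = dev_strategy is None

# Builds an expression object that a query layer could render as SQL
null_filter = dev_strategy.is_(None)

if __name__ == '__main__':
    print(identity_check)
    print(null_filter.sql)
```

If that reading is right, the two `filter(...)` calls in the migration would match no rows, and switching them to `.is_(None)` would let the backfill run before the columns are made non-nullable.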
37 changes: 17 additions & 20 deletions deploy/cdk_exec_policy/cdkExecPolicy.yaml
@@ -1,9 +1,6 @@
 AWSTemplateFormatVersion: 2010-09-09
 Description: Custom least privilege IAM policy for linking environments to dataall
 Parameters:
-  AwsAccountId:
-    Description: AWS AccountId of the account that we wish to link.
-    Type: String
   PolicyName:
     Description: IAM policy name (The same name must be used during CDK bootstrapping. Default is DataAllCustomCDKPolicy.)
     Type: String
@@ -48,14 +45,14 @@ Resources:
             Effect: Allow
             Action: 'athena:CreateWorkGroup'
             Resource:
-              - !Sub 'arn:aws:athena:*:${AWS::AccountId}:workgroup/*'
+              - !Sub 'arn:${AWS::Partition}:athena:*:${AWS::AccountId}:workgroup/*'
           - Sid: IAM
             Action:
               - 'iam:CreatePolicy'
               - 'iam:GetPolicy'
             Effect: Allow
             Resource:
-              - !Sub 'arn:aws:iam::${AWS::AccountId}:policy/*'
+              - !Sub 'arn:${AWS::Partition}:iam::${AWS::AccountId}:policy/*'
           - Sid: IAMRole
             Action:
               - 'iam:AttachRolePolicy'
@@ -82,7 +79,7 @@ Resources:
               - 'iam:CreatePolicyVersion'
               - 'iam:DeletePolicyVersion'
             Resource:
-              - !Sub 'arn:aws:iam::${AWS::AccountId}:policy/service-role/AWSQuickSight*'
+              - !Sub 'arn:${AWS::Partition}:iam::${AWS::AccountId}:policy/service-role/AWSQuickSight*'
           - Sid: QuickSight
             Effect: Allow
             Action:
@@ -114,14 +111,14 @@ Resources:
               - 'kms:CreateAlias'
             Effect: Allow
             Resource:
-              - !Sub 'arn:aws:kms:*:${AWS::AccountId}:alias/*'
+              - !Sub 'arn:${AWS::Partition}:kms:*:${AWS::AccountId}:alias/*'
           - Sid: KMSKey
             Action:
               - 's3:PutBucketAcl'
               - 's3:PutBucketNotification'
             Effect: Allow
             Resource:
-              - !Sub 'arn:aws:s3:::${EnvironmentResourcePrefix}-logging-*'
+              - !Sub 'arn:${AWS::Partition}:s3:::${EnvironmentResourcePrefix}-logging-*'
           - Sid: ReadBuckets
             Action:
               - 'kms:CreateAlias'
@@ -136,7 +133,7 @@ Resources:
               - 'kms:PutKeyPolicy'
               - 'kms:TagResource'
             Effect: Allow
-            Resource: !Sub 'arn:aws:kms:*:${AWS::AccountId}:key/*'
+            Resource: !Sub 'arn:${AWS::Partition}:kms:*:${AWS::AccountId}:key/*'
           - Sid: Lambda
             Action:
               - 'lambda:AddPermission'
@@ -154,7 +151,7 @@ Resources:
             Action:
               - 'lambda:PublishLayerVersion'
             Resource:
-              - !Sub 'arn:aws:lambda:*:${AWS::AccountId}:layer:*'
+              - !Sub 'arn:${AWS::Partition}:lambda:*:${AWS::AccountId}:layer:*'
           - Sid: S3
             Action:
               - 's3:CreateBucket'
@@ -170,13 +167,13 @@ Resources:
               - 's3:DeleteBucketPolicy'
               - 's3:DeleteBucket'
             Effect: Allow
-            Resource: 'arn:aws:s3:::*'
+            Resource: !Sub 'arn:${AWS::Partition}:s3:::*'
           - Sid: SQS
             Effect: Allow
             Action:
               - 'sqs:CreateQueue'
               - 'sqs:SetQueueAttributes'
-            Resource: !Sub 'arn:aws:sqs:*:${AWS::AccountId}:*'
+            Resource: !Sub 'arn:${AWS::Partition}:sqs:*:${AWS::AccountId}:*'
           - Sid: SSM
             Effect: Allow
             Action:
@@ -190,18 +187,18 @@ Resources:
               - 'logs:CreateLogStream'
               - 'logs:PutLogEvents'
               - 'logs:DescribeLogStreams'
-            Resource: 'arn:aws:logs:*:*:*'
+            Resource: !Sub 'arn:${AWS::Partition}:logs:*:*:*'
           - Sid: STS
             Effect: Allow
             Action:
               - 'sts:AssumeRole'
               - 'iam:*Role*'
-            Resource: !Sub 'arn:aws:iam::${AWS::AccountId}:role/cdk-*'
+            Resource: !Sub 'arn:${AWS::Partition}:iam::${AWS::AccountId}:role/cdk-*'
           - Sid: CloudFormation
             Effect: Allow
             Action:
               - 'cloudformation:*'
-            Resource: !Sub 'arn:aws:cloudformation:*:${AWS::AccountId}:stack/CDKToolkit/*'
+            Resource: !Sub 'arn:${AWS::Partition}:cloudformation:*:${AWS::AccountId}:stack/CDKToolkit/*'
           - Sid: ECR
             Effect: Allow
             Action:
@@ -211,14 +208,14 @@ Resources:
               - 'ecr:DescribeRepositories'
               - 'ecr:CreateRepository'
               - 'ecr:DeleteRepository'
-            Resource: !Sub 'arn:aws:ecr:*:${AWS::AccountId}:repository/cdk-*'
+            Resource: !Sub 'arn:${AWS::Partition}:ecr:*:${AWS::AccountId}:repository/cdk-*'
           - Sid: SSMTwo
             Effect: Allow
             Action:
               - 'ssm:GetParameter'
               - 'ssm:PutParameter'
               - 'ssm:DeleteParameter'
-            Resource: !Sub 'arn:aws:ssm:*:${AWS::AccountId}:parameter/cdk-bootstrap/*'
+            Resource: !Sub 'arn:${AWS::Partition}:ssm:*:${AWS::AccountId}:parameter/cdk-bootstrap/*'
           - Sid: CloudformationTwo
             Effect: Allow
             Action:
@@ -232,7 +229,7 @@ Resources:
             Action:
               - 's3:*'
             Resource:
-              - !Sub 'arn:aws:s3:::cdktoolkit-stagingbucket-*'
+              - !Sub 'arn:${AWS::Partition}:s3:::cdk-hnb659fds-assets-${AWS::AccountId}-${AWS::Region}*'
           - Sid: Pipelines
             Effect: Allow
             Action:
@@ -261,15 +258,15 @@ Resources:
               - 's3:ListBucket'
               - 's3:GetBucketPolicy'
             Resource:
-              - 'arn:aws:s3::*:codepipeline-*'
+              - !Sub 'arn:${AWS::Partition}:s3::*:codepipeline-*'
           - Sid: CodeStarNotificationsReadOnly
             Effect: Allow
             Action:
               - 'codestar-notifications:DescribeNotificationRule'
             Resource: '*'
             Condition:
               'StringLike':
-                'codestar-notifications:NotificationsForResource': 'arn:aws:codepipeline:*'
+                'codestar-notifications:NotificationsForResource': !Sub 'arn:${AWS::Partition}:codepipeline:*'
           - Sid: Eventrules
             Effect: Allow
             Action:
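The `${AWS::Partition}` substitutions let the same template resolve ARNs in partitions other than `aws`. As a rough sketch of the same idea outside CloudFormation, the helpers below assume the commonly documented region prefixes (`cn-` for `aws-cn`, `us-gov-` for `aws-us-gov`). `partition_for_region` and `build_arn` are hypothetical names, and the account ID is a placeholder.

```python
def partition_for_region(region):
    """Map a region name to its AWS partition (assumed prefix rules)."""
    if region.startswith('cn-'):
        return 'aws-cn'
    if region.startswith('us-gov-'):
        return 'aws-us-gov'
    return 'aws'


def build_arn(partition, service, region, account_id, resource):
    """Assemble arn:partition:service:region:account-id:resource."""
    return f'arn:{partition}:{service}:{region}:{account_id}:{resource}'


if __name__ == '__main__':
    # Mirrors the KMS key statement: any region, one account, all keys
    partition = partition_for_region('cn-north-1')
    print(build_arn(partition, 'kms', '*', '111122223333', 'key/*'))
```

A hard-coded `arn:aws:` prefix would not match resources in the other partitions, which is presumably why every ARN in the policy was switched over.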
2 changes: 2 additions & 0 deletions deploy/stacks/backend_stack.py
@@ -35,6 +35,7 @@ def __init__(
         image_tag=None,
         pipeline_bucket=None,
         vpc_id=None,
+        vpc_restricted_nacls=False,
         vpc_endpoints_sg=None,
         internet_facing=True,
         custom_domain=None,
@@ -64,6 +65,7 @@ def __init__(
             resource_prefix=resource_prefix,
             vpc_endpoints_sg=vpc_endpoints_sg,
             vpc_id=vpc_id,
+            restricted_nacl=vpc_restricted_nacls,
             **kwargs,
         )
         vpc = self.vpc_stack.vpc
2 changes: 2 additions & 0 deletions deploy/stacks/backend_stage.py
@@ -17,6 +17,7 @@ def __init__(
         tooling_account_id=None,
         pipeline_bucket=None,
         vpc_id=None,
+        vpc_restricted_nacls=False,
         vpc_endpoints_sg=None,
         internet_facing=True,
         custom_domain=None,
@@ -45,6 +46,7 @@ def __init__(
             pipeline_bucket=pipeline_bucket,
             image_tag=commit_id,
             vpc_id=vpc_id,
+            vpc_restricted_nacls=vpc_restricted_nacls,
             vpc_endpoints_sg=vpc_endpoints_sg,
             internet_facing=internet_facing,
             custom_domain=custom_domain,