feat(datahub cli): DataHub CLI Quickstart #2689

Merged on Jun 15, 2021 (59 commits)
Commits
eebc7f4
my new file
jjoyce0510 Apr 27, 2021
407b54f
Adding unified docker-compose quickstart file (cli)
jjoyce0510 Apr 27, 2021
8af14be
Naming and comments
jjoyce0510 Apr 27, 2021
bd87d2f
Pinning versions
jjoyce0510 Apr 27, 2021
9bfc689
start `datahub docker` and subcommands
hsheth2 Apr 27, 2021
c993034
Merge branch 'master' into DataHubCliQuickstart
hsheth2 May 7, 2021
2d5d026
Merge branch 'quickstart-cli' into DataHubCliQuickstart
hsheth2 May 7, 2021
64b23ce
drop build blocks
hsheth2 May 7, 2021
a75f4f3
add local + pull methods to docker quickstart
hsheth2 May 7, 2021
760886d
add better print messages
hsheth2 May 7, 2021
64786c8
Merge branch 'master' into DataHubCliQuickstart
hsheth2 May 7, 2021
2535a59
fix smoke test
hsheth2 May 7, 2021
32ed320
mem limits on some containers
hsheth2 May 7, 2021
f0ec428
more limits
hsheth2 May 8, 2021
5209483
add ingest sample data command
hsheth2 May 8, 2021
671fe76
bump gms mem
hsheth2 May 8, 2021
48b36ee
Reducing memory
May 8, 2021
f89f8bc
Merge branch 'DataHubCliQuickstart' of https://github.com/acryldata/d…
May 8, 2021
4188353
Merge branch 'master' into DataHubCliQuickstart
hsheth2 May 18, 2021
4d71943
update path option for quickstart
hsheth2 May 18, 2021
4daa1bb
Merge branch 'master' into DataHubCliQuickstart
hsheth2 Jun 11, 2021
879ef1a
smoke test
hsheth2 Jun 11, 2021
487ebf0
Adding generate_and_compare.sh to CI pipeline
jjoyce0510 Jun 11, 2021
8122db7
Merge branch 'DataHubCliQuickstart' of https://github.com/acryldata/d…
jjoyce0510 Jun 11, 2021
c599db0
udpate path accordingly
hsheth2 Jun 11, 2021
4a74359
remove retry logic from smoketest e2e
hsheth2 Jun 11, 2021
b6d119d
Added nuke
Jun 11, 2021
ecc2de2
Merge branch 'DataHubCliQuickstart' of https://github.com/acryldata/d…
Jun 11, 2021
62e6fda
add DOCKER_BUILDKIT
hsheth2 Jun 11, 2021
0ec41e4
remove unused vars
hsheth2 Jun 11, 2021
e71658d
Fixed network nuke
Jun 11, 2021
ee9de12
Merge branch 'DataHubCliQuickstart' of https://github.com/acryldata/d…
Jun 11, 2021
4aba3ee
revert setup.py
Jun 11, 2021
78e7f70
Adding Quickstart Guide
jjoyce0510 Jun 11, 2021
e6cd68f
Merge branch 'DataHubCliQuickstart' of github.com:acryldata/datahub-f…
hsheth2 Jun 11, 2021
7d31d56
Merge branch 'master' into DataHubCliQuickstart
hsheth2 Jun 11, 2021
28f9a0c
cli doc improvements
hsheth2 Jun 11, 2021
19ff935
Update to head and add version param
Jun 11, 2021
23e9a93
Merge branch 'DataHubCliQuickstart' of https://github.com/acryldata/d…
Jun 11, 2021
c1fbca4
rename --dev to --build-locally
hsheth2 Jun 11, 2021
79b307e
Merge branch 'DataHubCliQuickstart' of github.com:acryldata/datahub-f…
hsheth2 Jun 11, 2021
dc5b65d
Adding mysql setup
jjoyce0510 Jun 11, 2021
6a8dd60
Merge branch 'DataHubCliQuickstart' of https://github.com/acryldata/d…
jjoyce0510 Jun 11, 2021
a9b5858
Adding correct container
jjoyce0510 Jun 11, 2021
3823636
Quickstart docker compose
jjoyce0510 Jun 11, 2021
8542438
Fixing Python gen script
jjoyce0510 Jun 12, 2021
80e1584
Added
Jun 12, 2021
5379628
Merge branch 'DataHubCliQuickstart' of https://github.com/acryldata/d…
Jun 12, 2021
cdfe283
better code style in generation script
hsheth2 Jun 12, 2021
d594751
Merge branch 'DataHubCliQuickstart' of github.com:acryldata/datahub-f…
hsheth2 Jun 12, 2021
3e47fac
Updating generated docker compose
jjoyce0510 Jun 12, 2021
89d96fd
Merge branch 'DataHubCliQuickstart' of https://github.com/acryldata/d…
jjoyce0510 Jun 12, 2021
f36c4f1
check mysql-setup if present
hsheth2 Jun 12, 2021
610700e
Merge branch 'DataHubCliQuickstart' of github.com:acryldata/datahub-f…
hsheth2 Jun 12, 2021
1f59fa7
Merge remote-tracking branch 'acryl/DataHubCliQuickstart' into CliQui…
jjoyce0510 Jun 14, 2021
66dbc5f
Removing unnecessary file
jjoyce0510 Jun 14, 2021
aa9998c
Merge remote-tracking branch 'acryl/DataHubCliQuickstart' into CliQui…
jjoyce0510 Jun 14, 2021
c6309d3
Setup.cfg update
jjoyce0510 Jun 14, 2021
d05d294
Merge remote-tracking branch 'acryl/DataHubCliQuickstart' into CliQui…
jjoyce0510 Jun 14, 2021
15 changes: 11 additions & 4 deletions .github/workflows/build-and-test.yml
@@ -65,14 +65,21 @@ jobs:
       - name: Gradle build
         run: ./gradlew build -x check -x docs-website:build
       - name: Smoke test
-        run: |
-          ./docker/dev.sh -d
-          sleep 30
-          ./smoke-test/smoke.sh
+        run: ./smoke-test/smoke.sh
       - name: Slack failure notification
         if: failure() && github.event_name == 'push'
         uses: kpritam/slack-job-status-action@v1
         with:
           job-status: ${{ job.status }}
           slack-bot-token: ${{ secrets.SLACK_BOT_TOKEN }}
           channel: github-activities
+
+  quickstart-compose-validation:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v2
+      - uses: actions/setup-python@v2
+        with:
+          python-version: "3.6"
+      - name: Quickstart Compose Validation
+        run: ./docker/quickstart/generate_and_compare.sh

Review thread on the removed `run: |` line:

Collaborator (Author): @hsheth2 was this intentional?

Collaborator: Yep - smoke.sh now runs the up command within the script
11 changes: 11 additions & 0 deletions docker/docker-compose.override.yml
@@ -14,6 +14,17 @@ services:
       - ./mysql/init.sql:/docker-entrypoint-initdb.d/init.sql
       - mysqldata:/var/lib/mysql

+  mysql-setup:
+    build:
+      context: ../
+      dockerfile: docker/mysql-setup/Dockerfile
+    image: acryldata/datahub-mysql-setup:head
+    env_file: mysql-setup/env/docker.env
+    hostname: mysql-setup
+    container_name: mysql-setup
+    depends_on:
+      - mysql
+
   datahub-gms:
     env_file: datahub-gms/env/docker.env
     depends_on:
2 changes: 1 addition & 1 deletion docker/elasticsearch/env/docker.env
@@ -1,3 +1,3 @@
 discovery.type=single-node
 xpack.security.enabled=false
-ES_JAVA_OPTS=-Xms1g -Xmx1g
+ES_JAVA_OPTS=-Xms512m -Xmx512m
5 changes: 5 additions & 0 deletions docker/mysql-setup/env/docker.env
@@ -0,0 +1,5 @@
MYSQL_HOST=mysql
MYSQL_PORT=3306
MYSQL_USERNAME=datahub
MYSQL_PASSWORD=datahub
DATAHUB_DB_NAME=datahub
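
The five variables in this env file reappear inline under the `environment` list of the `mysql-setup` service in `docker/quickstart/docker-compose.quickstart.yml`. A minimal sketch of how an env file could be flattened into such a list follows. `env_file_to_environment` is a hypothetical helper written for illustration; it is not taken from the PR's `generate_docker_quickstart.py`, whose source is not shown on this page.

```python
from typing import List


def env_file_to_environment(text: str) -> List[str]:
    """Convert env-file text into a Compose-style list of KEY=VALUE strings.

    Hypothetical helper: blank lines and '#' comment lines are skipped,
    every other line is kept verbatim (after trimming whitespace).
    """
    entries = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        entries.append(line)
    return entries


# Contents of docker/mysql-setup/env/docker.env as added in this PR.
docker_env = """\
MYSQL_HOST=mysql
MYSQL_PORT=3306
MYSQL_USERNAME=datahub
MYSQL_PASSWORD=datahub
DATAHUB_DB_NAME=datahub
"""

environment = env_file_to_environment(docker_env)
for entry in environment:
    # Print in the same "- KEY=VALUE" shape used in the quickstart file.
    print(f"- {entry}")
```

Inlining the values this way would let a single compose file run without the `env/` directories next to it, which may be why the quickstart file carries `environment` lists instead of `env_file` references.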
208 changes: 208 additions & 0 deletions docker/quickstart/docker-compose.quickstart.yml
@@ -0,0 +1,208 @@
networks:
  default:
    name: datahub_network
services:
  broker:
    container_name: broker
    depends_on:
    - zookeeper
    environment:
    - KAFKA_BROKER_ID=1
    - KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181
    - KAFKA_LISTENER_SECURITY_PROTOCOL_MAP=PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
    - KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092
    - KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1
    - KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS=0
    hostname: broker
    image: confluentinc/cp-kafka:5.4.0
    ports:
    - 29092:29092
    - 9092:9092
  datahub-frontend-react:
    container_name: datahub-frontend-react
    depends_on:
    - datahub-gms
    environment:
    - DATAHUB_GMS_HOST=datahub-gms
    - DATAHUB_GMS_PORT=8080
    - DATAHUB_SECRET=YouKnowNothing
    - DATAHUB_APP_VERSION=1.0
    - DATAHUB_PLAY_MEM_BUFFER_SIZE=10MB
    - KAFKA_BOOTSTRAP_SERVER=broker:29092
    - DATAHUB_TRACKING_TOPIC=DataHubUsageEvent_v1
    - ELASTIC_CLIENT_HOST=elasticsearch
    - ELASTIC_CLIENT_PORT=9200
    hostname: datahub-frontend-react
    image: linkedin/datahub-frontend-react:${DATAHUB_VERSION:-latest}
    ports:
    - 9002:9002
  datahub-gms:
    container_name: datahub-gms
    depends_on:
    - mysql
    environment:
    - DATASET_ENABLE_SCSI=false
    - EBEAN_DATASOURCE_USERNAME=datahub
    - EBEAN_DATASOURCE_PASSWORD=datahub
    - EBEAN_DATASOURCE_HOST=mysql:3306
    - EBEAN_DATASOURCE_URL=jdbc:mysql://mysql:3306/datahub?verifyServerCertificate=false&useSSL=true&useUnicode=yes&characterEncoding=UTF-8
    - EBEAN_DATASOURCE_DRIVER=com.mysql.jdbc.Driver
    - KAFKA_BOOTSTRAP_SERVER=broker:29092
    - KAFKA_SCHEMAREGISTRY_URL=http://schema-registry:8081
    - ELASTICSEARCH_HOST=elasticsearch
    - ELASTICSEARCH_PORT=9200
    - NEO4J_HOST=http://neo4j:7474
    - NEO4J_URI=bolt://neo4j
    - NEO4J_USERNAME=neo4j
    - NEO4J_PASSWORD=datahub
    hostname: datahub-gms
    image: linkedin/datahub-gms:${DATAHUB_VERSION:-latest}
    mem_limit: 850m
    ports:
    - 8080:8080
  datahub-mae-consumer:
    container_name: datahub-mae-consumer
    depends_on:
    - kafka-setup
    - elasticsearch-setup
    - neo4j
    environment:
    - KAFKA_BOOTSTRAP_SERVER=broker:29092
    - KAFKA_SCHEMAREGISTRY_URL=http://schema-registry:8081
    - ELASTICSEARCH_HOST=elasticsearch
    - ELASTICSEARCH_PORT=9200
    - NEO4J_HOST=http://neo4j:7474
    - NEO4J_URI=bolt://neo4j
    - NEO4J_USERNAME=neo4j
    - NEO4J_PASSWORD=datahub
    - GMS_HOST=datahub-gms
    - GMS_PORT=8080
    hostname: datahub-mae-consumer
    image: linkedin/datahub-mae-consumer:${DATAHUB_VERSION:-latest}
    mem_limit: 256m
    ports:
    - 9091:9091
  datahub-mce-consumer:
    container_name: datahub-mce-consumer
    depends_on:
    - kafka-setup
    - datahub-gms
    environment:
    - KAFKA_BOOTSTRAP_SERVER=broker:29092
    - KAFKA_SCHEMAREGISTRY_URL=http://schema-registry:8081
    - GMS_HOST=datahub-gms
    - GMS_PORT=8080
    hostname: datahub-mce-consumer
    image: linkedin/datahub-mce-consumer:${DATAHUB_VERSION:-latest}
    mem_limit: 384m
    ports:
    - 9090:9090
  elasticsearch:
    container_name: elasticsearch
    environment:
    - discovery.type=single-node
    - xpack.security.enabled=false
    - ES_JAVA_OPTS=-Xms512m -Xmx512m
    healthcheck:
      retries: 4
      start_period: 2m
      test:
      - CMD-SHELL
      - curl -sS --fail 'http://localhost:9200/_cluster/health?wait_for_status=yellow&timeout=0s'
        || exit 1
    hostname: elasticsearch
    image: elasticsearch:7.9.3
    mem_limit: 1g
    ports:
    - 9200:9200
    volumes:
    - esdata:/usr/share/elasticsearch/data
  elasticsearch-setup:
    container_name: elasticsearch-setup
    depends_on:
    - elasticsearch
    environment:
    - ELASTICSEARCH_HOST=elasticsearch
    - ELASTICSEARCH_PORT=9200
    - ELASTICSEARCH_PROTOCOL=http
    hostname: elasticsearch-setup
    image: linkedin/datahub-elasticsearch-setup:${DATAHUB_VERSION:-latest}
  kafka-setup:
    container_name: kafka-setup
    depends_on:
    - broker
    - schema-registry
    environment:
    - KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181
    - KAFKA_BOOTSTRAP_SERVER=broker:29092
    hostname: kafka-setup
    image: linkedin/datahub-kafka-setup:${DATAHUB_VERSION:-latest}
  mysql:
    command: --character-set-server=utf8mb4 --collation-server=utf8mb4_unicode_ci
    container_name: mysql
    environment:
    - MYSQL_DATABASE=datahub
    - MYSQL_USER=datahub
    - MYSQL_PASSWORD=datahub
    - MYSQL_ROOT_PASSWORD=datahub
    hostname: mysql
    image: mysql:5.7
    ports:
    - 3306:3306
    volumes:
    - ./mysql/init.sql:/docker-entrypoint-initdb.d/init.sql
    - mysqldata:/var/lib/mysql
  mysql-setup:
    container_name: mysql-setup
    depends_on:
    - mysql
    environment:
    - MYSQL_HOST=mysql
    - MYSQL_PORT=3306
    - MYSQL_USERNAME=datahub
    - MYSQL_PASSWORD=datahub
    - DATAHUB_DB_NAME=datahub
    hostname: mysql-setup
    image: acryldata/datahub-mysql-setup:head
  neo4j:
    container_name: neo4j
    environment:
    - NEO4J_AUTH=neo4j/datahub
    - NEO4J_dbms_default__database=graph.db
    - NEO4J_dbms_allow__upgrade=true
    hostname: neo4j
    image: neo4j:4.0.6
    ports:
    - 7474:7474
    - 7687:7687
    volumes:
    - neo4jdata:/data
  schema-registry:
    container_name: schema-registry
    depends_on:
    - zookeeper
    - broker
    environment:
    - SCHEMA_REGISTRY_HOST_NAME=schemaregistry
    - SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL=zookeeper:2181
    hostname: schema-registry
    image: confluentinc/cp-schema-registry:5.4.0
    ports:
    - 8081:8081
  zookeeper:
    container_name: zookeeper
    environment:
    - ZOOKEEPER_CLIENT_PORT=2181
    - ZOOKEEPER_TICK_TIME=2000
    hostname: zookeeper
    image: confluentinc/cp-zookeeper:5.4.0
    ports:
    - 2181:2181
    volumes:
    - zkdata:/var/opt/zookeeper
version: '2'
volumes:
  esdata: null
  mysqldata: null
  neo4jdata: null
  zkdata: null
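
Several image tags in this file use `${DATAHUB_VERSION:-latest}`. In the shell-style form that Compose documents for variable interpolation, this resolves to the value of `DATAHUB_VERSION` when it is set and non-empty, and to `latest` otherwise. The sketch below mimics only that one `${NAME:-default}` form in Python to make the behavior concrete. `resolve_default` and the sample version string are assumptions for illustration, and Compose's own interpolation supports more forms than this.

```python
import re
from typing import Mapping

# Matches only the ${NAME:-default} form; other interpolation forms are ignored.
_DEFAULT_FORM = re.compile(
    r"\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*):-(?P<default>[^}]*)\}"
)


def resolve_default(value: str, env: Mapping[str, str]) -> str:
    """Expand every ${NAME:-default} in value; unset or empty falls back to default."""

    def substitute(match):
        current = env.get(match.group("name"), "")
        return current if current else match.group("default")

    return _DEFAULT_FORM.sub(substitute, value)


image = "linkedin/datahub-gms:${DATAHUB_VERSION:-latest}"

# No variable set: the default tag is used.
print(resolve_default(image, {}))

# "v0.8.1" is a made-up tag used only to show the override path.
print(resolve_default(image, {"DATAHUB_VERSION": "v0.8.1"}))
```

Note that `mysql-setup` is pinned to `acryldata/datahub-mysql-setup:head` in this file, so setting `DATAHUB_VERSION` would not change that one image.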
20 changes: 20 additions & 0 deletions docker/quickstart/generate_and_compare.sh
@@ -0,0 +1,20 @@
#!/bin/bash

DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
cd "$DIR"

set -euxo pipefail

python3 -m venv venv
source venv/bin/activate

pip install -r requirements.txt
python generate_docker_quickstart.py ../docker-compose.yml ../docker-compose.override.yml temp.quickstart.yml

if cmp docker-compose.quickstart.yml temp.quickstart.yml; then
    printf 'docker-compose.quickstart.yml is up to date.'
    exit 0
else
    printf 'docker-compose.quickstart.yml is out of date.'
    exit 1
fi
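
The script regenerates the quickstart file from `../docker-compose.yml` and `../docker-compose.override.yml`, then fails when the committed `docker-compose.quickstart.yml` differs from the fresh output. The source of `generate_docker_quickstart.py` is not shown on this page, so the following is only a sketch of one plausible merge strategy (override values win, nested mappings merge recursively) using plain dictionaries. `deep_merge` and the sample data are assumptions for illustration, not the PR's implementation.

```python
import copy
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict where override wins and nested dicts merge recursively."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars, lists, and new keys: the override replaces the base.
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical, heavily trimmed stand-ins for the two compose files.
base = {"services": {"mysql": {"image": "mysql:5.7", "hostname": "mysql"}}}
override = {
    "services": {
        "mysql": {"container_name": "mysql"},
        "mysql-setup": {"depends_on": ["mysql"]},
    }
}

merged = deep_merge(base, override)
print(sorted(merged["services"]))
print(merged["services"]["mysql"])
```

Because the CI check compares files byte for byte with `cmp`, any generator of this kind needs deterministic output (for example, a stable key order) for the comparison to be meaningful.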