refactor: use named volume instead of bind mount in quickstart
Volumes are the preferred mechanism over bind mounts for persisting container data (https://docs.docker.com/storage/volumes/).
This also eliminates the need for the ugly `chmod` hack for Elasticsearch and hopefully fixes #1650.
mars-lan committed May 8, 2020
1 parent c43e119 commit c515cb7
Showing 5 changed files with 17 additions and 19 deletions.
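The pattern this commit applies to each stateful service can be sketched as a minimal before/after compose fragment (the `mysql` service and `mysqldata` volume are taken from the diff below; the commented-out line shows the old bind mount):

```yaml
services:
  mysql:
    volumes:
      # Before: a host bind mount rooted at DATA_STORAGE_FOLDER
      # - ${DATA_STORAGE_FOLDER}/mysql:/var/lib/mysql
      # After: a named volume managed entirely by Docker
      - mysqldata:/var/lib/mysql

# Named volumes must also be declared at the top level of the compose file.
volumes:
  mysqldata:
```

Because Docker creates and owns the volume's backing directory, no host-side `mkdir`/`chmod` preparation is needed.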
2 changes: 1 addition & 1 deletion README.md
@@ -39,7 +39,7 @@ This repository contains the complete source code for both DataHub's frontend &
3. Clone this repo and `cd` into the root directory of the cloned repository.
4. Run the following command to download and run all Docker containers locally:
```
cd docker/quickstart && source ./quickstart.sh
./docker/quickstart/quickstart.sh
```
This step takes a while to run the first time, and it may be difficult to tell if DataHub is fully up and running from the combined log. Please use [this guide](https://github.com/linkedin/datahub/blob/master/docs/debugging.md#how-can-i-confirm-if-all-docker-containers-are-running-as-expected-after-a-quickstart) to verify that each container is running correctly.
5. At this point, you should be able to start DataHub by opening [http://localhost:9001](http://localhost:9001) in your browser. You can sign in using `datahub` as both username and password. However, you'll notice that no data has been ingested yet.
6 changes: 2 additions & 4 deletions docker/quickstart/README.md
@@ -1,11 +1,9 @@
# DataHub Quickstart
To start all Docker containers at once, please run below command:
To start all Docker containers at once, please run the command below from the project root directory:
```bash
cd docker/quickstart && source ./quickstart.sh
./docker/quickstart/quickstart.sh
```

By default, data will be stored at `/tmp/datahub`, however it can be overwritten by specifying the DATA_STORAGE_FOLDER env var.

At this point, all containers are ready and DataHub can be considered up and running. Check specific containers guide
for details:
* [Elasticsearch & Kibana](../elasticsearch)
14 changes: 10 additions & 4 deletions docker/quickstart/docker-compose.yml
@@ -16,7 +16,7 @@ services:
- "3306:3306"
volumes:
- ../mysql/init.sql:/docker-entrypoint-initdb.d/init.sql
- ${DATA_STORAGE_FOLDER}/mysql:/var/lib/mysql
- mysqldata:/var/lib/mysql

zookeeper:
image: confluentinc/cp-zookeeper:5.4.0
@@ -28,7 +28,7 @@
ZOOKEEPER_CLIENT_PORT: 2181
ZOOKEEPER_TICK_TIME: 2000
volumes:
- ${DATA_STORAGE_FOLDER}/zookeeper:/var/opt/zookeeper
- zkdata:/var/opt/zookeeper

broker:
image: confluentinc/cp-kafka:5.4.0
@@ -137,7 +137,7 @@ services:
- xpack.security.enabled=false
- "ES_JAVA_OPTS=-Xms1g -Xmx1g"
volumes:
- ${DATA_STORAGE_FOLDER}/elasticsearch:/usr/share/elasticsearch/data
- esdata:/usr/share/elasticsearch/data

kibana:
image: docker.elastic.co/kibana/kibana:5.6.8
@@ -161,7 +161,7 @@
- "7474:7474"
- "7687:7687"
volumes:
- ${DATA_STORAGE_FOLDER}/neo4j:/data
- neo4jdata:/data

# This "container" is a workaround to pre-create search indices
elasticsearch-setup:
@@ -261,3 +261,9 @@ services:
networks:
default:
name: datahub_network

volumes:
mysqldata:
esdata:
neo4jdata:
zkdata:
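Since the script now starts the stack with `docker-compose -p datahub`, Compose prefixes each declared volume with the project name (`<project>_<volume>`). A quick Docker-free sketch of the resulting volume names for the four volumes declared above:

```shell
#!/bin/sh
# Compose names volumes "<project>_<volume>"; with -p datahub the four
# named volumes declared in docker-compose.yml come out as follows.
project=datahub
for v in mysqldata esdata neo4jdata zkdata; do
  echo "${project}_${v}"
done
```

This is why the debugging guide can later sweep them all up with a `name=datahub_` filter.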
10 changes: 2 additions & 8 deletions docker/quickstart/quickstart.sh
@@ -1,10 +1,4 @@
#!/bin/bash

export DATA_STORAGE_FOLDER=${DATA_STORAGE_FOLDER:=/tmp/datahub}
mkdir -p ${DATA_STORAGE_FOLDER}

# https://discuss.elastic.co/t/elastic-elasticsearch-docker-not-assigning-permissions-to-data-directory-on-run/65812/4
mkdir -p ${DATA_STORAGE_FOLDER}/elasticsearch
sudo chmod 777 ${DATA_STORAGE_FOLDER}/elasticsearch

docker-compose pull && docker-compose up --build
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
cd "$DIR" && docker-compose pull && docker-compose -p datahub up --build
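The new first line resolves the script's own directory so the compose file is found no matter where the script is invoked from, which is what makes `./docker/quickstart/quickstart.sh` work from the project root without `source`ing. A standalone sketch of the same idiom:

```shell
#!/bin/bash
# Resolve the absolute directory containing this script, even when it
# is invoked from another working directory or via a relative path.
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
echo "script lives in: $DIR"
```

The `cd`/`pwd` round trip normalizes any relative path in `${BASH_SOURCE[0]}` to an absolute one.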
4 changes: 2 additions & 2 deletions docs/debugging.md
@@ -179,9 +179,9 @@ More discussions on the same issue https://github.com/docker/hub-feedback/issues
```
docker rm -f $(docker ps -aq)
```
2. Clear persistent storage for DataHub containers, assuming you didn't set `DATA_STORAGE_FOLDER` environment variable.
2. Remove all of DataHub's Docker volumes.
```
rm -rf /tmp/datahub
docker volume rm -f $(docker volume ls -f name=datahub_* -q)
```
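Docker's `name` filter matches all or part of a volume name, so the command above selects every volume whose name contains `datahub_`. A Docker-free sketch of the same name filtering with `grep`, run over a hypothetical volume listing:

```shell
#!/bin/sh
# Hypothetical `docker volume ls` output; only the datahub_* entries
# would be selected (and then removed) by the cleanup step above.
volumes="datahub_mysqldata
datahub_esdata
datahub_neo4jdata
other_cache"
printf '%s\n' "$volumes" | grep 'datahub_'
```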

## Seeing `Table 'datahub.metadata_aspect' doesn't exist` error when logging in
