fix: ingestion docker image (#2027)
The environment was not set correctly in the ingestion image, so mce-cli could not fire Kafka events. mce-cli always worked when run outside of Docker.

I also added a dev ingestion Docker image and script, which may be much faster if you've already built locally.

Tested:
1. Cleaned docker volumes and started datahub. Verified it is empty.
2. Built with gradle.
3. Ran ./docker/ingestion/ingestion-dev.sh. Verified data shows in DataHub.
4. Ran step 1 again.
5. Ran ./docker/ingestion/ingestion.sh. Verified data shows in DataHub.
John Plaisted authored Dec 3, 2020
1 parent a1e7e26 commit 5f9d967
Showing 6 changed files with 51 additions and 6 deletions.
21 changes: 16 additions & 5 deletions docker/ingestion/Dockerfile
@@ -1,8 +1,19 @@
-FROM openjdk:8 as builder
+# Defining environment
+ARG APP_ENV=prod
+
+FROM openjdk:8-jre-alpine as base
+
+FROM openjdk:8 as prod-build
 COPY . datahub-src
 RUN cd datahub-src && ./gradlew :metadata-ingestion-examples:mce-cli:build

-FROM openjdk:8-jre-alpine
-COPY --from=builder datahub-src/metadata-ingestion-examples/mce-cli/build/libs/mce-cli.jar ./
-COPY --from=builder datahub-src/metadata-ingestion-examples/mce-cli/example-bootstrap.json ./
-CMD java -jar mce-cli.jar -m produce example-bootstrap.json
+FROM base as prod-install
+COPY --from=prod-build datahub-src/metadata-ingestion-examples/mce-cli/build/libs/mce-cli.jar /datahub/ingestion/bin/mce-cli.jar
+COPY --from=prod-build datahub-src/metadata-ingestion-examples/mce-cli/example-bootstrap.json /datahub/ingestion/example-bootstrap.json
+
+FROM base as dev-install
+# Dummy stage for development. Assumes code is built on your machine and mounted to this image.
+# See this excellent thread https://github.com/docker/cli/issues/1134
+
+FROM ${APP_ENV}-install as final
+CMD java -jar /datahub/ingestion/bin/mce-cli.jar -m produce /datahub/ingestion/example-bootstrap.json
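
The `final` stage is selected by substituting the `APP_ENV` build argument into the stage name, so the default `prod` resolves to `prod-install` and `dev` resolves to `dev-install`. A minimal shell sketch of that resolution follows. The helper `final_stage` is hypothetical and not part of this commit; it only mirrors the naming rule.

```shell
#!/bin/bash
# Hypothetical helper (not in the commit): mirrors how
# `FROM ${APP_ENV}-install as final` picks a stage.
# Falls back to "prod", like `ARG APP_ENV=prod` in the Dockerfile.
final_stage() {
  local app_env="${1:-prod}"
  case "$app_env" in
    prod|dev)
      echo "${app_env}-install"
      ;;
    *)
      # The Dockerfile only defines prod-install and dev-install.
      echo "unknown APP_ENV: $app_env" >&2
      return 1
      ;;
  esac
}

final_stage        # prints prod-install
final_stage dev    # prints dev-install
```

Any other value has no matching `-install` stage in this Dockerfile, which is why the sketch rejects it.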
18 changes: 18 additions & 0 deletions docker/ingestion/docker-compose.dev.yml
@@ -0,0 +1,18 @@
---
version: '3.5'
services:
  ingestion:
    image: datahub-ingestion:debug
    env_file: env/docker.env
    build:
      context: .
      dockerfile: Dockerfile
      args:
        APP_ENV: dev
    volumes:
      - ../../metadata-ingestion-examples/mce-cli/build/libs/:/datahub/ingestion/bin
      - ../../metadata-ingestion-examples/mce-cli/example-bootstrap.json:/datahub/ingestion/example-bootstrap.json

networks:
  default:
    name: datahub_network
1 change: 1 addition & 0 deletions docker/ingestion/docker-compose.yml
@@ -3,6 +3,7 @@ version: '3.5'
services:
  ingestion:
    image: datahub-ingestion
    env_file: env/docker.env
    build:
      context: ../../
      dockerfile: docker/ingestion/Dockerfile
2 changes: 2 additions & 0 deletions docker/ingestion/env/docker.env
@@ -0,0 +1,2 @@
KAFKA_BOOTSTRAP_SERVER=broker:29092
KAFKA_SCHEMAREGISTRY_URL=http://schema-registry:8081
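
These two variables are what the commit message says were missing inside the container. A pre-flight check along the following lines could surface that kind of problem early. This is only a sketch: the function `check_kafka_env` is hypothetical and not part of the commit, and it assumes the variables reach the process through `env_file`.

```shell
#!/bin/bash
# Hypothetical pre-flight check (not in the commit): report any Kafka
# setting from env/docker.env that is absent from the environment.
check_kafka_env() {
  local missing=0
  if [ -z "${KAFKA_BOOTSTRAP_SERVER:-}" ]; then
    echo "KAFKA_BOOTSTRAP_SERVER is not set" >&2
    missing=1
  fi
  if [ -z "${KAFKA_SCHEMAREGISTRY_URL:-}" ]; then
    echo "KAFKA_SCHEMAREGISTRY_URL is not set" >&2
    missing=1
  fi
  # Returns 0 only when both variables are present and non-empty.
  return "$missing"
}
```

Calling `check_kafka_env` before starting `mce-cli` would fail with a message naming the missing variable.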
6 changes: 6 additions & 0 deletions docker/ingestion/ingestion-dev.sh
@@ -0,0 +1,6 @@
#!/bin/bash

# Runs the ingestion image using your locally built mce-cli. Gradle build must have been run before this script.

DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
cd "$DIR" && COMPOSE_DOCKER_CLI_BUILD=1 DOCKER_BUILDKIT=1 docker-compose -f docker-compose.dev.yml -p datahub up
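
Because the dev image mounts the locally built jar instead of building it, the script depends on the gradle build having been run first. A guard like the one below could make that assumption explicit. It is a sketch only: `require_local_jar` is a hypothetical helper not present in the commit, and the jar path is taken from the volume mount in `docker-compose.dev.yml`.

```shell
#!/bin/bash
# Hypothetical guard (not in the commit): check that the locally built
# mce-cli.jar exists under the given repository root before starting compose.
require_local_jar() {
  local repo_root="$1"
  local jar_path="$repo_root/metadata-ingestion-examples/mce-cli/build/libs/mce-cli.jar"
  if [ ! -f "$jar_path" ]; then
    echo "mce-cli.jar not found at $jar_path; run the gradle build first" >&2
    return 1
  fi
}

# Possible use inside ingestion-dev.sh, where DIR is docker/ingestion:
#   require_local_jar "$DIR/../.." || exit 1
```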
9 changes: 8 additions & 1 deletion metadata-ingestion-examples/mce-cli/README.md
@@ -48,4 +48,11 @@ java -jar metadata-ingestion-examples/mce-cli/build/libs/mce-cli.jar -m produce
```

Where `my-file.json` is some file that contains a
[MetadataChangeEvents](./src/main/pegasus/com/linkedin/metadata/examples/cli/MetadataChangeEvents.pdl) JSON object.

### Producing the Example Events with Docker

The `example-bootstrap.json` file contains some example events. You can produce them with the command above, or in a
Docker environment using `docker/ingestion/ingestion.sh`. There is also a developer image
(`docker/ingestion/ingestion-dev.sh`) that uses your locally built jar instead of building it inside the Docker image,
which may be faster if you have already built the code locally.
