feat: add support for kubernetes helm packaging #1609

Merged
merged 1 commit into from
Mar 27, 2020
108 changes: 85 additions & 23 deletions contrib/kubernetes/README.md
@@ -1,40 +1,102 @@
# Kubernetes Setup for DataHub

## Introduction
This directory provides the Kubernetes setup for DataHub. This is the first version with simple YAML files.
The next version will contain DataHub [Helm](https://helm.sh/) chart that can be published to [Helm Hub](https://hub.helm.sh/)
This directory provides the Kubernetes [Helm](https://helm.sh/) charts for DataHub.

## Setup
This Kubernetes deployment does not include the artifacts below. Deploy each of them separately using its original Helm chart.

* Kafka and Schema Registry [Chart Link](https://github.com/confluentinc/cp-helm-charts/tree/master/charts/cp-kafka)
* Kafka and Schema Registry [Chart Link](https://hub.helm.sh/charts/incubator/kafka)
* Elasticsearch [Chart Link](https://hub.helm.sh/charts/elastic/elasticsearch)
* MySQL [Chart Link](https://hub.helm.sh/charts/stable/mysql)
* Neo4j [Chart Link](https://hub.helm.sh/charts/stable/neo4j)

These can also be installed on-prem or consumed as managed services on any cloud platform.

## Quickstart
1. Install Docker and Kubernetes
2. Update the values in the configmap (datahub-configmap.yaml) with Docker hostname. For example
```
ebean.datasource.host: "192.168.0.104:3306"
ebean.datasource.url: "jdbc:mysql://192.168.0.104:3306/datahub?verifyServerCertificate=false&useSSL=true"
kafka.bootstrap.server: "192.168.0.104:29092"
kafka.schemaregistry.url: "http://192.168.0.104:8081"
elasticsearch.host: "192.168.0.104"
neo4j.uri: "bolt://192.168.0.104"
```
3. Create the configmap by running the following
```
kubectl apply -f datahub-configmap.yaml
```
4. Run the below kubectl command
```
cd .. && kubectl apply -f kubernetes/
```
Please note that these steps will be updated once it is made into a Helm chart.

### Docker & Kubernetes
Install Docker & Kubernetes by following the instructions [here](https://kubernetes.io/docs/setup/). The easiest option is Docker Desktop for your platform: [Mac](https://docs.docker.com/docker-for-mac/) or [Windows](https://docs.docker.com/docker-for-windows/).

### Helm
Helm is an open-source packaging tool that helps you install applications and services on Kubernetes. Helm uses a packaging format called charts. A chart is a collection of YAML templates that describes a related set of Kubernetes resources.

Install Helm by following the instructions [here](https://helm.sh/docs/intro/install/). Helm 3 is supported.

### DataHub Helm Chart Configurations

The following tables list the chart's requirements and its configuration parameters with their default values.

#### Chart Requirements

| Repository | Name | Version |
|------------|------|---------|
| file://./charts/datahub-frontend | datahub-frontend | 0.1.0 |
| file://./charts/datahub-gms | datahub-gms | 0.1.0 |
| file://./charts/datahub-mae-consumer | datahub-mae-consumer | 0.1.0 |
| file://./charts/datahub-mce-consumer | datahub-mce-consumer | 0.1.0 |

#### Chart Values

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| datahub-frontend.enabled | bool | `true` | |
| datahub-frontend.image.repository | string | `"keremsahin/datahub-frontend"` | |
| datahub-frontend.image.tag | string | `"latest"` | |
| datahub-gms.enabled | bool | `true` | |
| datahub-gms.image.repository | string | `"keremsahin/datahub-gms"` | |
| datahub-gms.image.tag | string | `"latest"` | |
| datahub-mae-consumer.enabled | bool | `true` | |
| datahub-mae-consumer.image.repository | string | `"keremsahin/datahub-mae-consumer"` | |
| datahub-mae-consumer.image.tag | string | `"latest"` | |
| datahub-mce-consumer.enabled | bool | `true` | |
| datahub-mce-consumer.image.repository | string | `"keremsahin/datahub-mce-consumer"` | |
| datahub-mce-consumer.image.tag | string | `"latest"` | |
| global.datahub.appVersion | string | `"1.0"` | |
| global.datahub.gms.host | string | `"datahub-gms-deployment"` | |
| global.datahub.gms.port | string | `"8080"` | |
| global.datahub.gms.secret | string | `"YouKnowNothing"` | |
| global.elasticsearch.host | string | `"elasticsearch"` | |
| global.elasticsearch.port | string | `"9200"` | |
| global.hostAliases[0].hostnames[0] | string | `"broker"` | |
| global.hostAliases[0].hostnames[1] | string | `"mysql"` | |
| global.hostAliases[0].hostnames[2] | string | `"elasticsearch"` | |
| global.hostAliases[0].hostnames[3] | string | `"neo4j"` | |
| global.hostAliases[0].ip | string | `"192.168.0.104"` | |
| global.kafka.bootstrap.server | string | `"broker:29092"` | |
| global.kafka.schemaregistry.url | string | `"http://schema-registry:8081"` | |
| global.neo4j.password | string | `"datahub"` | |
| global.neo4j.uri | string | `"bolt://neo4j"` | |
| global.neo4j.username | string | `"neo4j"` | |
| global.sql.datasource.driver | string | `"com.mysql.jdbc.Driver"` | |
| global.sql.datasource.host | string | `"mysql"` | |
| global.sql.datasource.password | string | `"datahub"` | |
| global.sql.datasource.url | string | `"jdbc:mysql://mysql:3306/datahub?verifyServerCertificate=false&useSSL=true"` | |
| global.sql.datasource.username | string | `"datahub"` | |
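
The dotted keys in the table map onto nested YAML, and values for a subchart sit under a top-level key named after that subchart. As a rough sketch, an override file that pins image tags instead of tracking `latest` might look like the following. The file name `my-values.yaml` and the tag value are placeholders, not something this chart ships.

```yaml
# my-values.yaml (hypothetical file name)
# Subchart values live under a top-level key named after the subchart.
datahub-gms:
  image:
    repository: keremsahin/datahub-gms
    tag: "<image-tag>"   # placeholder: replace with a tag that exists in the registry
datahub-frontend:
  image:
    repository: keremsahin/datahub-frontend
    tag: "<image-tag>"   # placeholder: replace with a tag that exists in the registry
```

Such a file can be passed at install time with `helm install -f my-values.yaml datahub datahub/`. Keys that are not overridden keep the defaults listed above.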

## Install DataHub
Update the `datahub/values.yaml` file with valid hostname/IP address configuration for elasticsearch, neo4j, schema-registry, broker & mysql. Then run the command below from this directory.

```
helm install datahub datahub/
```
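
Instead of editing `datahub/values.yaml` in place, the same settings can be kept in a separate override file. The sketch below assumes the prerequisite services run somewhere reachable by hostname. Every hostname and the file name `external-services.yaml` are made-up placeholders, and the ports are only illustrative.

```yaml
# external-services.yaml (hypothetical file name; all hostnames are placeholders)
global:
  elasticsearch:
    host: "es.example.internal"
    port: "9200"
  kafka:
    bootstrap:
      server: "kafka.example.internal:9092"
    schemaregistry:
      url: "http://schema-registry.example.internal:8081"
  neo4j:
    uri: "bolt://neo4j.example.internal"
  sql:
    datasource:
      host: "mysql.example.internal"
      url: "jdbc:mysql://mysql.example.internal:3306/datahub?verifyServerCertificate=false&useSSL=true"
```

It would be applied with `helm install -f external-services.yaml datahub datahub/`. Helm merges the maps in this file over the chart defaults, so credentials and other keys left out here keep their default values.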

## Testing
To test this setup, bring up the prerequisite software with the existing quickstart's [docker-compose](https://github.com/linkedin/datahub/blob/master/docker/quickstart/docker-compose.yml) file, with the `data-hub-gms`, `datahub-frontend`, `datahub-mce-consumer` & `datahub-mae-consumer` sections commented out.
Then set `global.hostAliases[0].ip` in `datahub/values.yaml` to the IP address of the host machine running elasticsearch, neo4j, schema-registry, broker & mysql, and perform the helm install.


Alternatively, you can run this command directly without making any changes to the `datahub/values.yaml` file:
```
helm install --set "global.hostAliases[0].ip"="<<docker_host_ip>>","global.hostAliases[0].hostnames"="{broker,mysql,elasticsearch,neo4j}" datahub datahub/
```
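
For reference, the `--set` flags above correspond roughly to the override file sketched below. The file name `host-aliases.yaml` is hypothetical, and the IP shown is the chart's default, standing in for your Docker host IP.

```yaml
# host-aliases.yaml (hypothetical file name)
global:
  hostAliases:
    - ip: "192.168.0.104"   # placeholder: replace with the Docker host IP
      hostnames:
        - broker
        - mysql
        - elasticsearch
        - neo4j
```

Passing it with `helm install -f host-aliases.yaml datahub datahub/` should have the same effect. Helm replaces lists from a values file wholesale rather than merging them element by element, so the entry needs to list every hostname, not only the ones that changed.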

## Other useful commands

| Command | Description |
|-----|------|
| helm uninstall datahub | Remove DataHub |
| helm ls | List Helm releases |
| helm history datahub | Fetch the history of the `datahub` release |


24 changes: 24 additions & 0 deletions contrib/kubernetes/datahub/.helmignore
@@ -0,0 +1,24 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/
charts/*.tgz
27 changes: 27 additions & 0 deletions contrib/kubernetes/datahub/Chart.yaml
@@ -0,0 +1,27 @@
apiVersion: v2
name: datahub
description: A Helm chart for LinkedIn DataHub
type: application
# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
version: 0.0.1
# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application.
appVersion: latest #0.3.1
dependencies:
  - name: datahub-gms
    version: 0.1.0
    repository: file://./charts/datahub-gms
    condition: datahub-gms.enabled
  - name: datahub-frontend
    version: 0.1.0
    repository: file://./charts/datahub-frontend
    condition: datahub-frontend.enabled
  - name: datahub-mae-consumer
    version: 0.1.0
    repository: file://./charts/datahub-mae-consumer
    condition: datahub-mae-consumer.enabled
  - name: datahub-mce-consumer
    version: 0.1.0
    repository: file://./charts/datahub-mce-consumer
    condition: datahub-mce-consumer.enabled
52 changes: 52 additions & 0 deletions contrib/kubernetes/datahub/README.md
@@ -0,0 +1,52 @@
datahub
=======
A Helm chart for LinkedIn DataHub

Current chart version is `0.0.1`

## Chart Requirements

| Repository | Name | Version |
|------------|------|---------|
| file://./charts/datahub-frontend | datahub-frontend | 0.1.0 |
| file://./charts/datahub-gms | datahub-gms | 0.1.0 |
| file://./charts/datahub-mae-consumer | datahub-mae-consumer | 0.1.0 |
| file://./charts/datahub-mce-consumer | datahub-mce-consumer | 0.1.0 |

## Chart Values

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| datahub-frontend.enabled | bool | `true` | |
| datahub-frontend.image.repository | string | `"keremsahin/datahub-frontend"` | |
| datahub-frontend.image.tag | string | `"latest"` | |
| datahub-gms.enabled | bool | `true` | |
| datahub-gms.image.repository | string | `"keremsahin/datahub-gms"` | |
| datahub-gms.image.tag | string | `"latest"` | |
| datahub-mae-consumer.enabled | bool | `true` | |
| datahub-mae-consumer.image.repository | string | `"keremsahin/datahub-mae-consumer"` | |
| datahub-mae-consumer.image.tag | string | `"latest"` | |
| datahub-mce-consumer.enabled | bool | `true` | |
| datahub-mce-consumer.image.repository | string | `"keremsahin/datahub-mce-consumer"` | |
| datahub-mce-consumer.image.tag | string | `"latest"` | |
| global.datahub.appVersion | string | `"1.0"` | |
| global.datahub.gms.host | string | `"datahub-gms-deployment"` | |
| global.datahub.gms.port | string | `"8080"` | |
| global.datahub.gms.secret | string | `"YouKnowNothing"` | |
| global.elasticsearch.host | string | `"elasticsearch"` | |
| global.elasticsearch.port | string | `"9200"` | |
| global.hostAliases[0].hostnames[0] | string | `"broker"` | |
| global.hostAliases[0].hostnames[1] | string | `"mysql"` | |
| global.hostAliases[0].hostnames[2] | string | `"elasticsearch"` | |
| global.hostAliases[0].hostnames[3] | string | `"neo4j"` | |
| global.hostAliases[0].ip | string | `"192.168.0.104"` | |
| global.kafka.bootstrap.server | string | `"broker:29092"` | |
| global.kafka.schemaregistry.url | string | `"http://schema-registry:8081"` | |
| global.neo4j.password | string | `"datahub"` | |
| global.neo4j.uri | string | `"bolt://neo4j"` | |
| global.neo4j.username | string | `"neo4j"` | |
| global.sql.datasource.driver | string | `"com.mysql.jdbc.Driver"` | |
| global.sql.datasource.host | string | `"mysql"` | |
| global.sql.datasource.password | string | `"datahub"` | |
| global.sql.datasource.url | string | `"jdbc:mysql://mysql:3306/datahub?verifyServerCertificate=false&useSSL=true"` | |
| global.sql.datasource.username | string | `"datahub"` | |
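
The `*.enabled` keys above line up with the `condition` fields of the chart's dependencies, so setting one to `false` should skip that subchart at install time. A minimal sketch follows, with a hypothetical file name. Whether DataHub behaves as expected without a given component is not covered here.

```yaml
# consumers-off.yaml (hypothetical file name)
# Skip both consumer subcharts; GMS and the frontend stay enabled by default.
datahub-mae-consumer:
  enabled: false
datahub-mce-consumer:
  enabled: false
```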
23 changes: 23 additions & 0 deletions contrib/kubernetes/datahub/charts/datahub-frontend/.helmignore
@@ -0,0 +1,23 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/
21 changes: 21 additions & 0 deletions contrib/kubernetes/datahub/charts/datahub-frontend/Chart.yaml
@@ -0,0 +1,21 @@
apiVersion: v2
name: datahub-frontend
description: A Helm chart for Kubernetes

# A chart can be either an 'application' or a 'library' chart.
#
# Application charts are a collection of templates that can be packaged into versioned archives
# to be deployed.
#
# Library charts provide useful utilities or functions for the chart developer. They're included as
# a dependency of application charts to inject those utilities and functions into the rendering
# pipeline. Library charts do not define any templates and therefore cannot be deployed.
type: application

# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
version: 0.1.0

# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application.
appVersion: 0.3.1
37 changes: 37 additions & 0 deletions contrib/kubernetes/datahub/charts/datahub-frontend/README.md
@@ -0,0 +1,37 @@
datahub-frontend
================
A Helm chart for datahub-frontend

Current chart version is `0.1.0`

## Chart Values

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| affinity | object | `{}` | |
| datahub.play.mem.buffer.size | string | `"10MB"` | |
| fullnameOverride | string | `"datahub-frontend"` | |
| global.datahub.gms.host | string | `"datahub-gms-deployment"` | |
| global.datahub.gms.port | string | `"8080"` | |
| global.datahub.gms.secret | string | `"YouKnowNothing"` | |
| image.pullPolicy | string | `"IfNotPresent"` | |
| image.repository | string | `"keremsahin/datahub-frontend"` | |
| image.tag | string | `"latest"` | |
| imagePullSecrets | list | `[]` | |
| ingress.annotations | object | `{}` | |
| ingress.enabled | bool | `false` | |
| ingress.hosts[0].host | string | `"chart-example.local"` | |
| ingress.hosts[0].paths | list | `[]` | |
| ingress.tls | list | `[]` | |
| nameOverride | string | `""` | |
| nodeSelector | object | `{}` | |
| podSecurityContext | object | `{}` | |
| replicaCount | int | `1` | |
| resources | object | `{}` | |
| securityContext | object | `{}` | |
| service.port | int | `9001` | |
| service.type | string | `"LoadBalancer"` | |
| serviceAccount.annotations | object | `{}` | |
| serviceAccount.create | bool | `true` | |
| serviceAccount.name | string | `nil` | |
| tolerations | list | `[]` | |
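
As an illustration of the ingress and service keys, the fragment below sketches how the frontend might be exposed through an ingress instead of a `LoadBalancer` service when installed via the parent `datahub` chart. It assumes the subchart's ingress template consumes `hosts[].host` and `hosts[].paths` the same way the chart's NOTES.txt reads them, and that the cluster already runs an ingress controller. The hostname and file name are placeholders.

```yaml
# frontend-ingress.yaml (hypothetical file name)
datahub-frontend:
  service:
    type: ClusterIP               # the table's default is LoadBalancer
    port: 9001
  ingress:
    enabled: true
    hosts:
      - host: datahub.example.com # placeholder hostname
        paths:
          - /
```

When the subchart is installed on its own rather than through the parent chart, the same keys would sit at the top level, without the `datahub-frontend:` wrapper.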
@@ -0,0 +1,21 @@
1. Get the application URL by running these commands:
{{- if .Values.ingress.enabled }}
{{- range $host := .Values.ingress.hosts }}
{{- range .paths }}
http{{ if $.Values.ingress.tls }}s{{ end }}://{{ $host.host }}{{ . }}
{{- end }}
{{- end }}
{{- else if contains "NodePort" .Values.service.type }}
export NODE_PORT=$(kubectl get --namespace {{ .Release.Namespace }} -o jsonpath="{.spec.ports[0].nodePort}" services {{ include "datahub-frontend.fullname" . }})
export NODE_IP=$(kubectl get nodes --namespace {{ .Release.Namespace }} -o jsonpath="{.items[0].status.addresses[0].address}")
echo http://$NODE_IP:$NODE_PORT
{{- else if contains "LoadBalancer" .Values.service.type }}
NOTE: It may take a few minutes for the LoadBalancer IP to be available.
You can watch its status by running 'kubectl get --namespace {{ .Release.Namespace }} svc -w {{ include "datahub-frontend.fullname" . }}'
export SERVICE_IP=$(kubectl get svc --namespace {{ .Release.Namespace }} {{ include "datahub-frontend.fullname" . }} --template "{{"{{ range (index .status.loadBalancer.ingress 0) }}{{.}}{{ end }}"}}")
echo http://$SERVICE_IP:{{ .Values.service.port }}
{{- else if contains "ClusterIP" .Values.service.type }}
export POD_NAME=$(kubectl get pods --namespace {{ .Release.Namespace }} -l "app.kubernetes.io/name={{ include "datahub-frontend.name" . }},app.kubernetes.io/instance={{ .Release.Name }}" -o jsonpath="{.items[0].metadata.name}")
echo "Visit http://127.0.0.1:8080 to use your application"
kubectl --namespace {{ .Release.Namespace }} port-forward $POD_NAME 8080:80
{{- end }}