Amundsen k8s Helm Charts

Source code can be found here

What is this?

These are setup templates for deploying Amundsen on Kubernetes (k8s) using Helm.

How do I get started?

  1. Make sure you have the following command line clients set up:
    • k8s (kubectl)
    • helm
  2. Build out a cloud-based k8s cluster, such as Amazon EKS.
  3. Ensure you can connect to your cluster with the CLI tools from step 1.

Prerequisites

  1. Helm 2.14+
  2. Kubernetes 1.14+

Chart Requirements

| Repository | Name | Version |
|------------|------|---------|
| https://kubernetes-charts.storage.googleapis.com/ | elasticsearch | 1.24.0 |
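
For readers unfamiliar with how Helm 2 records such a requirement, a dependency like the one above is normally declared in a requirements.yaml file next to the chart. The fragment below is a sketch of that shape built from the values listed here, not a copy of the chart's actual file; in particular, the condition line is an assumption based on the elasticsearch.enabled key in the values table.

```yaml
# Sketch of a Helm 2 requirements.yaml entry for the dependency listed above.
dependencies:
  - name: elasticsearch
    version: 1.24.0
    repository: https://kubernetes-charts.storage.googleapis.com/
    # Assumption: the subchart is toggled by the elasticsearch.enabled value.
    condition: elasticsearch.enabled
```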

Chart Values

The following table lists the configurable parameters of the Amundsen charts and their default values.

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| LONG_RANDOM_STRING | int | `1234` | A long random string. You should probably provide your own. This is needed for OIDC. |
| affinity | object | `{}` | Amundsen application-wide configuration of affinity. This applies to search, metadata, frontend and neo4j. Elasticsearch has its own configuration properties for this. ref |
| dnsZone | string | `"teamname.company.com"` | DEPRECATED - it’s not standard to pre-construct URLs this way. The DNS zone (e.g. group-qa.myaccount.company.com) the app is running in. Used to construct DNS hostnames (on AWS only). |
| dockerhubImagePath | string | `"amundsendev"` | DEPRECATED - this is not useful; it would be better to allow the whole image to be swapped instead. The image path for Docker Hub. |
| elasticsearch.client.replicas | int | `1` | Only running Amundsen on 1 client replica. |
| elasticsearch.cluster.env.EXPECTED_MASTER_NODES | int | `1` | Required to match master.replicas. |
| elasticsearch.cluster.env.MINIMUM_MASTER_NODES | int | `1` | Required to match master.replicas. |
| elasticsearch.cluster.env.RECOVER_AFTER_MASTER_NODES | int | `1` | Required to match master.replicas. |
| elasticsearch.data.replicas | int | `1` | Only running Amundsen on 1 data replica. |
| elasticsearch.enabled | bool | `true` | Set this to false if you want to provide your own ES instance. |
| elasticsearch.master.replicas | int | `1` | Only running Amundsen on 1 master replica. |
| environment | string | `"dev"` | DEPRECATED - it’s not standard to pre-construct URLs this way. The environment the app is running in. Used to construct DNS hostnames (on AWS only) and ports. |
| frontEnd.OIDC_AUTH_SERVER_ID | string | `nil` | The authorization server ID for OIDC. |
| frontEnd.OIDC_CLIENT_ID | string | `nil` | The client ID for OIDC. |
| frontEnd.OIDC_CLIENT_SECRET | string | `""` | The client secret for OIDC. |
| frontEnd.OIDC_ORG_URL | string | `nil` | The organization URL for OIDC. |
| frontEnd.affinity | object | `{}` | Frontend pod specific affinity. |
| frontEnd.createOidcSecret | bool | `false` | OIDC needs some configuration. If you want the chart to create your secrets, set this to true and set the four frontEnd.OIDC_* values (OIDC_AUTH_SERVER_ID, OIDC_CLIENT_ID, OIDC_CLIENT_SECRET, OIDC_ORG_URL). If you don’t want to configure your secrets via Helm, you can still use amundsen-oidc-config.yaml as a template. |
| frontEnd.imageVersion | string | `"2.0.0"` | The image version of the frontend container. |
| frontEnd.nodeSelector | object | `{}` | Frontend pod specific nodeSelector. |
| frontEnd.oidcEnabled | bool | `false` | To enable auth via OIDC, set this to true. |
| frontEnd.replicas | int | `1` | How many replicas of the frontend service to run. |
| frontEnd.resources | object | `{}` | See pod resourcing ref |
| frontEnd.serviceName | string | `"frontend"` | The frontend service name. |
| frontEnd.servicePort | int | `80` | The port the frontend service will be exposed on via the load balancer. |
| frontEnd.tolerations | list | `[]` | Frontend pod specific tolerations. |
| metadata.affinity | object | `{}` | Metadata pod specific affinity. |
| metadata.imageVersion | string | `"2.0.0"` | The image version of the metadata container. |
| metadata.neo4jEndpoint | string | `nil` | The name of the service hosting neo4j on your cluster, if you bring your own. You should only need to change this if you don’t use the version in this chart. |
| metadata.nodeSelector | object | `{}` | Metadata pod specific nodeSelector. |
| metadata.replicas | int | `1` | How many replicas of the metadata service to run. |
| metadata.resources | object | `{}` | See pod resourcing ref |
| metadata.serviceName | string | `"metadata"` | The metadata service name. |
| metadata.tolerations | list | `[]` | Metadata pod specific tolerations. |
| neo4j.affinity | object | `{}` | neo4j specific affinity. |
| neo4j.backup | object | `{"enabled":false,"s3Path":"s3://dev/null","schedule":"0 * * * *"}` | If enabled is set to true, make sure to set the S3 path as well. |
| neo4j.backup.s3Path | string | `"s3://dev/null"` | The S3 path to write backups to. |
| neo4j.backup.schedule | string | `"0 * * * *"` | The schedule to run backups on. Defaults to hourly. |
| neo4j.config | object | `{"dbms":{"heap_initial_size":"23000m","heap_max_size":"23000m","pagecache_size":"26600m"}}` | Neo4j application specific configuration. This type of configuration is why the charts/stable version is not used. See ref |
| neo4j.config.dbms | object | `{"heap_initial_size":"23000m","heap_max_size":"23000m","pagecache_size":"26600m"}` | dbms config for neo4j. |
| neo4j.config.dbms.heap_initial_size | string | `"23000m"` | The initial Java heap for neo4j. |
| neo4j.config.dbms.heap_max_size | string | `"23000m"` | The max Java heap for neo4j. |
| neo4j.config.dbms.pagecache_size | string | `"26600m"` | The page cache size for neo4j. |
| neo4j.enabled | bool | `true` | Whether neo4j is enabled as part of this chart. Set this to false if you want to provide your own version. |
| neo4j.nodeSelector | object | `{}` | neo4j specific nodeSelector. |
| neo4j.persistence | object | `{}` | Neo4j persistence. Turn this on to keep your data between pod crashes, etc. This is also needed for backups. |
| neo4j.resources | object | `{}` | See pod resourcing ref |
| neo4j.tolerations | list | `[]` | neo4j specific tolerations. |
| neo4j.version | string | `"3.3.0"` | The neo4j application version used by Amundsen. |
| nodeSelector | object | `{}` | Amundsen application-wide configuration of nodeSelector. This applies to search, metadata, frontend and neo4j. Elasticsearch has its own configuration properties for this. ref |
| provider | string | `"aws"` | The cloud provider the app is running in. Used to construct DNS hostnames (on AWS only). |
| search.affinity | object | `{}` | Search pod specific affinity. |
| search.elasticsearchEndpoint | string | `nil` | The name of the service hosting Elasticsearch on your cluster, if you bring your own. You should only need to change this if you don’t use the version in this chart. |
| search.imageVersion | string | `"2.0.0"` | The image version of the search container. |
| search.nodeSelector | object | `{}` | Search pod specific nodeSelector. |
| search.replicas | int | `1` | How many replicas of the search service to run. |
| search.resources | object | `{}` | See pod resourcing ref |
| search.serviceName | string | `"search"` | The search service name. |
| search.tolerations | list | `[]` | Search pod specific tolerations. |
| tolerations | list | `[]` | Amundsen application-wide configuration of tolerations. This applies to search, metadata, frontend and neo4j. Elasticsearch has its own configuration properties for this. ref |
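
As an illustration of how the dotted keys above map onto a values file, the fragment below sketches two common overrides: bringing your own Elasticsearch and Neo4j, and letting the chart create the OIDC secret. The service names (my-elasticsearch, my-neo4j) and every angle-bracket value are hypothetical placeholders, and the exact format each endpoint expects should be checked against the chart templates.

```yaml
# Sketch of an override values file; all names and values are placeholders.

# Bring your own backing services instead of the bundled ones.
elasticsearch:
  enabled: false
search:
  elasticsearchEndpoint: my-elasticsearch   # hypothetical service name
neo4j:
  enabled: false
metadata:
  neo4jEndpoint: my-neo4j                   # hypothetical service name

# Enable OIDC and have the chart create the secret from these values.
LONG_RANDOM_STRING: "<your-own-long-random-string>"
frontEnd:
  oidcEnabled: true
  createOidcSecret: true
  OIDC_CLIENT_ID: "<client-id>"
  OIDC_CLIENT_SECRET: "<client-secret>"
  OIDC_ORG_URL: "<organization-url>"
  OIDC_AUTH_SERVER_ID: "<auth-server-id>"
```

Keeping real secrets out of a committed values file is usually preferable; the amundsen-oidc-config.yaml template mentioned in the table is the alternative when secrets are managed outside Helm.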

Neo4j DBMS Config?

You may want to override the default memory usage for Neo4j. In particular, if you’re just test-driving a deployment and your Neo4j container exits with status 137 (typically an out-of-memory kill), you should set the usage to smaller values:

# Nested under neo4j to match the neo4j.config.dbms.* keys in the table above.
neo4j:
  config:
    dbms:
      heap_initial_size: 2Gi
      heap_max_size: 2Gi
      pagecache_size: 2Gi

With this values file, you can then install Amundsen using Helm 2 with:

helm install ./templates/helm --values impl/helm/dev/values.yaml

Helm 3 requires a release name, e.g. my-amundsen:

helm install my-amundsen ./templates/helm --values impl/helm/dev/values.yaml
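
When shrinking the Neo4j memory settings as described above, it can also help to give the pod a matching resource request and limit through neo4j.resources, so the scheduler places it on a node with enough memory. The fragment below is only a sketch: the sizes are illustrative, written with the same m suffix as the chart defaults, and the limit is an assumed figure that leaves some headroom above heap plus page cache.

```yaml
# Sketch only: illustrative sizes for a small test deployment.
neo4j:
  config:
    dbms:
      heap_initial_size: 2000m
      heap_max_size: 2000m
      pagecache_size: 2000m
  # Standard Kubernetes resource block; heap + page cache is roughly 4 GB here,
  # so the limit is set somewhat higher (assumed headroom, tune as needed).
  resources:
    requests:
      memory: 4Gi
    limits:
      memory: 5Gi
```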

Other Notes