Galaxy Helm Chart (v4)

Galaxy is a data analysis platform focusing on accessibility, reproducibility, and transparency of primarily bioinformatics data. This repo contains Helm charts for easily deploying Galaxy on top of Kubernetes. The chart allows application configuration changes, updates, upgrades, and rollbacks.

You may follow this documentation on how to use this Helm chart to deploy Galaxy on various managed Kubernetes services (e.g., Amazon EKS and Google GKE).

Recommended versions

Kubernetes 1.16+
Helm 3.5+

Helm 2 note

Support for Helm 2 has been discontinued and users must upgrade to Helm 3 to use these charts.

Kubernetes cluster

You will need kubectl (instructions) and Helm (instructions) installed.

In terms of getting a Kubernetes cluster, an easy option for testing and development purposes is to install Docker Desktop, which comes with integrated Kubernetes.

Another out-of-the-box option is k3d, which runs a k3s cluster.

Note: The CVMFS-CSI driver used for reference data unfortunately does not work on a Mac at the moment.

Dependency charts

This chart relies on the features of other charts for common functionality:

postgres-operator for the database;
CVMFS-CSI chart for linking the reference data to Galaxy and jobs. While, technically, CVMFS is an optional dependency, production settings will likely want it enabled.

Note: It is not advisable to run multiple instances of the CVMFS-CSI simultaneously on the same cluster. If you wish to deploy multiple instances of Galaxy on the same cluster, please install the CVMFS-CSI chart separately as shown below. One exception to this is installing multiple releases of Galaxy in different namespaces AND running on different nodepools. In that case, it is possible to have each Galaxy release deploy its own CVMFS-CSI (and own NFS provisioner if desired). For that case, please refer to the GalaxyKubeMan Chart.

In a production setting, especially if the intention is to run multiple Galaxies in a single cluster, we recommend installing these charts separately once per cluster, and installing Galaxy with --set postgresql.deploy=false --set cvmfs.deploy=false --set cvmfs.enabled=true.
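
The same production flags can also be kept in a values file instead of being repeated on the command line. The following is a minimal sketch; the file name shared-services-values.yaml is hypothetical, and the keys are taken directly from the --set flags above:

```yaml
# shared-services-values.yaml (hypothetical file name)
# Equivalent to:
#   --set postgresql.deploy=false --set cvmfs.deploy=false --set cvmfs.enabled=true
postgresql:
  deploy: false   # the Postgres operator is installed once per cluster, not by this release
cvmfs:
  deploy: false   # the CVMFS-CSI chart is installed separately
  enabled: true   # Galaxy still uses CVMFS for reference data
```

Such a file could then be passed with helm install my-galaxy-release cloudve/galaxy -f shared-services-values.yaml.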

TL;DR

Default simple installation (with only a few basic Galaxy tools)

Launching from the source:

git clone https://github.com/galaxyproject/galaxy-helm.git
cd galaxy-helm/galaxy
helm dependency update
helm install my-galaxy-release .

Launching from the repository of packaged charts:

helm repo add cloudve https://raw.githubusercontent.com/CloudVE/helm-charts/master/
helm repo update
helm install my-galaxy-release cloudve/galaxy

Example installation for a single Galaxy instance with CVMFS

helm repo add cloudve https://raw.githubusercontent.com/CloudVE/helm-charts/master/
helm repo update
helm install my-galaxy-release cloudve/galaxy --set cvmfs.enabled=true --set cvmfs.deploy=true

Example installation for multiple Galaxy instances on the same cluster

helm repo add cloudve https://raw.githubusercontent.com/CloudVE/helm-charts/master/
helm repo update
helm install cvmfs cloudve/galaxy-cvmfs-csi --namespace cvmfs --create-namespace
helm install my-galaxy-release-1 cloudve/galaxy --set cvmfs.enabled=true --set cvmfs.deploy=false --set ingress.path="/galaxy1/"
helm install my-galaxy-release-2 cloudve/galaxy --set cvmfs.enabled=true --set ingress.path="/galaxy2/"

Note: cvmfs.deploy defaults to false. The explicit mention in the first release is purely visual to highlight the difference.
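
When several releases share a cluster, it may be easier to keep one small values file per release. A sketch for the first release, assuming a hypothetical file name galaxy1-values.yaml and using only the keys from the commands above:

```yaml
# galaxy1-values.yaml (hypothetical file name), used for my-galaxy-release-1
cvmfs:
  enabled: true    # use the CVMFS-CSI release installed in the cvmfs namespace
  deploy: false    # do not deploy a second CVMFS-CSI alongside it
ingress:
  path: /galaxy1/  # each release needs its own path
```

The file for the second release would differ only in ingress.path (for example /galaxy2/), and each would be passed to helm install with -f.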

Installing the chart

Clone this repository and install the required dependency charts.

git clone https://github.com/galaxyproject/galaxy-helm.git
cd galaxy-helm/galaxy
helm dependency update

To install the chart with the release name my-galaxy (note the trailing dot):

helm install my-galaxy .

In about a minute, Galaxy will be available at the root URL of your Kubernetes cluster.

Uninstalling the chart

To uninstall/delete the galaxy deployment, run:

helm delete my-galaxy

Configuration

The following table lists the configurable parameters of the Galaxy chart. The current default values can be found in the values.yaml file.

Parameters	Description
`nameOverride`	Override the name of the chart used to prefix resource names. Defaults to `{{.Chart.Name}}` (i.e. `galaxy`)
`fullnameOverride`	Override the full name used to prefix resource names. Defaults to `{{.Release.Name}}-{{.Values.nameOverride}}`
`image.pullPolicy`	Galaxy image pull policy. See the Kubernetes documentation for more info
`image.repository`	The repository and name of the Docker image for Galaxy, searches Docker Hub by default
`image.tag`	Galaxy Docker image tag (generally corresponds to the desired Galaxy version)
`imagePullSecrets`	Secrets used to access a Galaxy image from a private repository
`persistence.enabled`	Enable persistence using PVC
`persistence.size`	PVC storage request for the Galaxy volume, in GB
`persistence.accessMode`	PVC access mode for the Galaxy volume
`persistence.annotations.{}`	Dictionary of annotations to add to the persistent volume claim's metadata
`persistence.existingClaim`	Use existing Persistent Volume Claim instead of creating one
`persistence.storageClass`	Storage class to use for provisioning the Persistent Volume Claim
`persistence.name`	Name of the PVC
`persistence.mountPath`	Path where to mount the Galaxy volume
`useSecretConfigs`	Enable Kubernetes Secrets for all config maps
`configs.{}`	Galaxy configuration files and values for each of the files. The provided value represents the entire content of the given configuration file
`jobs.priorityClass.enabled`	Assign a priorityClass to the dispatched jobs.
`jobs.rules`	Galaxy dynamic job rules. See `values.yaml`
`jobs.priorityClass.existingClass`	Use an existing priorityClass to assign if `jobs.priorityClass.enabled=true`
`cvmfs.deploy`	Deploy the Galaxy-CVMFS-CSI Helm Chart. This is an optional dependency, and for production scenarios it should be deployed separately as a cluster-wide resource
`cvmfs.enabled`	Enable use of CVMFS in configs, and deployment of CVMFS Persistent Volume Claims for Galaxy
`cvmfs.galaxyPersistentVolumeClaims.{}`	Persistent Volume Claims to deploy for CVMFS repositories. See `values.yaml` for examples.
`setupJob.ttlSecondsAfterFinished`	Sets `ttlSecondsAfterFinished` for the initialization jobs. See the Kubernetes documentation for more details.
`setupJob.downloadToolConfs.enabled`	Download configuration files and the `tools` directory from an archive via a job at startup
`setupJob.downloadToolConfs.archives.startup`	A URL to a publicly accessible `tar.gz` archive containing AT LEAST conf files and XML tool wrappers. Meant to be enough for Galaxy handlers to start up.
`setupJob.downloadToolConfs.archives.running`	A URL to a `tar.gz` publicly accessible archive containing AT LEAST confs, tool wrappers, and tool scripts but excluding test data. Meant to be enough for Galaxy handlers to run jobs.
`setupJob.downloadToolConfs.archives.full`	A URL to a `tar.gz` publicly accessible archive containing the full `tools` directory, including each tool's test data. Meant to be enough to run automated tool-tests, fully mimicking CVMFS repositories
`setupJob.downloadToolConfs.volume.mountPath`	Path at which to mount the unarchived confs in each handler (should match the path set in the tool confs)
`setupJob.downloadToolConfs.volume.subPath`	Name of subdirectory on Galaxy's shared filesystem to use for the unarchived configs
`setupJob.createDatabase`	Deploy a job to create a Galaxy database from scratch (does not affect subsequent upgrades, only first startup)
`ingress.path`	Path where Galaxy application will be hosted
`ingress.annotations.{}`	Dictionary of annotations to add to the ingress's metadata at the deployment level
`ingress.hosts`	Hosts for the Galaxy ingress
`ingress.canary.enabled`	This will create an additional ingress for detecting activity on Galaxy. Useful for autoscaling on activity.
`ingress.enabled`	Enable Kubernetes ingress
`ingress.tls`	Ingress configuration with HTTPS support
`service.nodePort`	If `service.type` is set to `NodePort`, then this can be used to set the port at which Galaxy will be available on all nodes' IP addresses
`service.port`	Kubernetes service port
`service.type`	Kubernetes Service type
`serviceAccount.annotations.{}`	Dictionary of annotations to add to the service account's metadata
`serviceAccount.create`	The serviceAccount will be created if it does not exist.
`serviceAccount.name`	The name of the serviceAccount to use.
`rbac.enabled`	Enable Galaxy job RBAC. This will grant the service account the necessary permissions/roles to view jobs and pods in this namespace. Defaults to true.
`webHandlers.{}`	Configuration for the web handlers (See table below for all options)
`jobHandlers.{}`	Configuration for the job handlers (See table below for all options)
`workflowHandlers.{}`	Configuration for the workflow handlers (See table below for all options)
`resources.limits.memory`	The maximum memory that can be allocated.
`resources.requests.memory`	The requested amount of memory.
`resources.limits.cpu`	The maximum CPU that can be allocated.
`resources.limits.ephemeral-storage`	The maximum ephemeral storage that can be allocated.
`resources.requests.cpu`	The requested amount of CPU (as time or number of cores)
`resources.requests.ephemeral-storage`	The requested amount of ephemeral storage
`securityContext.fsGroup`	The group for any files created.
`tolerations`	Define the `taints` that are tolerated.
`extraFileMappings.{}`	Add extra files mapped as configMaps or Secrets at arbitrary paths. See `values.yaml` for examples.
`extraInitCommands`	Extra commands that will be run during initialization.
`extraInitContainers.[]`	A list of extra init containers for the handler pods
`extraVolumeMounts.[]`	List of volumeMounts to add to all handlers
`extraVolumes.[]`	List of volumes to add to all handlers
`postgresql.enabled`	Enable the postgresql condition in the requirements.yml.
`influxdb.username`	Influxdb user name.
`influxdb.url`	The connection URL to the `influxdb`
`influxdb.enabled`	Enable the `influxdb` used by the metrics scraper.
`influxdb.password`	Password for the influxdb user.
`metrics.podAnnotations.{}`	Dictionary of annotations to add to the metrics deployment's metadata at the pod level
`metrics.image.repository`	The location of the galaxy-metrics-scraping image to use.
`metrics.image.pullPolicy`	Define the pull policy, that is, when Kubernetes will pull the image.
`metrics.podSpecExtra.{}`	Dictionary to add to the metrics deployment's pod template under `spec`
`metrics.image.tag`	The image version to use.
`metrics.annotations.{}`	Dictionary of annotations to add to the metrics deployment's metadata at the deployment level
`metrics.enabled`	Enable metrics gathering. The influxdb setting must be specified when using this setting.
`nginx.conf.client_max_body_size`	Requests larger than this size will result in a `413 Payload Too Large`.
`nginx.image.tag`	The Nginx version to pull.
`nginx.image.repository`	Where to obtain the Nginx container.
`nginx.image.pullPolicy`	When Kubernetes will pull the Nginx image from the repository.
`nginx.galaxyStaticDir`	Location at which to copy Galaxy static content in the NGINX pod init container, for direct serving. Defaults to `/galaxy/server/static`
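
As an illustration of how the dotted parameter names in the table map onto a values file, the fragment below sets a few of them. It is only a sketch: every value shown (port number, CPU and memory amounts) is an illustrative assumption, not a chart default.

```yaml
# Illustrative values only; none of these numbers are chart defaults.
service:
  type: NodePort
  nodePort: 30700          # must fall within the cluster's NodePort range
resources:
  requests:
    cpu: 500m              # half a core
    memory: 1Gi
    ephemeral-storage: 1Gi
  limits:
    cpu: "2"
    memory: 8Gi
    ephemeral-storage: 10Gi
rbac:
  enabled: true            # lets the service account view jobs and pods
```

Each nested key corresponds to one dotted parameter in the table, so service.nodePort becomes nodePort under service, and so on.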

Handlers

Galaxy defines three handler types: jobHandlers, webHandlers, and workflowHandlers. All three handler types share common configuration options.

Parameter	Description
`replicaCount`	The number of handlers to be spawned.
`startupDelay`	Delay in seconds for handler startup. Used to offset handlers and avoid race conditions at first startup
`annotations`	Dictionary of annotations to add to this handler's metadata at the deployment level
`podAnnotations`	Dictionary of annotations to add to this handler's metadata at the pod level
`podSpecExtra`	Dictionary to add to this handler's pod template under `spec`
`startupProbe`	Probe used to determine if a pod has started. Other probes wait for the startup probe. See table below for all probe options
`livenessProbe`	Probe used to determine if a pod should be restarted. See table below for all probe options
`readinessProbe`	Probe used to determine if the pod is ready to accept workloads. See table below for all probe options

Probes

Kubernetes uses probes to determine the state of a pod. Pods are not considered to have started up, and hence other probes are not run, until the startup probes have succeeded. Pods that fail the livenessProbe will be restarted, and work will not be dispatched to a pod until its readinessProbe returns successfully. A pod is ready when all of its containers are ready.

Liveness and readiness probes share the same configuration options.

Parameter	Description
`enabled`	Enable/Disable the probe
`initialDelaySeconds`	How long to wait before starting the probe.
`periodSeconds`	How frequently Kubernetes will check the probe.
`failureThreshold`	The number of times Kubernetes will retry a failing probe before giving up.
`timeoutSeconds`	How long Kubernetes will wait before a probe times out.

Examples

jobHandlers:
  replicaCount: 2
  livenessProbe:
    enabled: false
  readinessProbe:
    enabled: true
    initialDelaySeconds: 300
    periodSeconds: 30
    timeoutSeconds: 5
    failureThreshold: 3
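
Since the handlers table points startupProbe at the same probe options, a slow-starting handler could be given more time to come up along the lines of the sketch below. All numbers are illustrative assumptions rather than chart defaults.

```yaml
webHandlers:
  startupProbe:
    enabled: true
    initialDelaySeconds: 15   # wait before the first check
    periodSeconds: 10         # check every 10 seconds
    failureThreshold: 30      # tolerate up to 30 failed checks
    timeoutSeconds: 5         # each check may take up to 5 seconds
```

With these values, Kubernetes would allow roughly five minutes for startup (failureThreshold × periodSeconds = 300 seconds, after the initial delay) before restarting the container, and the liveness and readiness probes would only begin once the startup probe has succeeded.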

Extra File Mappings

The extraFileMappings field can be used to inject files to arbitrary paths in the nginx deployment, as well as any of the job, web, or workflow handlers, and the init jobs.

The contents of the file can be specified directly in the values.yaml file with the content attribute.

The tpl flag will determine whether these contents are run through the helm templating engine.

Note: when running with tpl: true, brackets ({{ }}) not meant for Helm should be escaped. One way of escaping is to wrap the text in a quoted template string: {{ "{{ my-non-helm-content }}" }}

extraFileMappings:
  /galaxy/server/static/welcome.html:
    applyToWeb: true
    applyToJob: false
    applyToWorkflow: false
    applyToNginx: true
    applyToSetupJob: false
    tpl: false
    content: |
      <!DOCTYPE html>
      <html>...</html>

NOTE: For security reasons, Helm will not load files from outside the chart, so the path must be a relative path to a location inside the chart directory. This will change when helm#3276 is resolved. In the interim, files can be loaded from external locations by:

Creating a symbolic link in the chart directory to the external file, or
using --set-file to specify the contents of the file. E.g: helm upgrade --install galaxy cloudve/galaxy -n galaxy --set-file extraFileMappings."/galaxy/server/static/welcome\.html".content=/home/user/data/welcome.html
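
Returning to the tpl flag described earlier in this section, the sketch below shows a mapping rendered through the Helm templating engine. It assumes that the chart renders the content with the release object in scope (so that .Release.Name resolves), which is not confirmed here, and it uses a double-quoted template string literal to escape braces that are meant for a client-side template. The user.name placeholder is hypothetical.

```yaml
extraFileMappings:
  /galaxy/server/static/welcome.html:
    applyToWeb: true
    applyToJob: false
    applyToWorkflow: false
    applyToNginx: true
    applyToSetupJob: false
    tpl: true
    # In the content below, the first expression is rendered by Helm, while
    # the quoted one is emitted literally for a client-side template to use.
    content: |
      <!DOCTYPE html>
      <html>
        <body>
          <p>Release: {{ .Release.Name }}</p>
          <p>Hello, {{ "{{ user.name }}" }}</p>
        </body>
      </html>
```

With tpl: false, the same content would be written to the file unchanged, braces included.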

Setting parameters on the command line

Specify each parameter using the --set key=value[,key=value] argument to helm install or helm upgrade. For example,

helm install my-galaxy . --set persistence.size=50Gi

The above command sets the size of the Galaxy persistent volume to 50 GiB.

Setting Galaxy configuration file values requires the key name to be escaped. In this example, we are upgrading an existing deployment.

helm upgrade my-galaxy . --set "configs.galaxy\.yml.brand"="Hello World"

You can also set the galaxy configuration file in its entirety with:

helm install my-galaxy . --set-file "configs.galaxy\.yml"=/path/to/local/galaxy.yml

To unset an existing file and revert to the container's default version:

helm upgrade my-galaxy . --set "configs.job_conf\.xml"=null

Alternatively, any number of YAML files that specifies the values of the parameters can be provided when installing the chart. For example,

helm install my-galaxy . -f values.yaml -f new-values.yaml
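
As a sketch of what new-values.yaml might contain, the fragment below repeats the two --set examples from this section in file form. Note that a key containing a dot, such as galaxy.yml, needs no escaping inside a YAML file.

```yaml
# new-values.yaml
persistence:
  size: 50Gi               # same effect as --set persistence.size=50Gi
configs:
  galaxy.yml:              # no backslash escaping needed in a values file
    brand: "Hello World"   # same effect as --set "configs.galaxy\.yml.brand"="Hello World"
```

When several files are given with -f, values in later files take precedence over values in earlier ones.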

To unset a config file in a values file, use the YAML null type:

configs:
  job_conf.xml: ~

Data persistence

By default, the Galaxy handlers store all user data under the /galaxy/server/database/ path in each container. This path can be changed via the persistence.mountPath variable. Persistent Volume Claims (PVCs) are used to share the data across deployments. It is possible to specify an existing PVC via persistence.existingClaim. Alternatively, a value for persistence.storageClass can be supplied to designate a desired storage class for dynamic provisioning of the necessary PVCs. If neither value is supplied, the default storage class for the K8s cluster will be used.

For multi-node scenarios, we recommend a storage class that supports ReadWriteMany, such as the nfs-provisioner, as the data must be available to all nodes in the cluster.

In single-node scenarios, you must use --set persistence.accessMode="ReadWriteOnce".
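
For a multi-node cluster, the persistence options described above might be combined as in the following sketch. The storage class name nfs is a hypothetical placeholder for whatever ReadWriteMany-capable class the cluster provides, and the size is illustrative.

```yaml
persistence:
  enabled: true
  accessMode: ReadWriteMany            # data must be reachable from every node
  storageClass: nfs                    # hypothetical storage class name
  size: 50Gi                           # illustrative size
  mountPath: /galaxy/server/database   # where the Galaxy volume is mounted
```

On a single node, accessMode would instead be set to ReadWriteOnce, and storageClass could be left out to fall back to the cluster default.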

Note about persistent deployments and restarts

If you wish to make your deployment persistent or restartable (bring the deployment down, keep the state on disk, then bring it up again later), you should create PVCs for Galaxy and Postgres and use the persistence.existingClaim variable to point to them as explained in the previous section. In addition, you must set the postgresql.galaxyDatabasePassword variable; otherwise, it will be autogenerated and will not match when restoring.
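
A minimal sketch of the Galaxy side of such a restartable setup is shown below. The claim name galaxy-data-pvc and the password are hypothetical placeholders; the PVC is assumed to have been created beforehand, and the Postgres claim is configured separately and is not shown.

```yaml
persistence:
  existingClaim: galaxy-data-pvc         # hypothetical name of a pre-created PVC
postgresql:
  galaxyDatabasePassword: "change-me"    # placeholder; keep the same value across restarts
```

Reusing the same file on every install keeps the password stable, so the restored database remains accessible.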

Production settings

Note that this deployment mode does not work on a Mac because of an unresolved issue in the CVMFS-CSI driver.

To install this configuration of the chart, we need to enable CVMFS deployment. Depending on the setup of the cluster you have available, you may also need to supply values for the cluster storage classes or PVCs.

If you wish to install a single Galaxy CVMFS-CSI and Postgres operator release to be used by multiple Galaxy releases, you can do so by installing both separately as shown below:

helm repo add cloudve https://raw.githubusercontent.com/CloudVE/helm-charts/master/
helm repo add zalando https://raw.githubusercontent.com/zalando/postgres-operator/master/charts/postgres-operator/
helm repo update
kubectl create namespace psql
helm install psql-operator --namespace psql zalando/postgres-operator --set persistence.enabled=true
kubectl create namespace cvmfs
helm install galaxy-cvmfs --namespace cvmfs cloudve/galaxy-cvmfs-csi --set repositories.cvmfs-gxy-data="data.galaxyproject.org"
helm install galaxy cloudve/galaxy --set cvmfs.enabled=true --set cvmfs.deploy=false

If you wish to get a quick deployment of a single Galaxy instance with its own CVMFS-CSI, you can do so by enabling the CVMFS deployment as part of this chart:

helm repo add cloudve https://raw.githubusercontent.com/CloudVE/helm-charts/master/
helm repo update
helm install galaxy cloudve/galaxy --set cvmfs.enabled=true --set cvmfs.deploy=true

If you use the latter method, it is highly recommended that you deploy a single Galaxy release per nodepool/namespace, as multiple CVMFS-CSI provisioners and Postgres operators running side by side can interfere with one another.

Making Interactive Tools work on localhost

In general, Interactive Tools should work out of the box as long as you have a wildcard DNS mapping to *.its.<host_name>. To make Interactive Tools work on localhost, you can use dnsmasq or similar to handle wildcard DNS mappings for *.localhost.

For macOS:

  $ brew install dnsmasq
  $ cp /usr/local/opt/dnsmasq/dnsmasq.conf.example /usr/local/etc/dnsmasq.conf
  $ edit /usr/local/etc/dnsmasq.conf and set

    address=/localhost/127.0.0.1

  $ sudo brew services start dnsmasq
  $ sudo mkdir /etc/resolver
  $ sudo touch /etc/resolver/localhost
  $ edit /etc/resolver/localhost and set

    nameserver 127.0.0.1

  $ sudo brew services restart dnsmasq

This should make all *.localhost and *.its.localhost map to 127.0.0.1, and ITs should work with a regular helm install on localhost.

Horizontal scaling

The Galaxy application can be horizontally scaled for the web, job, or workflow handlers by setting the desired values of the webHandlers.replicaCount, jobHandlers.replicaCount, and workflowHandlers.replicaCount configuration options.
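
For example, a values fragment along these lines would scale each handler type independently. The replica counts are illustrative, not recommendations.

```yaml
webHandlers:
  replicaCount: 2
jobHandlers:
  replicaCount: 3
workflowHandlers:
  replicaCount: 1
```

Applying the file with helm upgrade and -f changes the number of pods for each handler type accordingly.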

Galaxy versions

Some changes introduced in the chart sometimes rely on changes in the Galaxy container image, especially in relation to the Kubernetes runner. This table keeps track of recommended Chart versions for particular Galaxy versions as breaking changes are introduced. Otherwise, the Galaxy image and chart should be independently upgrade-able. In other words, upgrading the Galaxy image from 21.05 to 21.09 should be a matter of helm upgrade mygalaxy cloudve/galaxy --reuse-values --set image.tag=21.09.
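
If the image tag is pinned in a values file instead of on the command line, it is safer to quote it, as in this sketch. An unquoted tag such as 21.10 would be read by YAML as the number 21.1.

```yaml
image:
  tag: "21.09"   # quoted so YAML keeps it as a string
```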

Chart version	Galaxy version	Description
`4.0`	`21.05`	Needs Galaxy PR#11899 for eliminating the CVMFS. If running chart 4.0+ with Galaxy image `21.01` or below, use the CVMFS instead with `--set setupJob.downloadToolConfs.enabled=false --set cvmfs.repositories.cvmfs-gxy-cloud=cloud.galaxyproject.org --set cvmfs.galaxyPersistentVolumeClaims.cloud.storage=1Gi --set cvmfs.galaxyPersistentVolumeClaims.cloud.storageClassName=cvmfs-gxy-cloud --set cvmfs.galaxyPersistentVolumeClaims.cloud.mountPath=/cvmfs/cloud.galaxyproject.org`
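
The long list of --set flags in the table above can be hard to read on one line. As a sketch, the same settings written as a values file would look like this, with every key and value copied from those flags:

```yaml
# Values for running chart 4.0+ with a Galaxy image of 21.01 or below
setupJob:
  downloadToolConfs:
    enabled: false
cvmfs:
  repositories:
    cvmfs-gxy-cloud: cloud.galaxyproject.org
  galaxyPersistentVolumeClaims:
    cloud:
      storage: 1Gi
      storageClassName: cvmfs-gxy-cloud
      mountPath: /cvmfs/cloud.galaxyproject.org
```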

Funding

Version 3+: Galaxy Project, Genomics Virtual Laboratory (GVL)
Version 2: Genomics Virtual Laboratory (GVL), Galaxy Project, and European Commission (EC) H2020 Project PhenoMeNal, grant agreement number 654241.
Version 1: European Commission (EC) H2020 Project PhenoMeNal, grant agreement number 654241.
