Elasticsearch

Overview

Elasticsearch is a distributed search and analytics engine built on Apache Lucene. This template deploys a production-ready Elasticsearch 8.17.0 cluster with automatic master election, optional Kibana for visualization and management, and automated snapshot backups to AWS S3 or GCS via Elasticsearch’s built-in Snapshot Lifecycle Management (SLM).

This template does not create a GVC. You must deploy it into an existing GVC.

What Gets Created

Stateful Elasticsearch Workload — A multi-node Elasticsearch cluster. Each replica gets its own persistent volume so index data and shards survive restarts.
Volume Set — One persistent volume per replica for Elasticsearch data.
Standard Kibana Workload (optional, enabled by default) — The Elasticsearch web UI for querying data, managing indices, and monitoring cluster health.
Standard Backup Setup Workload (optional) — A one-time job that waits for the cluster to be healthy, registers the snapshot repository with Elasticsearch, and creates the SLM policy. Runs once and can be removed after initial setup.
Identity & Policy — An identity bound to the workloads with reveal access to the configuration secrets, and cloud storage access when backup is enabled.
Secrets — Configuration secrets for Elasticsearch and Kibana credentials and cluster settings.

Prerequisites

This template has no external prerequisites unless backup is enabled. To install, follow the instructions for your preferred method:

UI

Browse, install, and manage templates visually

CLI

Manage templates from your terminal

Terraform

Declare templates in your Terraform configurations

Pulumi

Declare templates in your Pulumi programs

Configuration

The default values.yaml for this template:

image: docker.elastic.co/elasticsearch/elasticsearch:8.17.0

replicas: 3 # Must be odd for master quorum (3, 5, 7, ...)

clusterName: my-elasticsearch-cluster

# JVM heap size per node — set to ~50% of maxMemory, hard cap at 30g
jvmHeap: 3g

resources:
  minCpu: 1
  minMemory: 2Gi
  maxCpu: 2
  maxMemory: 6Gi

volumeset:
  capacity: 10 # initial capacity in GiB (minimum is 10)
  autoscaling:
    enabled: false
    maxCapacity: 100
    minFreePercentage: 10
    scalingFactor: 1.2

multiZone:
  enabled: false # Set to true to schedule replicas across availability zones

internal_access:
  type: same-gvc # options: same-gvc, same-org, workload-list
  workloads:
    #- //gvc/GVC_NAME/workload/WORKLOAD_NAME

kibana:
  enabled: true
  image: docker.elastic.co/kibana/kibana:8.17.0
  resources:
    cpu: 500m
    memory: 2Gi
  internal_access:
    type: same-gvc # options: same-gvc, same-org, workload-list
    workloads:
      #- //gvc/GVC_NAME/workload/WORKLOAD_NAME

backup:
  enabled: false
  remove_setup_workload: false # Set to true after setup completes to reduce resource usage

  provider: aws # options: aws or gcp

  # Snapshot schedule in Quartz cron format (6 fields: seconds minutes hours day-of-month month day-of-week)
  schedule: "0 0 2 * * ?"

  retention:
    maxAge: 30d   # Delete snapshots older than this
    maxCount: 30  # Keep at most this many snapshots

  aws:
    bucket: my-s3-bucket
    region: us-east-1
    prefix: elasticsearch-snapshots
    cloudAccountName: my-cloud-account
    policyName: my-backup-policy

  gcp:
    bucket: my-gcs-bucket
    prefix: elasticsearch-snapshots
    cloudAccountName: my-cloud-account

Cluster Settings

replicas — Number of Elasticsearch nodes. Must be an odd number (3, 5, 7) to maintain a valid master election quorum.
clusterName — The Elasticsearch cluster name. Used internally by nodes to discover each other.
jvmHeap — JVM heap size per node. Set to approximately 50% of maxMemory. Hard cap at 30g.
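As a rough illustration of the jvmHeap sizing rule, the shell sketch below derives a heap value from maxMemory. The function name heap_for_max_memory is hypothetical, and it assumes maxMemory is given in whole GiB and that rounding down is acceptable.

```shell
# Hypothetical helper: suggest a jvmHeap value for a given maxMemory in GiB.
# Rule from this page: about 50% of maxMemory, never more than 30g.
heap_for_max_memory() {
  max_gib=$1
  heap=$(( max_gib / 2 ))   # integer division rounds down
  if [ "$heap" -gt 30 ]; then
    heap=30                 # apply the 30g cap
  fi
  echo "${heap}g"
}

heap_for_max_memory 6    # default maxMemory of 6Gi gives 3g
heap_for_max_memory 64   # large node, capped at 30g
```

With the default maxMemory of 6Gi this yields 3g, which matches the default jvmHeap in values.yaml.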

A minimum of 3 replicas is required for master quorum. With 3 nodes, a majority is 2, so the cluster can survive the loss of 1 node. Always use an odd number: an even number of nodes does not improve fault tolerance over the next lower odd number (4 nodes still tolerate only 1 failure).
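The majority arithmetic behind this rule can be sketched in a few lines of shell. The function names are illustrative only and are not part of the template.

```shell
# Votes needed for a majority of n master-eligible nodes.
quorum() { echo $(( $1 / 2 + 1 )); }

# Number of nodes that can fail while a majority still remains.
tolerated_failures() { echo $(( ($1 - 1) / 2 )); }

for n in 3 4 5; do
  echo "nodes=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

Going from 3 to 4 nodes raises the quorum from 2 to 3 but leaves the number of tolerated failures at 1, which is why the next useful step up is 5.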

Resources and Storage

resources.minCpu / resources.minMemory — Guaranteed minimum CPU and memory per node.
resources.maxCpu / resources.maxMemory — Maximum CPU and memory per node.
volumeset.capacity — Initial volume size in GiB per node (minimum 10).
volumeset.autoscaling.enabled — Automatically expand volumes as data grows.
volumeset.autoscaling.maxCapacity — Maximum volume size in GiB.
volumeset.autoscaling.minFreePercentage — Triggers a scale-up when free space falls below this percentage.
volumeset.autoscaling.scalingFactor — Multiplier applied to current capacity on each scale-up.
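To get a feel for how the autoscaling settings interact, the sketch below models a single scale-up step. It is an approximation under stated assumptions: the function names are hypothetical, and the platform's exact rounding and timing are not described on this page.

```shell
# Hypothetical model of one scale-up step.
# $1 = current capacity in GiB, $2 = scalingFactor, $3 = maxCapacity
next_capacity() {
  awk -v c="$1" -v f="$2" -v m="$3" \
    'BEGIN { n = c * f; if (n > m) n = m; printf "%.1f\n", n }'
}

# Free space in GiB below which a scale-up would be triggered.
# $1 = current capacity in GiB, $2 = minFreePercentage
free_threshold() {
  awk -v c="$1" -v p="$2" 'BEGIN { printf "%.1f\n", c * p / 100 }'
}

next_capacity 10 1.2 100   # one step from the default capacity
free_threshold 10 10       # trigger point for the default settings
```

With the defaults, a 10 GiB volume would grow to about 12 GiB once less than 1 GiB is free, and growth stops at maxCapacity.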

Multi-Zone

When multiZone.enabled: true, Control Plane spreads Elasticsearch replicas across availability zones within the location. This improves availability: if a zone goes down, the remaining nodes in other zones can maintain quorum as long as a majority of nodes survive. Verify your selected location supports multiple availability zones before enabling.

Internal Access

Controls which workloads can reach Elasticsearch and Kibana. Both have independent internal_access settings.

Type	Description
`same-gvc`	Allow access from all workloads in the same GVC (recommended)
`same-org`	Allow access from all workloads in the org
`workload-list`	Allow access only from specific workloads listed in `workloads`
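For the workload-list option, a values override might look like the sketch below. The file name values-override.yaml is arbitrary, and GVC_NAME and WORKLOAD_NAME are placeholders to replace with your own names.

```shell
# Write a values override that limits Elasticsearch access to one workload.
# The file name is arbitrary; GVC_NAME and WORKLOAD_NAME are placeholders.
cat > values-override.yaml <<'EOF'
internal_access:
  type: workload-list
  workloads:
    - //gvc/GVC_NAME/workload/WORKLOAD_NAME
EOF

cat values-override.yaml
```

The same structure applies under kibana.internal_access when Kibana needs a different access list.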

Kibana

Kibana is enabled by default and provides a web UI for exploring data, managing indices, building dashboards, and monitoring cluster health.

kibana.enabled — Deploy the Kibana workload.
kibana.resources.cpu / kibana.resources.memory — CPU and memory for the Kibana container.
kibana.internal_access — Controls which workloads can reach Kibana (same options as internal_access).

Connecting

Both Elasticsearch and Kibana are accessible internally from within the same GVC. External access is blocked by default — use cpln port-forward to reach them from your local machine.

Service	Internal hostname	Port
Elasticsearch	`{release-name}-elasticsearch.{gvc-name}.cpln.local`	`9200`
Kibana	`{release-name}-kibana.{gvc-name}.cpln.local`	`5601`

Port-forward to Kibana:

cpln workload port-forward --gvc GVC_NAME RELEASE_NAME-kibana --port 5601
# Then open http://localhost:5601 in your browser

Port-forward to Elasticsearch:

cpln workload port-forward --gvc GVC_NAME RELEASE_NAME-elasticsearch --port 9200
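For workloads inside the GVC, the hostname pattern from the table above can be assembled as in this sketch. The function name internal_host and the example release and GVC names are hypothetical, and the http scheme is an assumption based on the localhost examples on this page.

```shell
# Build an internal hostname following the pattern in the table above.
# $1 = release name, $2 = GVC name, $3 = service (elasticsearch or kibana)
internal_host() {
  echo "$1-$3.$2.cpln.local"
}

es_url="http://$(internal_host my-release my-gvc elasticsearch):9200"
echo "$es_url"

# From another workload in the same GVC (not run here):
#   curl "$es_url/_cluster/health?pretty"
```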

Scaling

Scaling up — Increase replicas to the next odd number and run cpln helm upgrade. New nodes join the cluster automatically and Elasticsearch begins rebalancing shards.

Scaling down is destructive. Elasticsearch does not automatically move shards off nodes that are about to be removed. Always take a manual snapshot before scaling down. Before removing nodes, check shard allocation with GET /_cat/shards?v to confirm that no primary shards are stranded on the nodes being removed.
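One way to move shards off a node before removing it is Elasticsearch's allocation filtering setting cluster.routing.allocation.exclude._name. The sketch below is not part of the template: NODE_NAME is a placeholder (look up real node names with GET /_cat/nodes?v), and the helper names are hypothetical.

```shell
# Build the cluster-settings body that asks Elasticsearch to move shards off a node.
# $1 = node name as reported by GET /_cat/nodes?v
exclude_node_body() {
  printf '{"persistent":{"cluster.routing.allocation.exclude._name":"%s"}}\n' "$1"
}

# Apply the exclusion from inside an Elasticsearch container.
# Defined only; it is not called in this sketch.
drain_node() {
  curl -X PUT 'http://localhost:9200/_cluster/settings' \
    -H 'Content-Type: application/json' \
    -d "$(exclude_node_body "$1")"
}

exclude_node_body NODE_NAME
```

Once GET /_cat/shards?v shows no shards left on the excluded node, reduce replicas. Afterwards the exclusion can be cleared by setting the same key to null.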

Backup

Elasticsearch backups are incremental snapshots stored directly in S3 or GCS via the built-in repository-s3 and repository-gcs plugins, so no separate backup image is required. The Snapshot Lifecycle Management (SLM) feature handles scheduling and retention automatically.

When backup is enabled, a one-time Backup Setup Workload runs at install time. It waits for the cluster to be healthy, then calls the Elasticsearch API to register the snapshot repository and create the SLM policy. Once it completes successfully, it can be removed to save resources:

backup:
  enabled: true
  remove_setup_workload: true

The backup schedule uses Quartz cron format with 6 fields (seconds, minutes, hours, day-of-month, month, day-of-week), not the standard 5-field cron format. For example, "0 0 2 * * ?" runs daily at 02:00 UTC.
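To show how the schedule and retention values could map onto an SLM policy, here is a sketch that checks the field count and prints a policy body. The exact body the Backup Setup Workload sends is not shown on this page, so the snapshot name pattern and the indices setting are assumptions. Only the repository name backup-repo and the SLM field names come from this page and the Elasticsearch API.

```shell
# Count whitespace-separated fields in a cron expression.
cron_fields() {
  printf '%s\n' "$1" | awk '{ print NF }'
}

# Print an SLM policy body; refuse schedules that are not 6-field Quartz cron.
# $1 = schedule, $2 = retention maxAge, $3 = retention maxCount
slm_policy_body() {
  if [ "$(cron_fields "$1")" -ne 6 ]; then
    echo "schedule must have 6 fields: $1" >&2
    return 1
  fi
  cat <<EOF
{
  "schedule": "$1",
  "name": "<snapshot-{now/d}>",
  "repository": "backup-repo",
  "config": { "indices": ["*"] },
  "retention": { "expire_after": "$2", "max_count": $3 }
}
EOF
}

slm_policy_body "0 0 2 * * ?" 30d 30
```

A body like this would be sent with PUT to /_slm/policy/automated-snapshots. A standard 5-field expression such as "0 2 * * *" is rejected by the check before any request is built.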

AWS S3 Prerequisites

1. Create an S3 bucket. Set backup.aws.bucket to its name and backup.aws.region to its region.
2. If you do not have a Cloud Account set up, refer to the docs to Create a Cloud Account. Set backup.aws.cloudAccountName to its name.
3. Create an IAM policy with the following JSON, replacing YOUR_BUCKET_NAME:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:DeleteObject",
                "s3:ListBucket",
                "s3:GetObjectVersion",
                "s3:DeleteObjectVersion"
            ],
            "Resource": [
                "arn:aws:s3:::YOUR_BUCKET_NAME",
                "arn:aws:s3:::YOUR_BUCKET_NAME/*"
            ]
        }
    ]
}

4. Set backup.aws.policyName to the name of the policy created in step 3.

GCS Prerequisites

1. Create a GCS bucket. Set backup.gcp.bucket to its name.
2. If you do not have a Cloud Account set up, refer to the docs to Create a Cloud Account. Set backup.gcp.cloudAccountName to its name.
3. Add the Storage Admin role to the GCP service account associated with the Cloud Account.

Manual Snapshots

Exec into any Elasticsearch container to trigger a snapshot immediately or inspect status:

# Trigger a snapshot now via SLM
curl -X PUT 'http://localhost:9200/_slm/policy/automated-snapshots/_execute'

# List all snapshots in the repository
curl 'http://localhost:9200/_snapshot/backup-repo/_all?pretty'

# Check a snapshot currently in progress
curl 'http://localhost:9200/_snapshot/backup-repo/_current?pretty'

# View the SLM policy and last execution result
curl 'http://localhost:9200/_slm/policy/automated-snapshots?pretty'
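A one-off snapshot can also be created directly through the snapshot API instead of through the SLM policy, for example before a risky change. The sketch below assumes the backup-repo repository from this page. The helper names and the naming scheme are hypothetical.

```shell
# Generate a lowercase, timestamped snapshot name.
# Elasticsearch requires snapshot names to be lowercase.
manual_snapshot_name() {
  echo "manual-$(date -u +%Y.%m.%d-%H%M%S)"
}

# Create a snapshot outside SLM and wait for it to finish.
# Defined only; it is not called in this sketch.
take_manual_snapshot() {
  curl -X PUT "http://localhost:9200/_snapshot/backup-repo/$(manual_snapshot_name)?wait_for_completion=true"
}

manual_snapshot_name
```

Snapshots created this way are not managed by the SLM policy, so plan to delete them yourself when they are no longer needed.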

Restoring a Snapshot

Exec into any Elasticsearch node in the cluster to run restore commands. The snapshot repository (backup-repo) is already registered. List available snapshots:

curl 'http://localhost:9200/_snapshot/backup-repo/_all?pretty'

Scenario 1 — Disaster Recovery (Fresh Cluster)

Deploy a new cluster from this template with backup enabled and pointing at the same bucket. Once the Backup Setup Workload completes (re-registering the same repository), restore all indices:

curl -X POST 'http://localhost:9200/_snapshot/backup-repo/SNAPSHOT_NAME/_restore' \
  -H 'Content-Type: application/json' \
  -d '{
    "indices": ["*", "-.internal.*", "-.slo-observability.*.temp", "-.ds-ilm-history*"],
    "ignore_unavailable": true,
    "include_global_state": false
  }'

Scenario 2 — Restore to Existing Cluster

When the target indices already exist, close them before restoring or the restore will be rejected:

# Close the index first
curl -X POST 'http://localhost:9200/MY_INDEX/_close'

# Restore from snapshot
curl -X POST 'http://localhost:9200/_snapshot/backup-repo/SNAPSHOT_NAME/_restore' \
  -H 'Content-Type: application/json' \
  -d '{
    "indices": "MY_INDEX",
    "ignore_unavailable": true,
    "include_global_state": false
  }'
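As an alternative to closing the existing index, the restore API's rename_pattern and rename_replacement parameters can restore a copy under a different name. This is a sketch under assumptions: the helper name and the restored- prefix are illustrative and not part of the template.

```shell
# Build a restore body that restores an index under a new name instead of
# replacing the existing one.
# $1 = index name in the snapshot, $2 = prefix for the restored copy
restore_renamed_body() {
  cat <<EOF
{
  "indices": "$1",
  "include_global_state": false,
  "rename_pattern": "(.+)",
  "rename_replacement": "$2\$1"
}
EOF
}

restore_renamed_body MY_INDEX restored-
```

The printed body can be passed with -d to the same _restore endpoint used above. The existing MY_INDEX stays open, and the restored data lands in restored-MY_INDEX.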

Scenario 3 — Restore Specific Indices

curl -X POST 'http://localhost:9200/_snapshot/backup-repo/SNAPSHOT_NAME/_restore' \
  -H 'Content-Type: application/json' \
  -d '{
    "indices": "my-index,my-other-index-2026.05*",
    "ignore_unavailable": true,
    "include_global_state": false
  }'

Monitor restore progress:

# View active recovery operations
curl 'http://localhost:9200/_cat/recovery?v&active_only=true'

# Check overall cluster health
curl 'http://localhost:9200/_cluster/health?pretty'
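To wait for a restore to finish rather than checking by hand, the cluster health API accepts wait_for_status and timeout parameters. The sketch below shows one possible polling loop. The helper names are hypothetical, and the loop assumes it runs inside an Elasticsearch container as in the commands above.

```shell
# Build a health URL that blocks until the cluster reaches a status
# or the timeout expires.
# $1 = status (green or yellow), $2 = timeout such as 60s
health_wait_url() {
  echo "http://localhost:9200/_cluster/health?wait_for_status=$1&timeout=$2"
}

# Poll until the cluster is green. Defined only; it is not called in this sketch.
# A response containing "timed_out":false means the status was reached.
wait_for_green() {
  until curl -s "$(health_wait_url green 60s)" | grep -q '"timed_out":false'; do
    echo "cluster not green yet, retrying"
    sleep 5
  done
}

health_wait_url green 60s
```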

Important Notes

Replica count must be odd — Use an odd number of master-eligible nodes for quorum. Even numbers do not improve fault tolerance over the next lower odd number.
JVM heap — Set jvmHeap to approximately 50% of maxMemory, with a hard cap of 30g. Elasticsearch relies heavily on off-heap memory for the filesystem cache.
Backup schedule format — SLM uses Quartz cron format (6 fields), not the standard 5-field cron. The ? wildcard is required in the day-of-week field when day-of-month is set.
Scale down carefully — Always snapshot before reducing replicas. Verify no primary shards exist on nodes being removed before running the upgrade.

External References

Elasticsearch Documentation

Official Elasticsearch reference documentation

Kibana Documentation

Official Kibana guide

Snapshot and Restore

Elasticsearch snapshot and restore guide

Snapshot Lifecycle Management

Create, monitor, and delete snapshots with SLM

Elasticsearch Template

View the source files, default values, and chart definition
