Cassandra

Overview

Apache Cassandra is a distributed NoSQL database designed for high availability and linear scalability. This template deploys a Cassandra 5.0 cluster in a single location where each node owns a slice of the token ring and replicates data to peers according to the configured replication factor. Optional scheduled backups and periodic anti-entropy repair are included.

This template does not create a GVC. You must deploy it into an existing GVC.

What Gets Created

Stateful Cassandra Workload — A multi-node Cassandra cluster. Each replica gets its own persistent volume so SSTable data survives restarts.
Volume Set — One persistent volume per replica for Cassandra data.
Identity & Policy — An identity bound to the workload with reveal access to the credential secrets, and cloud storage access when backup is enabled.
Secrets — An opaque secret for the superuser password and a dictionary secret for the application user credentials.
Cron Backup Workload (optional) — When backup.type is logical, a standalone cron workload exports keyspace data as CSVs and uploads them to cloud storage.
Sidecar Backup Container (optional) — When backup.type is physical, a sidecar runs on each Cassandra replica, takes SSTable snapshots with nodetool snapshot, and syncs them to cloud storage.
Repair Cron Workload (optional, enabled by default) — Runs nodetool repair on a schedule to keep data consistent across replicas.

Prerequisites

This template has no external prerequisites unless backup is enabled. To install, follow the instructions for your preferred method:

UI

Browse, install, and manage templates visually

CLI

Manage templates from your terminal

Terraform

Declare templates in your Terraform configurations

Pulumi

Declare templates in your Pulumi programs

Configuration

The default values.yaml for this template:

replicas: 3
# replicationFactor must not exceed replicas
replicationFactor: 1

# IMPORTANT: Change all credentials before deploying to production
superuserPassword: supersecretpassword
username: username
password: password
keyspaceName: mydatabase

image: cassandra:5.0
cpu: 1
memory: 4Gi
# JVM heap: leave ~50% of container memory for off-heap (bloom filters, page cache, etc.)
# Cassandra 5.x uses G1GC — only MAX_HEAP_SIZE is valid; HEAP_NEWSIZE is ignored.
jvmHeapSize: 2G
clusterName: my-cassandra

volumes:
  data:
    initialCapacity: 10
    autoscaling:
      maxCapacity: 100
      minFreePercentage: 20
      scalingFactor: 1.5

multiZone:
  enabled: false

internal_access:
  type: same-gvc  # Options: same-gvc, same-org, workload-list
  workloads:
    #- //gvc/GVC_NAME/workload/WORKLOAD_NAME

backup:
  enabled: false
  type: logical   # options: logical, physical
  image: ghcr.io/controlplane-com/backup-images/cassandra-backup:5.0
  schedule: "0 2 * * *"   # daily at 2am UTC

  resources:
    cpu: 250m
    memory: 256Mi

  provider: aws   # options: aws, gcp

  aws:
    bucket: my-backup-bucket
    region: us-east-1
    cloudAccountName: my-backup-cloudaccount
    policyName: my-s3-policy
    prefix: cassandra/backups

  gcp:
    bucket: my-backup-bucket
    cloudAccountName: my-cloud-account
    prefix: cassandra/backups

repair:
  enabled: true
  # Cron schedule for full cluster repair (must run within gc_grace_seconds = 10 days)
  schedule: "0 2 * * 0"

Replicas and Replication Factor

These are two separate settings that work together:

replicas sets how many Cassandra nodes are deployed. More nodes mean more capacity and higher throughput, because the token ring is split across more nodes.
replicationFactor sets how many copies of each partition are stored across the cluster. A replication factor of 3 means every row exists on 3 different nodes, so the data itself survives the loss of up to 2 of them. QUORUM reads and writes need 2 of the 3 replicas to respond, so they keep working with 1 node down but not with 2.

replicationFactor must not exceed replicas — you cannot store 3 copies of data across only 2 nodes.

For production, use at least 3 replicas with a replication factor of 3. This allows the cluster to survive a node failure while still achieving quorum.
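The arithmetic behind this recommendation can be sketched in a few lines of Python. The function names are illustrative and not part of the template. The quorum size used here is the standard Cassandra formula, floor(RF / 2) + 1.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a QUORUM read or write."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Replicas of a partition that can be down while QUORUM still succeeds."""
    return replication_factor - quorum(replication_factor)


def check_settings(replicas: int, replication_factor: int) -> None:
    """Reject values that violate replicationFactor <= replicas."""
    if replication_factor < 1:
        raise ValueError("replicationFactor must be at least 1")
    if replication_factor > replicas:
        raise ValueError("replicationFactor must not exceed replicas")


check_settings(replicas=3, replication_factor=3)
print(quorum(3), tolerated_failures(3))  # 2 1
print(quorum(1), tolerated_failures(1))  # 1 0
```

With the default replicationFactor of 1, the second line shows that no replica can be down without making its partitions unavailable, which is why a value of 3 is suggested for production.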

Resources and Storage

cpu / memory — CPU and memory allocated to each Cassandra node.
jvmHeapSize — Set to approximately 50% of memory. Cassandra relies heavily on off-heap memory for bloom filters, row cache, and OS page cache.
volumes.data.initialCapacity — Initial volume size in GiB per node (minimum 10).
volumes.data.autoscaling.maxCapacity — Maximum volume size in GiB.
volumes.data.autoscaling.minFreePercentage — Triggers a scale-up when free space falls below this percentage.
volumes.data.autoscaling.scalingFactor — Multiplier applied to current capacity on each scale-up.
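To get a feel for how the autoscaling values interact, the following sketch simulates successive scale-ups with the defaults from values.yaml. It is an approximation under stated assumptions: the helper names are hypothetical, and the platform may round capacities or apply other limits that are not modeled here.

```python
def scale_up_steps(initial_gib: float, max_gib: float, factor: float) -> list[float]:
    """Capacities reached by repeated scale-ups, capped at max_gib."""
    if factor <= 1:
        raise ValueError("scalingFactor must be greater than 1")
    steps = [initial_gib]
    while steps[-1] < max_gib:
        # Each scale-up multiplies the current capacity, never exceeding the cap.
        steps.append(min(steps[-1] * factor, max_gib))
    return steps


def needs_scale_up(used_gib: float, capacity_gib: float, min_free_pct: float) -> bool:
    """True when free space has dropped below minFreePercentage."""
    free_pct = 100 * (capacity_gib - used_gib) / capacity_gib
    return free_pct < min_free_pct


print(scale_up_steps(10, 100, 1.5))
print(needs_scale_up(used_gib=8.5, capacity_gib=10, min_free_pct=20))  # True
```

Under these assumptions, the defaults allow six scale-ups between the initial 10 GiB and the 100 GiB cap.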

Multi-Zone

When multiZone.enabled: true, Control Plane spreads replicas across availability zones within the location. With a replication factor of 3 across 3 zones, each zone holds one copy of every partition — the cluster survives a complete zone outage with no data loss when using LOCAL_QUORUM consistency. Verify your selected location supports multi-zone before enabling this option.

Internal Access

Controls which workloads can reach the Cassandra cluster:

Type	Description
`same-gvc`	Allow access from all workloads in the same GVC (recommended)
`same-org`	Allow access from all workloads in the org
`workload-list`	Allow access only from specific workloads listed in `workloads`

Connecting

Each Cassandra replica is individually addressable. Provide multiple node hostnames as contact points in your application so it can discover the full cluster topology:

Host:     {release-name}-cassandra-0.{gvc-name}.cpln.local
          {release-name}-cassandra-1.{gvc-name}.cpln.local
          {release-name}-cassandra-2.{gvc-name}.cpln.local
Port:     9042
Username: {username}
Password: {password}
Keyspace: {keyspaceName}
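The hostname pattern above can be generated rather than typed by hand. This is a minimal sketch; the function name and the placeholder values "my-release" and "my-gvc" are assumptions for illustration only.

```python
def contact_points(release_name: str, gvc_name: str, replicas: int) -> list[str]:
    """Build per-replica hostnames following the pattern shown above."""
    return [
        f"{release_name}-cassandra-{i}.{gvc_name}.cpln.local"
        for i in range(replicas)
    ]


CQL_PORT = 9042

# "my-release" and "my-gvc" are placeholder names for illustration only.
for host in contact_points("my-release", "my-gvc", 3):
    print(f"{host}:{CQL_PORT}")
```

A list like this would typically be handed to your client driver as its contact points, together with the username, password, and keyspace listed above.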

Repair

Cassandra uses eventual consistency — when nodes miss writes during downtime, data can drift out of sync. nodetool repair runs an anti-entropy process that compares and reconciles data across all replicas. Repair must complete across all nodes at least once within gc_grace_seconds (default: 10 days) to prevent deleted data from reappearing after a node recovers.

repair.enabled — Enable the scheduled repair job (recommended: true).
repair.schedule — Cron expression for repair frequency. The default weekly schedule satisfies the 10-day gc_grace_seconds requirement with margin.

Do not disable repair in production or increase the interval beyond 10 days. Repair can be resource-intensive on large datasets — consider running it during low-traffic windows.
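The margin between a repair interval and gc_grace_seconds is simple to check. The sketch below assumes the default gc_grace_seconds of 864000 seconds (10 days) and takes the interval in days rather than parsing a cron expression. The function name is hypothetical.

```python
DEFAULT_GC_GRACE_SECONDS = 10 * 24 * 60 * 60  # 864000 seconds, i.e. 10 days


def repair_margin_days(interval_days: float,
                       gc_grace_seconds: int = DEFAULT_GC_GRACE_SECONDS) -> float:
    """Days of slack between the repair interval and gc_grace_seconds.

    A negative result means repair runs too rarely.
    """
    return gc_grace_seconds / 86400 - interval_days


print(repair_margin_days(7))   # 3.0
print(repair_margin_days(14))  # -4.0
```

The slack also has to cover the time a repair takes to finish, so on large datasets the effective margin is smaller than the number printed here.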

Backup

Two backup modes are available:

Mode	How it works	Best for
`logical`	Exports tables as CSVs using `cqlsh COPY TO`, uploads to cloud storage. Runs as a standalone cron workload.	Smaller datasets, portability
`physical`	Creates SSTable snapshots with `nodetool snapshot`, syncs to cloud storage. Runs as a sidecar on each Cassandra replica.	Large datasets, faster backup/restore

Set backup.enabled: true, set backup.type, and fill in the cloud storage block for your provider.
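Before installing, it can help to sanity-check the backup block. The following is a rough sketch, not the template's own validation: the list of required fields is an assumption based on the prerequisites described for each provider, and the function name is hypothetical.

```python
# Assumed required fields per provider, taken from the prerequisite steps.
REQUIRED_FIELDS = {
    "aws": ("bucket", "region", "cloudAccountName", "policyName"),
    "gcp": ("bucket", "cloudAccountName"),
}


def backup_config_problems(backup: dict) -> list[str]:
    """Return human-readable problems found in a backup block; empty if none."""
    if not backup.get("enabled"):
        return []  # nothing to check when backup is off
    problems = []
    if backup.get("type") not in ("logical", "physical"):
        problems.append("backup.type must be logical or physical")
    provider = backup.get("provider")
    if provider not in REQUIRED_FIELDS:
        problems.append("backup.provider must be aws or gcp")
        return problems
    block = backup.get(provider) or {}
    for field in REQUIRED_FIELDS[provider]:
        if not block.get(field):
            problems.append(f"backup.{provider}.{field} is not set")
    return problems


example = {
    "enabled": True,
    "type": "physical",
    "provider": "aws",
    "aws": {"bucket": "my-backup-bucket", "region": "us-east-1",
            "cloudAccountName": "my-backup-cloudaccount"},
}
print(backup_config_problems(example))  # ['backup.aws.policyName is not set']
```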

AWS S3 Prerequisites

1. Create an S3 bucket. Set backup.aws.bucket to its name and backup.aws.region to its region.
2. If you do not have a Cloud Account set up, refer to the docs to Create a Cloud Account. Set backup.aws.cloudAccountName to its name.
3. Create an IAM policy with the following JSON, replacing YOUR_BUCKET_NAME:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:DeleteObject",
                "s3:ListBucket",
                "s3:GetObjectVersion",
                "s3:DeleteObjectVersion"
            ],
            "Resource": [
                "arn:aws:s3:::YOUR_BUCKET_NAME",
                "arn:aws:s3:::YOUR_BUCKET_NAME/*"
            ]
        }
    ]
}

Set backup.aws.policyName to the name of the policy created in step 3.
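If you manage several buckets, substituting the bucket name by hand is error-prone. The sketch below generates the same policy document for a given bucket. The function name is hypothetical, and "my-backup-bucket" is the placeholder from values.yaml.

```python
import json

S3_ACTIONS = [
    "s3:GetObject",
    "s3:PutObject",
    "s3:DeleteObject",
    "s3:ListBucket",
    "s3:GetObjectVersion",
    "s3:DeleteObjectVersion",
]


def backup_policy(bucket_name: str) -> dict:
    """Fill the bucket name into the policy document shown above."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": S3_ACTIONS,
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
            }
        ],
    }


print(json.dumps(backup_policy("my-backup-bucket"), indent=4))
```

Both resource entries are needed: s3:ListBucket applies to the bucket ARN, while the object actions apply to the ARN ending in /*.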

GCS Prerequisites

1. Create a GCS bucket. Set backup.gcp.bucket to its name.
2. If you do not have a Cloud Account set up, refer to the docs to Create a Cloud Account. Set backup.gcp.cloudAccountName to its name.
3. Add the Storage Admin role to the GCP service account associated with the Cloud Account.

Restoring a Backup

Logical Restore

Exec into the backup cron workload and run restore.sh with the timestamp of the backup to restore:

RESTORE_TIMESTAMP=2026-05-15T02-00-00Z /usr/local/bin/restore.sh

The timestamp matches the backup folder name in your bucket (e.g. cassandra/backups/2026-05-15T02-00-00Z/). The script downloads the CSVs and replays them into Cassandra using cqlsh COPY FROM. Existing rows with matching primary keys are overwritten; rows not in the backup are left in place.
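The relationship between a backup time, its folder name, and the RESTORE_TIMESTAMP value can be sketched as follows. The timestamp format is inferred from the example folder name above, not taken from the backup image itself, and the function names are hypothetical.

```python
from datetime import datetime, timezone

TIMESTAMP_FORMAT = "%Y-%m-%dT%H-%M-%SZ"  # inferred from the example folder name


def backup_folder(prefix: str, taken_at: datetime) -> str:
    """Object-storage folder for a backup taken at the given time (in UTC)."""
    stamp = taken_at.astimezone(timezone.utc).strftime(TIMESTAMP_FORMAT)
    return f"{prefix.rstrip('/')}/{stamp}/"


def parse_restore_timestamp(value: str) -> datetime:
    """Parse a RESTORE_TIMESTAMP value; raises ValueError if malformed."""
    return datetime.strptime(value, TIMESTAMP_FORMAT).replace(tzinfo=timezone.utc)


taken = datetime(2026, 5, 15, 2, 0, 0, tzinfo=timezone.utc)
print(backup_folder("cassandra/backups", taken))
# cassandra/backups/2026-05-15T02-00-00Z/
print(parse_restore_timestamp("2026-05-15T02-00-00Z").isoformat())
# 2026-05-15T02:00:00+00:00
```

Parsing the value before running restore.sh is a cheap way to catch a mistyped timestamp, for example one that uses colons instead of hyphens in the time part.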

Physical Restore

Physical backups are per-node — each replica backed up its own SSTable slice. Exec into the backup sidecar container on each replica and run:

RESTORE_TIMESTAMP=2026-05-15T02-00-00Z /usr/local/bin/restore.sh

The script downloads snapshot files for that replica, writes them to the shared volume, and calls nodetool import to load the SSTables without a restart.

Repeat this on every replica. Because each node owns a different token range, restoring only one replica leaves the cluster with incomplete data.

Important Notes

Scaling up — Adding replicas after initial deployment does not automatically rebalance data. Run nodetool rebuild on new nodes and nodetool cleanup on existing nodes after scaling.
JVM heap — Set jvmHeapSize to approximately 50% of memory. Cassandra relies on off-heap memory for bloom filters, row cache, and OS page cache.
gc_grace_seconds — The default is 10 days. Ensure repair runs at least once within this window on all nodes, or deleted data may reappear after a node recovers from downtime.
GVC naming — This template does not create a GVC. Deploy it into an existing GVC. If you run multiple Cassandra clusters in the same org, give each a distinct clusterName.

External References

Cassandra Documentation

Official Apache Cassandra documentation

Cassandra Driver Matrix

Client drivers for connecting to Cassandra

Cassandra Template

View the source files, default values, and chart definition
