Overview

Weaviate is an AI-native vector database designed for storing, indexing, and querying vector embeddings alongside structured object data. This template deploys a multi-node Weaviate 1.38 cluster using Raft consensus for schema and cluster state management, with optional AI module support for vectorization and generative search, and optional scheduled backups to AWS S3 or GCS.

What Gets Created

  • Stateful Weaviate Workload — A multi-replica cluster with one persistent volume per node for vector and object data.
  • Volume Set — Persistent storage per replica with optional autoscaling.
  • Cron Backup Workload (optional) — Triggers Weaviate’s built-in backup API on a schedule to write full snapshots to cloud storage.
  • Identity & Policy — An identity bound to the workload with reveal access to credential secrets, and cloud storage access when backup is enabled.
  • Secrets — An opaque secret holding the API key, user, and all AI module API keys.
This template does not create a GVC. You must deploy it into an existing GVC.

Prerequisites

This template has no external prerequisites unless backup is enabled. To install, follow the instructions for your preferred method:

UI

Browse, install, and manage templates visually

CLI

Manage templates from your terminal

Terraform

Declare templates in your Terraform configurations

Pulumi

Declare templates in your Pulumi programs

Configuration

The default values.yaml for this template:
replicas: 3

image: semitechnologies/weaviate:1.38.0

clusterName: my-weaviate

# IMPORTANT: Change all credentials before deploying to production
apiKey: changeme           # Bearer token for authenticating with Weaviate
apiUser: admin@example.com # Username associated with the API key

queryDefaultsLimit: 25

# Default vectorizer applied to new collections
# Options: none, text2vec-openai, text2vec-cohere, text2vec-huggingface, text2vec-aws
defaultVectorizerModule: none

modules:
  enabled: []
  # - generative-anthropic
  # - generative-openai
  # - generative-cohere
  # - text2vec-openai
  # - text2vec-cohere
  # - text2vec-huggingface

  openai:
    apiKey: ""
  anthropic:
    apiKey: ""
  cohere:
    apiKey: ""
  huggingface:
    apiKey: ""

cpu: 2
memory: 4Gi

volumes:
  data:
    initialCapacity: 20  # GiB
    autoscaling:
      maxCapacity: 200
      minFreePercentage: 20
      scalingFactor: 1.5

multiZone:
  enabled: false

internal_access:
  type: same-gvc  # Options: same-gvc, same-org, workload-list
  workloads:
    # - //gvc/GVC_NAME/workload/WORKLOAD_NAME

backup:
  enabled: false
  provider: aws   # options: aws or gcp
  schedule: "0 2 * * *"  # daily at 2am UTC

  resources:
    cpu: 250m
    memory: 256Mi

  aws:
    bucket: my-backup-bucket
    region: us-east-1
    cloudAccountName: my-s3-cloudaccount
    policyName: my-backup-policy
    path: weaviate/backups

  gcp:
    bucket: my-backup-bucket
    cloudAccountName: my-gcs-cloudaccount
    path: weaviate/backups

Core Settings

  • replicas — Number of Weaviate nodes. A minimum of 3 is recommended for production — the Raft consensus layer requires a quorum (2 of 3) to elect a leader and process schema changes.
  • clusterName — Internal cluster identifier used in Raft coordination.
  • apiKey — Bearer token that controls all access to the Weaviate instance, including schema management and data. Change before deploying to production.
  • apiUser — Username associated with the API key, used for authorization and the admin list.
  • queryDefaultsLimit — Default result limit applied to queries that do not specify one.
  • defaultVectorizerModule — The vectorizer Weaviate calls automatically on insert and query for new collections. Set to none if you are providing your own vectors.
  • cpu / memory — Resource limits applied to each Weaviate replica.
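The quorum rule behind the replicas recommendation is plain majority arithmetic. The sketch below assumes every replica is a voting Raft member; the function names are illustrative and not part of the template.

```python
def raft_quorum(nodes: int) -> int:
    """Smallest majority of a cluster with `nodes` voting members."""
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """How many nodes can be lost while a majority is still reachable."""
    return nodes - raft_quorum(nodes)


# Print quorum size and failure tolerance for a few cluster sizes.
for n in (1, 2, 3, 5):
    print(f"replicas={n} quorum={raft_quorum(n)} tolerated_failures={tolerated_failures(n)}")
```

A 2-node cluster needs both nodes for a majority, so it tolerates no failures, while 3 nodes tolerate one.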

Storage

  • volumes.data.initialCapacity — Initial volume size in GiB per replica (minimum 10). Defaults to 20.
  • volumes.data.autoscaling.maxCapacity — Maximum volume size in GiB when autoscaling is enabled.
  • volumes.data.autoscaling.minFreePercentage — Free-space threshold, as a percentage of current capacity. The volume scales up when free space falls below this value.
  • volumes.data.autoscaling.scalingFactor — Multiplier applied to the current capacity when scaling up.
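To see how the three autoscaling settings interact, here is a rough model of a single scaling decision. It assumes the platform multiplies the current capacity by scalingFactor once free space drops below minFreePercentage and caps the result at maxCapacity; the platform's actual rounding and timing may differ, and next_capacity is a hypothetical helper.

```python
def next_capacity(current_gib: float, used_gib: float,
                  min_free_pct: float = 20.0,
                  scaling_factor: float = 1.5,
                  max_capacity_gib: float = 200.0) -> float:
    """Return the capacity after one autoscaling check (assumed behavior)."""
    free_pct = 100.0 * (current_gib - used_gib) / current_gib
    if free_pct >= min_free_pct:
        # Enough free space: capacity is unchanged.
        return current_gib
    # Grow by the scaling factor, but never beyond the configured maximum.
    return min(current_gib * scaling_factor, max_capacity_gib)


print(next_capacity(20, 10))    # half the volume is free, so nothing changes
print(next_capacity(20, 17))    # only 15% free, so the volume grows
print(next_capacity(180, 170))  # growth is capped at maxCapacity
```

Under these assumptions, a volume with the default values grows in steps of 20, 30, 45, 67.5 GiB and so on until it reaches the 200 GiB cap.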

Multi-Zone

Set multiZone.enabled: true to spread replicas across availability zones within the location. Verify your selected location supports multi-zone before enabling.

Firewall

  • internal_access.type — Controls which workloads can reach Weaviate:
    • same-gvc — All workloads in the same GVC (default).
    • same-org — All workloads in the org.
    • workload-list — Only workloads listed in internal_access.workloads.

AI Modules

Weaviate supports pluggable AI modules for automatic vectorization and generative search. Modules are disabled by default.
Adding an API key alone does not activate a module. Every module you intend to use must also be listed in modules.enabled.

Supported Modules

| Module | Type | Provider | API Key Field |
| --- | --- | --- | --- |
| text2vec-openai | Vectorizer | OpenAI | modules.openai.apiKey |
| text2vec-cohere | Vectorizer | Cohere | modules.cohere.apiKey |
| text2vec-huggingface | Vectorizer | Hugging Face | modules.huggingface.apiKey |
| generative-openai | Generative | OpenAI | modules.openai.apiKey |
| generative-anthropic | Generative | Anthropic | modules.anthropic.apiKey |
| generative-cohere | Generative | Cohere | modules.cohere.apiKey |
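Because a module needs both an entry in modules.enabled and an API key, a quick pre-deployment check can catch mismatches. The sketch below works on an already-parsed values.yaml dictionary; check_modules is a hypothetical helper, and the rule that defaultVectorizerModule must also appear in modules.enabled is inferred from the examples on this page rather than stated by the template.

```python
# Module -> provider key block, copied from the Supported Modules table.
MODULE_PROVIDER = {
    "text2vec-openai": "openai",
    "text2vec-cohere": "cohere",
    "text2vec-huggingface": "huggingface",
    "generative-openai": "openai",
    "generative-anthropic": "anthropic",
    "generative-cohere": "cohere",
}


def check_modules(values: dict) -> list:
    """Return human-readable problems found in a parsed values.yaml dict."""
    modules = values.get("modules") or {}
    enabled = modules.get("enabled") or []
    problems = []

    # Every enabled module needs a non-empty API key for its provider.
    for name in enabled:
        provider = MODULE_PROVIDER.get(name)
        if provider is None:
            problems.append(f"{name}: not in the supported modules table")
        elif not (modules.get(provider) or {}).get("apiKey"):
            problems.append(f"{name}: modules.{provider}.apiKey is empty")

    # An API key without a matching enabled module has no effect.
    for provider in sorted(set(MODULE_PROVIDER.values())):
        has_key = bool((modules.get(provider) or {}).get("apiKey"))
        if has_key and not any(MODULE_PROVIDER.get(m) == provider for m in enabled):
            problems.append(
                f"modules.{provider}.apiKey is set but no {provider} module is enabled"
            )

    # Assumed rule: the default vectorizer must itself be enabled.
    default = values.get("defaultVectorizerModule", "none")
    if default != "none" and default not in enabled:
        problems.append(f"defaultVectorizerModule {default} is not in modules.enabled")
    return problems
```

An empty list means the module settings are internally consistent; it does not verify that the keys themselves are valid.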

Example — Automatic Vectorization with OpenAI

defaultVectorizerModule: text2vec-openai

modules:
  enabled:
    - text2vec-openai
  openai:
    apiKey: "sk-..."
With this configuration, Weaviate calls OpenAI’s embedding API both when objects are inserted and when a search query has to be vectorized, so no client-side embedding step is required.

Example — Generative Search with Anthropic

modules:
  enabled:
    - generative-anthropic
  anthropic:
    apiKey: "sk-ant-..."
Enabling any module with an API key automatically adds outbound internet access to the Weaviate workload’s firewall so it can reach the provider’s API.

Connecting

With the default internal_access.type of same-gvc, Weaviate is reachable from any workload in the same GVC at the following internal hostnames:
| Access | Hostname | Port |
| --- | --- | --- |
| Load-balanced (any replica) | {release-name}-weaviate.{gvc}.cpln.local | 8080 (REST), 50051 (gRPC) |
| Specific replica | {release-name}-weaviate-{n}.{gvc}.cpln.local | 8080 (REST), 50051 (gRPC) |
Authenticate using the Bearer token set in apiKey:
curl -H "Authorization: Bearer YOUR_API_KEY" \
     http://{release-name}-weaviate.{gvc}.cpln.local:8080/v1/meta
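The same request can be assembled from application code. The sketch below uses only the Python standard library and builds the hostname from the patterns in the table above; weaviate_host and meta_request are hypothetical helper names, and the release and GVC names are placeholders.

```python
import urllib.request
from typing import Optional


def weaviate_host(release: str, gvc: str, replica: Optional[int] = None) -> str:
    """Internal hostname: load-balanced by default, or one specific replica."""
    suffix = "" if replica is None else f"-{replica}"
    return f"{release}-weaviate{suffix}.{gvc}.cpln.local"


def meta_request(release: str, gvc: str, api_key: str,
                 replica: Optional[int] = None) -> urllib.request.Request:
    """Build (but do not send) an authenticated GET for /v1/meta on the REST port."""
    url = f"http://{weaviate_host(release, gvc, replica)}:8080/v1/meta"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})


req = meta_request("my-release", "my-gvc", "YOUR_API_KEY")
print(req.full_url)
# To send it from a workload that can reach the cluster:
# with urllib.request.urlopen(req, timeout=10) as resp:
#     print(resp.read().decode())
```

Building the request separately from sending it keeps the hostname logic easy to check outside the cluster.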

Backing Up

When enabled, a cron workload triggers Weaviate’s built-in backup API on schedule to write a full snapshot to cloud storage. Each backup is stored at {path}/{backup-id}/ and includes all collections and their data.
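The template does not document the exact request the cron workload sends. As a sketch, assuming it uses Weaviate's backup-create endpoint (POST /v1/backups/{backend} with an id in the JSON body) and names backups in the weaviate-backup-YYYYMMDD-HHMMSS pattern that appears in the restore example on this page, an equivalent manual trigger could be built as follows. backup_id and backup_request are hypothetical helper names.

```python
import json
import urllib.request
from datetime import datetime, timezone

# backup.provider value -> Weaviate backup backend name.
BACKEND = {"aws": "s3", "gcp": "gcs"}


def backup_id(now: datetime) -> str:
    """Assumed naming pattern, inferred from the restore example."""
    return now.strftime("weaviate-backup-%Y%m%d-%H%M%S")


def backup_request(base_url: str, provider: str, api_key: str,
                   now: datetime) -> urllib.request.Request:
    """Build (but do not send) the POST that starts a full backup."""
    body = json.dumps({"id": backup_id(now)}).encode()
    return urllib.request.Request(
        f"{base_url}/v1/backups/{BACKEND[provider]}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


req = backup_request("http://localhost:8080", "aws", "YOUR_API_KEY",
                     datetime.now(timezone.utc))
print(req.get_method(), req.full_url, req.data.decode())
```

Passing the timestamp in as an argument keeps the ID deterministic, which is convenient when a backup needs to be retried under the same name.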

AWS S3

1. Create a bucket

Create an S3 bucket. Set backup.aws.bucket and backup.aws.region to match.

2. Set up a Cloud Account

If you do not have one, create a Cloud Account. Set backup.aws.cloudAccountName to its name.

3. Create an IAM policy

Create an IAM policy with the following JSON (replace YOUR_BUCKET_NAME) and set backup.aws.policyName to its name:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:DeleteObject",
                "s3:ListBucket",
                "s3:GetObjectVersion",
                "s3:DeleteObjectVersion"
            ],
            "Resource": [
                "arn:aws:s3:::YOUR_BUCKET_NAME",
                "arn:aws:s3:::YOUR_BUCKET_NAME/*"
            ]
        }
    ]
}
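If you manage several buckets, generating the policy document avoids hand-editing the placeholder. This is a small sketch that reproduces the JSON above for a given bucket name; backup_policy is a hypothetical helper and the bucket name is a placeholder.

```python
import json

# Actions copied from the policy document above.
ACTIONS = [
    "s3:GetObject",
    "s3:PutObject",
    "s3:DeleteObject",
    "s3:ListBucket",
    "s3:GetObjectVersion",
    "s3:DeleteObjectVersion",
]


def backup_policy(bucket: str) -> dict:
    """Return the backup IAM policy scoped to one bucket and its objects."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ACTIONS,
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
            }
        ],
    }


print(json.dumps(backup_policy("my-backup-bucket"), indent=4))
```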

GCS

1. Create a bucket

Create a GCS bucket. Set backup.gcp.bucket to its name.

2. Set up a Cloud Account

If you do not have one, create a Cloud Account. Set backup.gcp.cloudAccountName to its name.
You must add the Storage Admin role to the GCP service account created for the Cloud Account.

Restoring a Backup

Exec into any Weaviate replica and POST to the restore endpoint. Replace s3 with gcs for GCP backups, and BACKUP_ID with the backup name from your bucket (e.g. weaviate-backup-20260610-020000):
# Trigger restore
wget -qO- \
  --header='Authorization: Bearer YOUR_API_KEY' \
  --header='Content-Type: application/json' \
  --post-data='{}' \
  'http://localhost:8080/v1/backups/s3/BACKUP_ID/restore'

# Poll for completion
wget -qO- \
  --header='Authorization: Bearer YOUR_API_KEY' \
  'http://localhost:8080/v1/backups/s3/BACKUP_ID/restore'
Restore will fail if a collection from the backup already exists on the cluster. Drop existing collections first or restore to a fresh deployment.
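Polling by hand gets tedious for large restores. The loop below is a sketch of the same poll step; it takes a callable that returns the parsed JSON status so it can wrap any HTTP client. It assumes the status field takes the values Weaviate's backup API documents (STARTED, TRANSFERRING, TRANSFERRED, SUCCESS, FAILED, CANCELED), which you should confirm for your version. wait_for_restore is a hypothetical helper.

```python
import time
from typing import Callable

# Statuses after which polling should stop (assumed from Weaviate's backup API).
TERMINAL = {"SUCCESS", "FAILED", "CANCELED"}


def wait_for_restore(fetch_status: Callable[[], dict],
                     interval_s: float = 5.0,
                     timeout_s: float = 3600.0,
                     sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch_status until the restore reaches a terminal status."""
    waited = 0.0
    while True:
        status = fetch_status()
        if status.get("status") in TERMINAL:
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"restore still {status.get('status')} after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s


# Demonstration with canned responses instead of a live cluster.
responses = iter([{"status": "STARTED"}, {"status": "TRANSFERRING"}, {"status": "SUCCESS"}])
final = wait_for_restore(lambda: next(responses), interval_s=0, sleep=lambda s: None)
print(final["status"])
```

In real use, fetch_status would issue the authenticated GET shown in the poll command above and return the decoded JSON body. A FAILED result is returned rather than raised so the caller can inspect the error details.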

Important Notes

  • Minimum 3 replicas for production — Raft requires a quorum (2 of 3) to elect a leader and process schema changes. A 2-node cluster cannot reach quorum if one node fails.
  • API key security — Change apiKey before deploying to production. This key controls all access to the instance including schema management and data.
  • Modules must be declared — Adding an API key alone does not activate a module. Every module you intend to use must be listed in modules.enabled.
  • Multi-zone — Verify your selected location supports multi-zone before enabling.

External References

Weaviate Documentation

Official Weaviate documentation

Weaviate REST API

REST API reference including backup and restore endpoints

Weaviate Modules

Documentation for vectorizer and generative AI modules

Weaviate GraphQL API

GraphQL API reference for querying vector and object data