# Workloads

> Workloads are the primary deployment unit in Control Plane. This page covers container configuration, autoscaling, Capacity AI, logging, and the five workload types.

## Overview

A workload represents a backend application such as a microservice. It consists of one or more containers. Containers within a workload communicate freely on `localhost`.

Workloads run in Control Plane AWS, Azure, and GCP accounts, or in your own [BYOK location](/reference/location#byok-locations). The [GVC](/reference/gvc) determines which providers and locations are available. A workload can run in one location or across multiple providers and regions, depending on the GVC and workload placement settings.

Workloads are managed through a common interface, regardless of cloud provider. Workload logs are consolidated across replicas, locations, and providers, and can be accessed through the API, CLI, Console, or Grafana.

## Features

* [Auto Scaling](#auto-scaling)
* DNS geo-routing
* [Capacity AI](#capacity-ai) - Intelligent resource management
* Load balancing
* [Location-specific override](#location-specific-override) of scaling and resource management
* Logging
* [Probes](#probes)
* [Alerts](#alerts)

## Auto Scaling

Workload replicas are automatically scaled up and down based on the selected scaling strategy.

Selectable Scaling Strategies:

* Disabled
* CPU Utilization
* Memory Utilization
* Concurrent Requests Quantity
* Requests Per Second
* Request Latency

See [Autoscaling](/reference/workload/autoscaling) for more information.

The minimum and maximum number of replicas that can be deployed are configurable. Scale to zero is available for [Serverless](/reference/workload/types#serverless) workloads using the `rps` or `concurrency` scaling strategies, and for [Standard](/reference/workload/types#standard) and [Stateful](/reference/workload/types#stateful) workloads when using KEDA. When the scale-to-zero condition is met, the workload can scale down to 0 and scale up immediately to fulfill new requests.
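
The scale-to-zero conditions above can be sketched as a small check. This is an illustrative sketch only: `can_scale_to_zero` and its parameters are hypothetical names, not part of the Control Plane API, and workload types not mentioned in the paragraph above are assumed to be ineligible.

```python
# Hypothetical helper that encodes the scale-to-zero rules described above.
# The function and parameter names are assumptions made for illustration.
SERVERLESS_ZERO_STRATEGIES = {"rps", "concurrency"}


def can_scale_to_zero(workload_type: str, strategy: str, uses_keda: bool = False) -> bool:
    """Return True if the rules described above permit scaling to 0 replicas."""
    if workload_type == "serverless":
        # Serverless workloads need the rps or concurrency strategy.
        return strategy in SERVERLESS_ZERO_STRATEGIES
    if workload_type in {"standard", "stateful"}:
        # Standard and stateful workloads need KEDA.
        return uses_keda
    # Assumption: types not mentioned above are treated as ineligible.
    return False
```

For example, `can_scale_to_zero("serverless", "rps")` is `True`, while `can_scale_to_zero("standard", "cpu")` is `False` unless `uses_keda=True` is passed.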

<Info>
  [Capacity AI](#capacity-ai) is not available when the CPU Utilization strategy is selected, because CPU resources cannot be
  allocated dynamically while replicas are being scaled based on CPU usage. See [Capacity AI Restrictions](/reference/workload/capacity#caveats) for the full list.
</Info>

## Capacity AI

Capacity AI intelligently allocates resources (CPU and memory) to a workload's containers.

Capacity AI uses historical usage analysis to adjust these resources between configured minimum and maximum values.

This approach can substantially reduce costs; however, it may result in temporary performance issues during sudden spikes in usage.
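
The idea of keeping an allocation between a configured minimum and maximum can be sketched as follows. This is not Capacity AI's actual algorithm, which is not described here: the mean-based recommendation is a deliberately naive placeholder, and all names and units are hypothetical.

```python
from statistics import mean


def clamp_allocation(recommended: float, minimum: float, maximum: float) -> float:
    """Keep a recommended allocation within the configured bounds."""
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    return max(minimum, min(recommended, maximum))


def allocate_from_history(samples: list[float], minimum: float, maximum: float) -> float:
    """Placeholder recommendation (mean of past usage), then clamped.

    Assumption: the real analysis is more sophisticated than a plain mean.
    """
    return clamp_allocation(mean(samples), minimum, maximum)
```

With bounds of 100 and 500 (for example, millicores), a recommendation of 50 is raised to 100 and one of 800 is capped at 500. A history-based value also lags behind sudden spikes, which is where the temporary performance issues mentioned above can arise.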

Before enabling Capacity AI on your workload, review the [Capacity AI reference page](/reference/workload/capacity).

## Location-specific Override

By default, both [Capacity AI](#capacity-ai) and [Auto Scaling](#auto-scaling) settings are applied to all deployments at each location enabled in the [GVC](/concepts/gvc). However, these settings can be customized per location to enhance performance for specific audiences.

This allows for granular control over how your workload scales in specific locations. For instance, if the majority of your users are in Europe, you can give the European locations higher scaling limits than the rest of the world.

Setting location-specific options ensures that your target users are served quickly and helps reduce costs for unused resources.
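
Conceptually, a location-specific override replaces only the settings it names and inherits the rest from the defaults. The sketch below illustrates that merge behavior under assumed names: `resolve_options`, the option keys, and the location identifiers are hypothetical and do not reflect the actual workload schema.

```python
from typing import Any


def resolve_options(
    defaults: dict[str, Any],
    overrides: dict[str, dict[str, Any]],
    location: str,
) -> dict[str, Any]:
    """Return the default options with any overrides for `location` applied."""
    # Keys present in the override win; everything else falls back to defaults.
    return {**defaults, **overrides.get(location, {})}


# Hypothetical option keys and location names, for illustration only.
defaults = {"min_replicas": 1, "max_replicas": 3}
overrides = {"aws-eu-central-1": {"max_replicas": 10}}

europe = resolve_options(defaults, overrides, "aws-eu-central-1")
elsewhere = resolve_options(defaults, overrides, "aws-us-west-2")
```

Here `europe` keeps the default minimum but can scale to 10 replicas, while `elsewhere` uses the defaults unchanged.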

## Probes

Probes are a Kubernetes feature used to monitor the health of an application running in a container.

Each container can have a:

* Readiness Probe
  * Queries a configured endpoint to check whether the workload is available and ready to receive requests.
* Liveness Probe
  * Queries a configured endpoint to check whether the workload is healthy or needs to be restarted.
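
On the application side, a container typically exposes endpoints for these probes to query. The following is a minimal sketch using only the Python standard library; the `/ready` and `/healthz` paths, the port, and the `ProbeHandler` name are assumptions chosen for illustration, and the probe configuration on the workload would need to point at whatever paths your application actually serves.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def probe_status(path: str, ready: bool) -> int:
    """Map a probe path to an HTTP status code (paths are assumed, not required)."""
    if path == "/healthz":  # liveness: the process is up and able to respond
        return 200
    if path == "/ready":  # readiness: only succeed once startup has finished
        return 200 if ready else 503
    return 404


class ProbeHandler(BaseHTTPRequestHandler):
    ready = False  # set ProbeHandler.ready = True once startup work is done

    def do_GET(self) -> None:
        self.send_response(probe_status(self.path, ProbeHandler.ready))
        self.end_headers()

    def log_message(self, format: str, *args: object) -> None:
        pass  # keep probe traffic out of the application logs


def make_server(port: int = 8080) -> HTTPServer:
    """Build the probe server; call .serve_forever() on the result to run it."""
    return HTTPServer(("0.0.0.0", port), ProbeHandler)
```

Returning 503 from the readiness path until initialization completes keeps traffic away from a replica that is not yet able to serve it, while the liveness path stays simple so that a healthy process is not restarted unnecessarily.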

## Alerts

Using Grafana, you can create alerts on any of the standard metrics exposed by Control Plane, or on your [custom metrics](/reference/workload/custom-metrics). To access Grafana, navigate to one of your orgs in the Control Plane console and click the "Metrics" link.

You have full access to Grafana alerting capabilities. For more information, see the [Grafana documentation](https://grafana.com/docs/grafana/latest/alerting/).

## Inter-Workload Networking

Workloads are reachable by other workloads at `<workload-name>.<gvc-name>.cpln.local`, but inter-workload traffic is **denied by default**. Each receiving workload must opt in by setting [`firewallConfig.internal.inboundAllowType`](/reference/workload/firewall#internal) to one of:

* `same-gvc` — allow workloads in the same GVC.
* `same-org` — allow workloads anywhere in the same org.
* `workload-list` — allow specific workloads listed in `inboundAllowWorkload` (can span GVCs).

The default is `none`, which blocks all inter-workload traffic.
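
The hostname pattern and the allow rules above can be modeled as a short sketch. This only illustrates the decision logic as described on this page: `WorkloadRef`, `internal_hostname`, and `inbound_allowed` are hypothetical names, and representing the `inboundAllowWorkload` entries as `WorkloadRef` values is an assumption rather than the actual configuration format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadRef:
    """Hypothetical identifier for a workload (not a Control Plane type)."""
    org: str
    gvc: str
    name: str


def internal_hostname(workload: WorkloadRef) -> str:
    """Build the <workload-name>.<gvc-name>.cpln.local hostname."""
    return f"{workload.name}.{workload.gvc}.cpln.local"


def inbound_allowed(
    source: WorkloadRef,
    target: WorkloadRef,
    allow_type: str = "none",
    allow_list: frozenset[WorkloadRef] = frozenset(),
) -> bool:
    """Decide whether `target` accepts internal traffic from `source`."""
    if allow_type == "none":  # the default: deny all inter-workload traffic
        return False
    if allow_type == "same-gvc":
        return (source.org, source.gvc) == (target.org, target.gvc)
    if allow_type == "same-org":
        return source.org == target.org
    if allow_type == "workload-list":  # listed workloads may be in other GVCs
        return source in allow_list
    raise ValueError(f"unknown inboundAllowType: {allow_type!r}")
```

For a workload named `api` in a GVC named `prod`, `internal_hostname` yields `api.prod.cpln.local`. Because the receiving workload holds the setting, each target must opt in separately; allowing traffic in one direction does not allow the reverse.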

## Types

* **Serverless**:
  * Workloads that scale to zero when they aren't receiving requests.
* **Standard**:
  * Workloads that serve network traffic on multiple ports and can scale to zero only when using KEDA.
* **Cron**:
  * Workloads that run on a schedule, and do not serve network traffic.
* **Stateful**:
  * Workloads similar to `standard` workloads, but with stable replica identities and hostnames. They can mount a [volume set](/reference/volumeset) for persistent storage.
* **VM**:
  * Run a full [virtual machine](/reference/workload/vm) — its own guest OS and kernel — as a workload, inside the same service mesh, identity, networking, and observability as containers.

## Reference

See the [workload reference](/reference/workload) page for additional information.
