Overview

A workload represents a backend application such as a microservice. It consists of one or more containers, and the containers that make up a workload communicate freely over localhost.

Workloads run in Control Plane's AWS, Azure, and GCP accounts; the clouds and regions used are determined by the GVC definition. A workload may run in a single region of one cloud or across many regions of all three clouds, depending entirely on the GVC definition. Requests are routed to the nearest healthy location.

Workloads are managed through a common interface, regardless of cloud provider, and workload log data is consolidated for easy retrieval and analysis. A particular workload can be running on AWS, Azure, and GCP simultaneously, yet its logs, across all instances and providers, are accessed with a single API, CLI, UI, or Grafana operation.
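
As a rough sketch of what a workload definition looks like, here is a minimal two-container manifest. The workload name, image tags, and port are illustrative placeholders, and the exact field names should be verified against the workload reference:

```yaml
kind: workload
name: orders-api                          # hypothetical name
spec:
  type: serverless
  containers:
    - name: app                           # serves inbound requests
      image: example-org/orders-api:1.0.0 # placeholder image
      cpu: 100m
      memory: 256Mi
      port: 8080
    - name: telemetry-sidecar             # reaches "app" over localhost:8080
      image: example-org/sidecar:1.0.0    # placeholder image
      cpu: 50m
      memory: 128Mi
```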

Features

Auto Scaling

The number of workload replicas is automatically scaled up and down based on the workload’s scaling strategy.

Selectable Scaling Strategies:

  • Disabled
  • Concurrent Requests Quantity
  • Requests Per Second
  • Percentage of CPU Utilization
  • Request Latency

The minimum and maximum number of replicas that can be deployed are configurable. Workloads can be scaled down to zero when there is no traffic and scale up immediately to fulfill new requests.
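
As a hedged sketch, the scaling strategy and replica bounds are set in the workload's options. Assuming metric names such as `concurrency` map to the strategies listed above, the block might look like:

```yaml
spec:
  defaultOptions:
    autoscaling:
      metric: concurrency   # assumed enum: disabled, concurrency, rps, cpu, latency
      target: 100           # desired concurrent requests per replica
      minScale: 0           # permit scale to zero when idle
      maxScale: 10          # upper bound on replica count
```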

Capacity AI is not available when the Percentage of CPU Utilization strategy is selected, because CPU resources cannot be allocated dynamically while the replica count is itself being scaled based on CPU usage.

Capacity AI

A workload can leverage intelligent allocation of its containers' resources (CPU and memory) by using Capacity AI.

Capacity AI uses an analysis of historical usage to adjust these resources up to a configured maximum.

This approach can substantially reduce costs; however, it may result in temporary performance issues during sudden spikes in usage.

If Capacity AI is disabled, the configured resource amounts are fully allocated.
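
In manifest terms, Capacity AI is a per-workload toggle (sketched here with an assumed `capacityAI` field), and each container's configured CPU and memory act as the maximums it may adjust up to:

```yaml
spec:
  defaultOptions:
    capacityAI: true                      # assumed field name for the toggle
  containers:
    - name: app
      image: example-org/orders-api:1.0.0 # placeholder image
      cpu: 500m                           # ceiling Capacity AI may allocate up to
      memory: 512Mi                       # ceiling Capacity AI may allocate up to
```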

Location Override

By default, both Capacity AI and Auto Scaling settings are applied to all deployments at each location enabled in the GVC. However, these settings can be customized at each location to enhance performance for specific audiences.

This allows granular control over how your workload scales in specific locations. For instance, if the majority of your users are in Europe, you can allow the European locations to scale higher than the rest of the world.

Setting local options ensures that your target users are served quickly and helps avoid paying for unused resources elsewhere.
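
For example, assuming per-location overrides live under a `localOptions` list keyed by a location link (the link below is illustrative), a European override could be sketched as:

```yaml
spec:
  defaultOptions:                  # applies to every enabled location
    autoscaling:
      metric: concurrency
      target: 100
      minScale: 0
      maxScale: 5
  localOptions:                    # per-location overrides (assumed field name)
    - location: //location/aws-eu-central-1   # illustrative location link
      autoscaling:
        metric: concurrency
        target: 100
        minScale: 2                # keep warm replicas where most users are
        maxScale: 20               # allow higher scale for the European audience
```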

Probes

Probes are a Kubernetes feature used to check the health of an application running inside a container.

Each container can have a:

  • Readiness Probe

    • A configured endpoint is queried to determine whether the workload is available and ready to receive requests.
  • Liveness Probe

    • A configured endpoint is queried to determine whether the workload is healthy or needs to be restarted.
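
Because probes follow the Kubernetes model, a container's probe configuration can be sketched as follows; the `/ready` and `/healthz` paths are placeholder endpoints:

```yaml
containers:
  - name: app
    image: example-org/orders-api:1.0.0   # placeholder image
    readinessProbe:
      httpGet:
        path: /ready                      # placeholder readiness endpoint
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 10
    livenessProbe:
      httpGet:
        path: /healthz                    # placeholder liveness endpoint
        port: 8080
      failureThreshold: 3                 # restart after three consecutive failures
```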

Alerts

Using Grafana, you can create alerts on any of the standard metrics exposed by Control Plane, or on your custom metrics. To access Grafana, navigate to one of your orgs in the Control Plane console and click the “Metrics” link.

You have the full capability of Grafana alerting at your disposal. For more information, please consult the Grafana documentation.

Reference

Visit the workload reference page for additional information.

Types

  • Serverless:
    • Workloads that scale to zero when they aren’t receiving requests.
  • Standard:
    • Workloads that serve network traffic on multiple ports, but do not scale to zero.
  • Cron:
    • Workloads that run on a schedule, and do not serve network traffic.
  • Stateful:
    • Similar to standard workloads, stateful workloads have stable replica identities and hostnames, and can mount a volume set for persistent storage.
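
The type is chosen per workload in its manifest. For instance, a cron workload might be sketched like this, assuming the schedule lives under a `job` block:

```yaml
spec:
  type: cron              # one of: serverless, standard, cron, stateful
  job:
    schedule: "0 * * * *" # assumed field name; runs on a cron schedule (hourly here)
```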