Refer to the Workload concepts page.
Refer to the Create a Workload guide for additional details.
Refer to the CLI documentation for workloads.
Displays the permissions granted to principals for the workload.
Workload autoscaling is configured by setting a strategy, a target value, and in some cases a metric percentile. Together these values determine when the workload will scale up and down.
As the system scales up, traffic is not sent to new replicas until they pass the readiness probe, if one is configured. If no probe is configured, or if it is a basic TCP port check, requests will reach new replicas before they are ready to respond. This can cause delays or errors for end-user traffic.
You can configure autoscaling in the default options for a workload (`defaultOptions`) and in any of the location-specific options.

The scaling strategy is set using `autoscaling.metric`.
The available scaling strategies are:

- Concurrent Requests (`concurrency`): Scales based on the number of requests being processed simultaneously, calculated as `(requests * requestDuration)/(timePeriod * replicas)`. For example: `(1000 * .05)/(1 * 5) = 10`.
- Requests Per Second (`rps`)
- CPU Utilization (`cpu`)
- Request Latency (`latency`)
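In spec form, the strategy and target sit under a workload's autoscaling options. A minimal sketch (the `target` property name and its value are illustrative assumptions, not taken from this page):

```yaml
spec:
  defaultOptions:
    autoscaling:
      metric: concurrency   # one of: concurrency, rps, cpu, latency
      target: 10            # assumed property name; e.g. (1000 * .05)/(1 * 5) = 10 concurrent requests
```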
Caveats when choosing a workload type and a scaling strategy:

- Only serverless workloads can use the `latency` scaling strategy.
- Only serverless workloads can use the `concurrency` scaling strategy.
- The scale to zero functionality is only available for serverless workloads, and only when using the `rps` or `concurrency` scaling strategies.
For standard workloads, Control Plane runs two asynchronous control loops: a metric calculation loop and a scale evaluation loop.
Because of this asynchronous structure, autoscaling decisions may be made based on a metric value that is as old as the metric's collection rate (usually 20 seconds).
A workload's scale is evaluated every 15 seconds, using the value most recently calculated by the metric calculation loop. Each time an evaluation is made the chosen metric is averaged across all available replicas and compared against the scale target. When scaling up, Control Plane does not enforce a stabilization window; the number of pods will increase as soon as the scaling algorithm dictates. When scaling down, a stabilization window of 5 minutes is used; the highest number of pods recommended by the scaling algorithm within the past 5 minutes will be applied to the running workload.
- `rps`: Every 20 seconds, Control Plane calculates the average number of requests per second over the past 60 seconds.
- `latency`: Every 20 seconds, Control Plane calculates latency, using the response time of the workload once requests are received, averaged over the past 60 seconds at the specified percentile (p50, p75, p99).
- `cpu`: Every 15 seconds, Control Plane calculates the average CPU usage over the past 15 seconds.
For serverless workloads, the current capacity is evaluated every 2 seconds and compared against the scale target. Requests completed over the previous 60 seconds are averaged to avoid rapid changes. If a scaling decision results in a scale increase above 200%, scale-down decisions are suspended and the average is taken over 6 seconds for the next 60 seconds. This allows rapid scaling when a burst of traffic is detected.
Special Consideration for the `latency` Scaling Strategy {#autoscaling-special-consideration}

Because request latency is represented as a distribution, when using the `latency` scaling strategy, you must choose a metric percentile by setting the `autoscaling.metricPercentile` property to one of the following values:

- `p50`
- `p75`
- `p99`
- Minimum Scale (`autoscaling.minScale`)
- Maximum Scale (`autoscaling.maxScale`)
- Scale to Zero Delay (`autoscaling.scaleToZeroDelay`)
- Maximum Concurrency (`autoscaling.maxConcurrency`)
- Metric Percentile (`autoscaling.metricPercentile`): Applies only to the `latency` scaling strategy. The default value is `p50`.
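Putting these options together, a sketch of a `latency`-based configuration (values are illustrative):

```yaml
spec:
  defaultOptions:
    autoscaling:
      metric: latency
      metricPercentile: p75   # only used with the latency strategy; defaults to p50
      minScale: 1
      maxScale: 10
```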
Capacity AI is not available when the CPU Utilization scaling strategy is selected, because CPU resources cannot be dynamically allocated while replicas are scaled based on CPU usage.
Workloads can leverage intelligent allocation of their containers' resources (CPU and memory) by using Capacity AI.
Capacity AI uses an analysis of historical usage to adjust these resources up to a configured maximum.
This can significantly reduce cost, but may cause temporary performance issues during sudden spikes in usage.

If Capacity AI is disabled, the configured resource amounts will be fully allocated.
Capacity AI must be disabled if the autoscaling strategy is set to CPU Utilization
.
Changes made to a workload will reset its historical usage and will restart the analysis process.
When resources are not being used, Capacity AI will downscale CPU usage to a minimum of 25 millicores. The minimum will increase depending on the memory size being recommended by Capacity AI using a 1:3 ratio of CPU millicores to memory MiB.
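Capacity AI is a per-workload toggle; assuming it is exposed as the `capacityAI` flag under `defaultOptions` (an assumption to verify against the workload spec reference), a sketch:

```yaml
spec:
  defaultOptions:
    capacityAI: true   # assumed flag; must be false when the cpu scaling strategy is used
```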
The container entrypoint can be overridden by entering a custom command value.
Custom command line arguments can be sent to the container during deployment.
These arguments will be appended to the image ENTRYPOINT
.
The argument list is ordered and will be passed to the container in the same order.
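As a sketch, assuming the container spec's `command` and `args` properties (names assumed, not shown on this page):

```yaml
spec:
  containers:
    - name: example
      image: '//image/IMAGE:TAG'
      command: /app/start.sh   # assumed property; overrides the image ENTRYPOINT
      args:                    # passed to the container in this order
        - '--verbose'
        - '--port=8080'
```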
A specific replica of a workload can be connected to (similar to exec
) from either the console or the CLI. This can be used for
troubleshooting any issues with the replica.
To connect using the console, click the Connect
link from a workload. Select the location, container, replica, and command.
Click Connect
to execute the command. By default, the bash
shell will be executed.
To connect using the CLI, review the workload connect subcommand.
Workloads must have at least one container configured with the following:
If a workload has more than one container, only one can serve traffic.
The ports listed below are blocked and are not allowed to be used.
Containers which attempt to use these ports will not be able to bind.
8012, 8022, 9090, 9091, 15000, 15001, 15006, 15020, 15021, 15090, 41000
The following rules apply to the name of a container:

- It cannot use the reserved prefix `cpln_`.

To see detailed routing for the global georouted endpoint of a workload, debug values can be included in the response headers of a workload's endpoint request.
The values will only be returned when debug is active and the header `x-cpln-debug: true` is in the request.

Using the console, debug can be activated by:

1. Clicking Options.
2. Setting the Debug switch to on.
3. Clicking Save.

After the workload redeploys, the response from the workload's endpoint will contain the following headers if the header `x-cpln-debug: true` is in the request:

- `x-cpln-location`: Location of the responding replica.
- `x-cpln-replica`: Name of the responding replica.

Sample Request Headers:
```
GET https://doc-test-v39red0.cpln.app/ HTTP/1.1
Host: doc-test-v39red0.cpln.app
Connection: keep-alive
x-cpln-debug: true
```
Sample Response Headers:
```
HTTP/1.1 200 OK
content-length: 2993
content-type: text/plain
date: Fri, 10 Sep 2021 21:34:27 GMT
x-envoy-upstream-service-time: 2
x-cpln-location: aws-us-west-2
x-cpln-replica: doc-test-00083-deployment-75584b7d66-f8wtb
```
This URL is globally load-balanced and TLS terminated. This can be used for testing if there is an issue with the custom domain that is associated with the GVC.
Within each deployment, a location specific URL is available that can be used for testing how your app is responding from a specific location of a GVC.
Additional globally load-balanced endpoints will show in the workload status for each domain route that is configured to use this workload.
Custom environment variables can be made available to the image running within a container.
The value of the variable can be in plain text or a secret value.
The length of an environment variable value cannot be greater than 4096 characters.
Each workload has the following built-in environment variables:
Variable Name | Description | Format |
---|---|---|
CPLN_GLOBAL_ENDPOINT | The canonical Host header that the container will receive requests on | ${workloadName}-${gvcAlias}.cpln.app |
CPLN_GVC | The Global Virtual Cloud (GVC) the container is running under | string |
CPLN_GVC_ALIAS | The Global Virtual Cloud Alias | 13 digit alphanumeric value |
CPLN_LOCATION | The location the container is serving the request from | aws-us-west-2, azure-eastus2, gcp-us-east1, etc. |
CPLN_NAMESPACE | The namespace of the container | Generated random string (e.g., aenhg2ec6pywt) |
CPLN_PROVIDER | The cloud provider the container is serving the request from | aws, azure, gcp, etc. |
CPLN_ORG | The org the container is running under | string |
CPLN_WORKLOAD | The workload the container is running under | string |
CPLN_WORKLOAD_VERSION | The Control Plane version of the Workload, only updated when needed to apply changes. For example, changing scaling settings will not cause this to change. | numeric |
CPLN_TOKEN | A token used to authenticate to the Control Plane CLI / API | Random authorization token |
CPLN_IMAGE | The image as defined for this container in the Control Plane api | string |
Since a Workload Identity can be the target of a Policy, a running Workload can be authorized to exercise the Control Plane CLI or API without any additional authentication.
Examples:
Direct call to the Control Plane API:

```
curl ${CPLN_ENDPOINT}/org/${CPLN_ORG} -H "Authorization: ${CPLN_TOKEN}"
```

If the Control Plane CLI is installed:

```
cpln org get ${CPLN_ORG}
```
The value of CPLN_TOKEN is valid only if the request originates from the Workload it is injected in. If it is used from
another Workload or externally, a 403 Forbidden
response will be returned.
TIP: If a Workload is not assigned an Identity, it can still GET
its parent Org.
Sensitive values can be used as environment variables by using a secret.
The identity of the workload must be a member of a policy that has the `reveal` permission on the secret.
When adding an environment variable using the UI, a list of available secrets can be accessed by pressing Control-S within the value textbox.
If you do not have any secrets defined, the prefix cpln://secret/
will be inserted.
The following variable names are not allowed to be used as a custom environment variable:
K_SERVICE
K_CONFIGURATION
K_REVISION
The PORT
environment variable is provided at runtime and available to a container.
It can be assigned as a custom environment variable in all cases except when the container is exposed and the value doesn't match that of the exposed port.
For example, if the exposed port is `3000`:

- The `PORT` environment variable will be set to `3000`.
- A custom `PORT` variable may be assigned only if its value is also `3000`.

A .env file can be uploaded using the console to import multiple environment variables. Secret values are supported.
Sample .env file:
```
URL=http://test.example.com
USERNAME=user001
PASSWORD=cpln://secret/username_secret.password
DATA=cpln://secret/opaque_secret.payload
```
Environment variables may be set at the GVC level. These variables are available to any container running in the GVC
on an opt-in basis. To opt in, set the container's inheritEnv
property to true
. You can override the value of an
inherited variable by adding a local variable with the same key.
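A sketch of opting a container in and overriding one inherited key (the variable name is illustrative):

```yaml
spec:
  containers:
    - name: example
      image: '//image/IMAGE:TAG'
      inheritEnv: true      # opt in to GVC-level environment variables
      env:
        - name: LOG_LEVEL   # overrides a GVC-level variable with the same key
          value: debug
```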
Inbound network access is only available for workloads of types serverless
and standard
. For other workload types, only outbound firewall settings are relevant.
The external firewall is used to control Internet traffic to/from a workload.
Inbound Requests:
The CIDR address 0.0.0.0/0
allows full inbound access from the public Internet.
Outbound Requests:
The CIDR address 0.0.0.0/0
allows full outbound access to the public Internet.
The internal firewall is used to control access between other workloads within an org.
Available Options:

- `None`: No access is allowed between workloads.
- `Same GVC`: Workloads running in the same GVC are accessible.
- `Same Org`: Workloads running in the same org are accessible.
- `Specific Workloads`: Specific workloads are allowed to access this workload. This requires the `view` permission, set within a policy, on the workload being specified.
- `Allow to Access Itself`: Enables replicas of this workload to access themselves.

Refer to the identities page for additional details.
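These external and internal settings can be sketched in the workload spec; the `firewallConfig` property names below are assumptions to verify against the workload reference:

```yaml
spec:
  firewallConfig:
    external:
      inboundAllowCIDR:
        - 0.0.0.0/0        # full inbound access from the public Internet
      outboundAllowCIDR:
        - 0.0.0.0/0        # full outbound access to the public Internet
    internal:
      inboundAllowType: same-gvc   # assumed values: none | same-gvc | same-org | workload-list
```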
Each workload must be configured with at least one container, associated with an image.
Images can be pulled from:

- A public registry
- The org's private registry, using:
  - `/org/ORG_NAME/image/IMAGE_NAME:TAG`, or
  - the shorthand `//image/IMAGE_NAME:TAG`
Each Workload container can be configured to execute a subset of the Kubernetes lifecycle hooks. The supported hooks are:
This hook is executed immediately after a container is created. However, there is no guarantee that the hook will execute before the container ENTRYPOINT. In the event of a failure, the relevant error message will be recorded in the corresponding deployment.
This hook is executed immediately before a container is stopped. In the event of a failure, the relevant error message will be recorded in the workload's event log.
These hooks can be configured using the console or cpln apply.
Using the console:

1. Click the Lifecycle link from the top menu bar.
2. Configure the desired hooks and click Save.

Using cpln apply:

Only the `exec` type is supported.

Example: Add the `lifecycle` section to an existing workload container.
```yaml
spec:
  containers:
    - name: advanced-options-example
      args: []
      cpu: 50m
      env: []
      image: '//image/IMAGE:TAG'
      memory: 128Mi
      port: 8080
      lifecycle:
        postStart:
          exec:
            command:
              - sh
              - '-c'
              - sleep 10
        preStop:
          exec:
            command:
              - sh
              - '-c'
              - sleep 10
```
Workload logs are consolidated from all the deployed locations and can be viewed using the UI or CLI.
Using the UI, the logs page will be prefilled with the LogQL query for the workload and GVC name.
Example LogQL Query:

```
{gvc="test-gvc", workload="test-workload"}
```
Logs can be further filtered by:
Date
Location
Container
Grafana can be used to view the logs by clicking the Explore on Grafana
link within the console.
Refer to the logs page for additional details.
A Secret can be mapped as a read-only file by using a Volume.
During the configuration of a Volume using the console, the Secret reference (e.g., cpln://secret/SECRET_NAME
) can be
entered manually or Control-S
can be pressed to view and select the available Secrets.
The Path must be a unique absolute path and, optionally, a file name (e.g., /secret/my-secret.txt) depending on the secret type. This path will be added to the container's file system and will be accessible by the running application.
NOTE: A maximum of 15 volumes can be added.
The secret type will dictate how the secret is mounted to the file system.

Opaque Secrets:

- If the path ends with the name of the secret property `payload` (e.g., /path/payload), the file will contain the secret's payload; appending `.payload` to the path is not required.
- To have the payload Base64 decoded at runtime, check the `Base64 decode at Runtime` checkbox when configuring the secret.
- Otherwise, the secret is mounted as a file named `___cpln___.secret`.

All other Secret Types:

- The secret is mounted as a file named `___cpln___.secret`. The contents of this file will be the JSON formatted output of the secret.

A Workload that is configured with a Volume that references a Secret must be configured with an Identity bound to a policy having the reveal permission.
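A sketch of mounting a Secret as a read-only file (secret name and path are placeholders):

```yaml
spec:
  containers:
    - name: example
      image: '//image/IMAGE:TAG'
      volumes:
        - uri: 'cpln://secret/my-secret'   # Secret reference
          path: /secret/my-secret.txt      # unique absolute path in the container's file system
```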
Control Plane can collect custom metrics from your workload by having your application emit a Prometheus formatted list of metrics at a path and port of your choosing. The port can be different than the one serving traffic. Each container in a workload can be configured with metrics.
The convention is to use the path /metrics
, but any path can be used.
Sample output from the metrics endpoint:
```
MY_COUNTER 788
MY_COUNTER_2 123
NUM_USERS 2
NUM_ORDERS 91
```
The platform will scrape all the replicas in the workload every 30 seconds with a 5 second timeout. Metric names with the prefix `cpln_` will be ignored by the scraping process.
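A sketch of a container's metrics configuration, assuming a `metrics` block with `path` and `port` properties (names assumed):

```yaml
spec:
  containers:
    - name: example
      image: '//image/IMAGE:TAG'
      port: 8080
      metrics:
        path: /metrics   # conventional, but any path can be used
        port: 9100       # may differ from the port serving traffic
```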
The collected metrics can be viewed by clicking the Metrics
link on the workload page within the console. Clear any existing query and
enter the name of the metric. Click Run Query
to execute.
The time-series displayed will include these labels:
org
gvc
location
provider
region
cluster_id
replica
The permissions below are used to define policies together with one or more of the four principal types:
Permission | Description | Implies |
---|---|---|
connect | Connect to replica (open an interactive shell) | |
create | Create new workloads | |
delete | Delete existing workloads | |
edit | Modify existing workloads | view |
manage | Full access | connect, create, delete, edit, manage, view |
view | Read-only access | |
Probes are used to check the health of an application running inside a container.
The readiness probe is used to determine if the workload replica is ready to receive traffic. For example, if the application performs some actions during start-up and needs them to complete before serving requests, the readiness probe should fail until the actions have completed.
This check is used in two ways:

- To determine if replicas from a new version of the workload are ready. When the check passes, the rollout continues; when it fails, the rollout is paused.
- To determine if the workload replica should receive new requests from end users. When the readiness probe is failing, the replica is removed from the pool of available replicas for this workload and all endpoints.

It is recommended to use an HTTP or command probe that performs an adequate check that the workload is healthy and able to respond to requests.
The liveness probe defines when the container should be restarted.
For example, if the application code hits a deadlock condition, the liveness probe can catch that the container is not healthy, and Control Plane will restart the failing workload replica. This will ensure that the application is available as much as possible until the defect causing the deadlock is fixed.
Health Check Types: HTTP GET, TCP socket, gRPC, or command execution.

Configurable Limits:

- Failure Threshold: The number of consecutive failed checks before the replica is marked `Unready` (must be between 1 and 20 inclusive, default is 3).

Refer to the Kubernetes probe documentation for additional details.
Settings to control the rollout process between versions.
```yaml
spec:
  rolloutOptions:
    minReadySeconds: 0
    maxUnavailableReplicas: 1
    maxSurgeReplicas: 100%
```
The minimum number of seconds that a workload replica must be running before the rollout progresses.
The maximum number or percentage of replicas that can be unavailable during a rollout or during regular rescheduling of workloads.
The maximum number or percentage of new replicas that can be added during a rollout for each batch.
Example: If there are 4 running replicas and maxSurgeReplicas is set to 50%, then during each rollout batch 2 replicas at the new version will be added. Once they are healthy, as determined by the readiness probe, the rollout continues: 2 old replicas are removed, 2 new replicas are added, and the final 2 old replicas are removed.

In cases where a short rollout cutover is needed, a maxSurgeReplicas setting of `100%` is recommended.
Settings to control the security of the container at runtime.
```yaml
spec:
  securityOptions:
    filesystemGroupId: 777
```
Any mounted Volumes for this container will be owned by the group id provided. When not specified, `0` (root) is used.
The CPU and Memory of a container is configurable. Select appropriate values for each container.
If capacity AI is enabled, these values will be the maximum that the container could be provisioned with.
If capacity AI is disabled, these values will be reserved when the container is provisioned.
Resource Type | Default Unit | Default Value | Min | Max |
---|---|---|---|---|
CPU | Millicores | 50 | 0 | 8000 |
Memory | MiB | 128 | 1 | 8192 |
Memory Units:
You can express memory as a plain integer or as a fixed-point number using one of these suffixes: `E`, `P`, `T`, `G`, `M`, `k`. You can also use the power-of-two equivalents: `Ei`, `Pi`, `Ti`, `Gi`, `Mi`, `Ki`.

For example, the following represent roughly the same value:

128974848, 129e6, 129M, 123Mi
Refer to the Kubernetes Meaning of Memory reference page for additional details.
The ratio of CPU (millicores) to memory (MiB) must be at least 1:8; equivalently, memory may be at most 8 times the CPU value. For example: if memory is set to 512Mi, CPU must be at least 64 millicores.
Each workload replica receives 1GB of local ephemeral solid state drive (SSD) storage.
If the replica uses more than 1GB, it will be replaced with a new, fresh replica.
Each workload can be suspended, which immediately stops the workload from serving traffic. This is the same as setting the min/max scale to 0. When the workload is unsuspended, it will resume serving traffic.
To temporarily deactivate a workload choose Stop
from the Actions menu.
```yaml
spec:
  defaultOptions:
    suspend: true
```
The workload will stop running and will not serve any traffic.
To reactivate the workload, choose Start
from the actions menu.
```yaml
spec:
  defaultOptions:
    suspend: false
```
The maximum request duration in seconds before Control Plane will timeout. This timeout amount can be reached when Control Plane is waiting for the workload to respond or when waiting for a new workload to become available when using Autoscaling.
The minimum value is 1 second and the maximum value is 600 seconds.
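Assuming the timeout is set via `timeoutSeconds` under `defaultOptions` (an assumption to verify against the workload spec reference), a sketch:

```yaml
spec:
  defaultOptions:
    timeoutSeconds: 300   # must be between 1 and 600 seconds
```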
Capability | Serverless | Standard | Cron |
---|---|---|---|
Allow multiple containers | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
Scale to Zero | :heavy_check_mark: | | |
Must expose one HTTP port | :heavy_check_mark: | | |
Allow no exposed ports | | :heavy_check_mark: | :heavy_check_mark: |
Allow multiple exposed ports | | :heavy_check_mark: | |
Unable to expose any ports | | | :heavy_check_mark: |
Custom Domain requests have the HOST header of the custom domain | :heavy_check_mark: | | |
Fast switching update between versions | :heavy_check_mark: | | |
Rolling update between versions | | :heavy_check_mark: | |
Autoscale by CPU | :heavy_check_mark: | :heavy_check_mark: | |
Autoscale by requests per second | :heavy_check_mark: | :heavy_check_mark: | |
Autoscale by concurrent requests | :heavy_check_mark: | | |
Autoscale by request latency | :heavy_check_mark: | | |
Runs on a schedule and is expected to complete | | | :heavy_check_mark: |
Serverless workloads should be used for web applications that serve traffic on a single port, but may not need to run 100% of the time.
Serverless workloads may:

- Scale to zero when idle.
- Autoscale by CPU, requests per second, concurrent requests, or request latency.

Serverless workloads may not:

- Expose more than one port.

Serverless workloads must:

- Expose exactly one HTTP port.
Standard workloads have greater flexibility in network exposure, but may not scale to zero.
Standard workloads may:

- Expose multiple ports, or no ports at all.
- Autoscale by CPU or requests per second.

Standard workloads may not:

- Scale to zero.
- Autoscale by concurrent requests or request latency.
Cron workloads should be used when you need to perform a background task on a regular schedule.
Cron workloads may not:

- Expose any ports.

Cron workloads must:

- Run on a schedule and be expected to complete.
Cron workloads are always deployed to all locations within their GVC. Unlike workloads of other types, there is no way to provide location-specific configuration overrides.
A cron workload is configured with the following properties:

- `job.schedule`: A cron expression that determines when the workload is executed.
- `job.concurrencyPolicy`: Either `Forbid` or `Replace`. This determines what Control Plane will do when a prior execution of your workload is still running when the next scheduled execution time arrives.
  - `Forbid`: Subsequent executions will be forgone until the running execution completes.
  - `Replace`: The running execution will be stopped so that a new execution can begin.
- `job.historyLimit`: The number of completed job executions to retain.
- `job.suspend`: Setting this to `true` will disable future executions.
- `job.restartPolicy`: Either `Never` or `OnFailure`. This determines whether your workload will be restarted when it fails on execution.
- `job.activeDeadlineSeconds`: The maximum number of seconds a job execution may run before it is stopped.
A cron workload retains up to job.historyLimit
job executions in its history. Each job execution will be in one of the following statuses:
The Removed status indicates that a job execution was deleted before it could finish execution. There are several reasons this can happen, but the most common are:

- `job.concurrencyPolicy` is `Replace` and, while the job was still executing, `job.schedule` dictated that the job should begin again.
- The `job.activeDeadlineSeconds` limit was exceeded.

Cloud object and file storage, ephemeral scratch storage, and Secrets can be mounted to directories of containers at runtime by adding one or more volumes.
A volume consists of a uri
and a mount path
. The uri
is prefixed with the provider scheme followed by the bucket/storage name
(e.g., s3://my-s3-bucket). The mount path
must be a unique absolute path (e.g., /s3-files). This path will be added to the
container's file system and accessible by the running application.
During the set up of a volume using the console, the uri
name can be entered manually or an
existing Cloud Account can assist looking up the name.
The identity of the workload is used to authenticate to the provider's cloud storage API, or used for authorization to access the Control Plane secret. A Cloud Account for each cloud storage provider, with the necessary access/roles, must exist and be associated with the workload identity.
Volumes can be shared between containers of the same workload. For example, if two containers in a workload are each configured with the volume `uri: 'scratch://volume1', path: '/my/shared/data'`, then changes to files in /my/shared/data will be visible to both containers.
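The shared scratch volume described above can be sketched as (container names and images are placeholders):

```yaml
spec:
  containers:
    - name: writer
      image: '//image/WRITER:TAG'
      volumes:
        - uri: 'scratch://volume1'
          path: /my/shared/data
    - name: reader
      image: '//image/READER:TAG'
      volumes:
        - uri: 'scratch://volume1'   # same uri, so both containers see the same files
          path: /my/shared/data
```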
NOTE: A maximum of 15 volumes can be added.
Volume Provider | URI Scheme | Mode | Example |
---|---|---|---|
CPLN Secret | cpln://secret | read-only | cpln://secret/secretname |
AWS S3 | s3:// | read-only | s3://my-s3-bucket |
Google Cloud Storage | gs:// | read-only | gs://my-google-bucket |
Azure Blob Storage | azureblob:// | read-only | azureblob://my-azure-account/container |
Azure Files | azurefs:// | read-write | azurefs://my-azure-account/my-files |
Scratch (emptyDir) | scratch:// | read-write, ephemeral | scratch://volume1 |
To allow a workload identity the ability to authenticate to an object store, a cloud access rule must be created for each provider. A Cloud Account for each provider must exist in order to create the cloud access rule.
The following list contains the minimum roles/scopes that must be added to a cloud access rule:
S3 (using an AWS Cloud Account):

- Create a new AWS role with existing policies and choose `AmazonS3ReadOnlyAccess`.

Google Cloud Storage (using a Google Cloud Account):

- Create a new GCP service account.
- Select the bucket name.
- Add the roles `Storage Legacy Bucket Reader` and `Storage Legacy Object Reader`, or alternatively the broader `Storage Admin` role.

Azure Blob Storage and Files (using an Azure Cloud Account):

- Select the storage account.
- Add the role `Storage Blob Data Reader`.
.To allow a Workload access to the object stores, the outbound requests of its external firewall
must either be set to All Outbound Requests Allowed
or the hostnames listed below for the corresponding object store must
be added to the Outbound Hostname Allow List
.
AWS:

```
*.amazonaws.com
```

Azure Blob:

```
*.blob.core.windows.net
*.azure.com
```

Azure File:

```
*.file.core.windows.net
*.azure.com
```

GCP:

```
*.googleapis.com
```
The following paths are reserved and cannot be used as a volume mount path:

- /dev
- /dev/log
- /tmp
- /var
- /var/log
The CPU, memory, and egress used for mounted object stores are billed to the workload. To review the costs of mounting an object store, query the container named `cpln-mounter` for the workload within the metrics page.
The container working directory can be overridden by entering a custom directory. The value must be an absolute path.
The health of a workload is reported as one of the following statuses:

- Loading
- Healthy
- Unhealthy
- Deleting
- Unknown