When Termination Occurs
Workload termination typically happens in these scenarios:
- Scaling down: When reducing the number of workload replicas
- Version updates: When replacing an old version with a new version after a successful deployment
- Capacity AI: Regular rollouts similar to version updates when Capacity AI is enabled
- Maintenance: Rare cases where maintenance activities require workload replicas to be rescheduled
Termination Process Overview
The following steps are performed for each workload replica. The shutdown process occurs simultaneously across all containers in a workload, with default configurations ensuring that in-flight requests complete and load balancers update before containers receive shutdown signals.
Termination Grace Period: By default, spec.rolloutOptions.terminationGracePeriodSeconds is set to 90 seconds. This controls the total time available for the workload replica to shut down gracefully before all containers receive a SIGKILL signal.
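The timings described in the steps below all derive from this one setting. The following is a minimal sketch of that arithmetic, not Control Plane code: it assumes only the rules stated on this page (the sidecar hold is the grace period minus 10 seconds, never below zero, and the default preStop sleep is half the grace period). The names termination_timeline and TerminationTimeline are hypothetical, and rounding an odd grace period down is an assumption, since the page does not say how odd values are split.

```python
from dataclasses import dataclass

# The page states the sidecar hold is 10 seconds less than the grace period.
SIDECAR_HOLD_OFFSET_SECONDS = 10


@dataclass(frozen=True)
class TerminationTimeline:
    grace_period: int     # terminationGracePeriodSeconds
    sidecar_hold: int     # seconds the sidecar keeps running normally
    prestop_sleep: int    # duration of the default preStop sleep
    shutdown_window: int  # seconds left between SIGINT and SIGKILL


def termination_timeline(grace_period: int = 90) -> TerminationTimeline:
    """Derive the default phase durations from the grace period."""
    if grace_period < 0:
        raise ValueError("grace_period must be non-negative")
    sidecar_hold = max(grace_period - SIDECAR_HOLD_OFFSET_SECONDS, 0)
    # Assumption: odd values round down; the page only says "half".
    prestop_sleep = grace_period // 2
    shutdown_window = grace_period - prestop_sleep
    return TerminationTimeline(
        grace_period, sidecar_hold, prestop_sleep, shutdown_window
    )
```

With the default of 90 seconds this yields an 80-second sidecar hold, a 45-second preStop sleep, and a 45-second window for the container to exit. Any grace period of 10 seconds or less yields a sidecar hold of 0.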
1. Load Balancer Update
At the start of workload termination, load balancers receive a command to remove the workload replica from the pool. This update process typically takes a few seconds but can take up to 10 seconds. Once updated, new incoming requests are routed to the remaining healthy replicas.
2. Workload Sidecar and Container Termination
The workload sidecar (managed by Control Plane) and all other workload containers receive commands to begin their termination process. This occurs nearly simultaneously with the load balancer update.
Sidecar Termination Process
The Control Plane-managed sidecar shutdown process consists of three sequential phases: Hold → Monitoring → Drain.
Phase 1: Sidecar Hold
The sidecar continues running normally for 80 seconds by default. This is calculated as 10 seconds less than the termination grace period, which defaults to 90 seconds. The termination grace period can be adjusted in the workload rollout options. For example, if the termination grace period is reduced to 10 seconds or less, the sidecar hold period is 0 seconds.
Phase 2: Sidecar Monitoring
The Control Plane-managed sidecar continues monitoring inbound and outbound network activity from workload containers. The sidecar remains running and waits to drain until no active requests exist. This phase can continue until the termination grace period expires or until no more connections are found.
Phase 3: Sidecar Drain
In the drain phase, the sidecar stops accepting new connections and once again verifies that all existing connections are complete. Once all connections are completed, the sidecar shuts down.
Workload Container Termination Process
Default PreStop Hook
If no custom preStop hook is defined for workload containers, a default preStop hook is applied that executes sh -c "sleep 45" to pause the shutdown process. The sleep duration is set to half of the terminationGracePeriodSeconds (with the default of 90 seconds, a 45-second sleep). After the sleep completes, the container receives a SIGINT signal and has the remaining half of the termination grace period to shut down gracefully before receiving a SIGKILL signal.
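Because the container is sent SIGINT once the preStop sleep ends, the application needs to treat that signal as a request to finish up rather than as an abrupt interrupt. The sketch below shows one common way to do this in Python. It is an illustration under assumptions, not part of Control Plane: the function names are hypothetical, the work loop stands in for whatever the container actually does, and SIGTERM is handled as a precaution even though this page only mentions SIGINT.

```python
import signal
import threading

# Set by the signal handler; checked by the work loop.
shutdown_requested = threading.Event()


def request_shutdown(signum, frame):
    """Signal handler: record the request and return immediately."""
    shutdown_requested.set()


def install_handlers():
    """Replace Python's default SIGINT behaviour (KeyboardInterrupt)."""
    # SIGTERM is an extra precaution, not something this page describes.
    for sig in (signal.SIGINT, signal.SIGTERM):
        signal.signal(sig, request_shutdown)


def serve(handle_one_unit, poll_seconds=0.5):
    """Process work until shutdown is requested, then return.

    Returning (instead of exiting inside the handler) lets the caller
    close listeners and flush buffers before the process ends.
    """
    while not shutdown_requested.is_set():
        handle_one_unit()
        # Wakes early if the signal arrives while idle.
        shutdown_requested.wait(poll_seconds)
```

A typical entry point would call install_handlers(), then serve(...), then run its cleanup code. The cleanup has to fit in the time that remains before SIGKILL, which is half the grace period under the default hook.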
Missing sleep executable: If the sleep executable is missing from any of your workload containers, ALL containers in the replica being shut down receive a SIGKILL signal immediately. Requests that arrive before the load balancer is fully updated may still be routed to those containers and fail.
Custom PreStop Hook (Optional)
Consider implementing a custom preStop hook only if your workload requires specific termination logic, such as:
- Ensuring connections are gracefully terminated
- Managing shutdown delays when sh and sleep binaries are unavailable in the container
- Implementing custom request handling during shutdown
A custom preStop hook and the container shutdown that follows it must complete within the terminationGracePeriodSeconds (default: 90 seconds) allocated from the start of the shutdown process.
PreStop hook errors: If a custom preStop hook for any container throws an error, then ALL containers will immediately receive a SIGKILL signal.
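Given that consequence, a custom hook is safest when it cannot fail. The following is a minimal sketch of a defensive hook body in Python, under two assumptions that this page does not confirm: that "throws an error" corresponds to the hook command exiting with a non-zero status, and that the container image includes a Python interpreter. The names run_prestop and drain are hypothetical. drain stands for any application-specific check that returns True once in-flight work has finished.

```python
import sys
import time


def run_prestop(drain, budget_seconds, clock=time.monotonic, sleep=time.sleep):
    """Poll drain() until it returns True or the time budget runs out.

    Exceptions are logged and swallowed, and the return value is always 0,
    so the hook never reports an error to the platform.
    """
    deadline = clock() + budget_seconds
    try:
        while clock() < deadline:
            if drain():
                break
            # Sleep at most one second, and never past the deadline.
            sleep(min(1.0, max(deadline - clock(), 0.0)))
    except Exception as exc:
        print(f"preStop: ignoring error: {exc!r}", file=sys.stderr)
    return 0
```

A hook script would end with sys.exit(run_prestop(my_drain_check, 45)), where my_drain_check is supplied by the application and the budget is chosen to leave the container enough of the grace period to exit afterward.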
Serverless workloads: For serverless workloads, the termination grace period is currently limited by the timeoutSeconds in workload options. This restriction will be removed in a future update.