Overview
The KubeVirt add-on lets an mk8s cluster run VM workloads (spec.type: vm). It installs the KubeVirt and CDI (Containerized Data Importer) operators and configures them for Control Plane: VM components are scoped to a dedicated VM node pool, the feature gates the platform needs are enabled, and boot-disk imports are wired through CDI.
Once the add-on is enabled and at least one VM-capable node is present, the location reports vm in its capabilities and VM workloads can be scheduled there.
Supported providers
Any provider whose nodes can expose hardware virtualization (/dev/kvm) — that means bare-metal nodes or instances with nested virtualization enabled. On AWS, nested virtualization is available on 8th-generation Intel instance types (c8i, m8i, r8i and variants); bare-metal (.metal) instances also work. On generic / BYOK clusters, use bare-metal hosts or VMs that pass through KVM.
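A quick way to check whether a host qualifies is to test for the KVM device. The sketch below assumes shell access to the node (or a privileged debug pod that sees the host's /dev); has_kvm is an illustrative helper name, not a platform command.

```shell
# Report whether a host exposes hardware virtualization through a KVM device.
# The optional argument is the device path and defaults to /dev/kvm.
has_kvm() {
  # -c is true only when the path exists and is a character device.
  [ -c "${1:-/dev/kvm}" ]
}

if has_kvm; then
  echo "KVM device present: this host can run VMs"
else
  echo "no KVM device: use bare metal or enable nested virtualization"
fi
```

A missing device on a cloud instance usually means nested virtualization is not enabled for that instance type.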
Requirements
Setting up a cluster for VMs has four parts:
1. A node pool with hardware virtualization. VM nodes must be able to run KVM. On AWS, enable nested virtualization on a supported instance type via cpuOptions.nestedVirtualization; elsewhere use bare-metal / KVM-capable hosts.
2. Label (and ideally taint) the VM node pool. KubeVirt’s components and the VMs themselves are scoped to nodes labelled cpln.io/nodeType: vm. Add that label to the node pool. A matching taint cpln.io/nodeType=vm:NoSchedule is recommended to keep ordinary pods off the VM nodes; KubeVirt’s VM placement tolerates it.
3. Enable the kubevirt add-on. Add kubevirt to addOns. This installs the KubeVirt + CDI operators and their custom resources.
4. Provide storage for boot disks. VM boot disks are imported by CDI into a PVC (backed by a volume set), so the cluster needs a working CSI / default StorageClass. If your default StorageClass is block-mode, set scratchSpaceStorageClass to a filesystem-mode class so CDI has scratch space for imports.
The dedicated cpln.io/nodeType: vm node pool is what makes a node VM-capable. Without a labelled node, the KubeVirt components have nowhere to run and no VM workload can schedule.
How to enable
Cluster manifest (AWS example)
The cluster manifest needs a dedicated VM node pool with nested virtualization, the required label and taint, plus the add-on.
On other providers, use nodes that expose /dev/kvm; nested-virtualization flags are provider-specific.
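A minimal sketch of such a manifest, wrapped in a shell helper that writes it to a file. Only the label, the taint, cpuOptions.nestedVirtualization, and the kubevirt entry under addOns come from this page. The surrounding layout (spec.provider.aws.nodePools), the value shape of nestedVirtualization, and every name, region, instance size, and pool size are assumptions or placeholders to check against the mk8s reference. The manifest is abbreviated: required fields such as the Kubernetes version and networking are omitted.

```shell
# Write an abbreviated, hypothetical mk8s manifest with a VM node pool.
# Argument: path of the file to write.
write_vm_cluster_manifest() {
  cat > "$1" <<'EOF'
kind: mk8s
name: vm-cluster                      # placeholder
spec:
  provider:
    aws:
      region: us-east-1               # placeholder
      nodePools:
        - name: vm-pool               # placeholder
          instanceTypes:
            - m8i.xlarge              # an 8th-generation Intel type
          cpuOptions:
            nestedVirtualization: true   # assumed placement and value shape
          labels:
            cpln.io/nodeType: vm
          taints:
            - key: cpln.io/nodeType
              value: vm
              effect: NoSchedule
          minSize: 1                  # placeholder
          maxSize: 1                  # placeholder
  addOns:
    kubevirt: {}
EOF
}

# Usage:
#   write_vm_cluster_manifest cluster.yaml
#   cpln apply --file cluster.yaml
```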
Console
When creating or editing the cluster, open Add-ons, toggle on KubeVirt, and make sure you have a node pool labelled cpln.io/nodeType: vm on VM-capable instances.
Configuration
| Option | Description |
|---|---|
| scratchSpaceStorageClass | Filesystem-mode StorageClass CDI uses for import scratch space. Required when the cluster default StorageClass is block-mode; otherwise optional. |

The add-on manages the rest of the KubeVirt and CDI configuration, including the DataVolumes, BlockVolume, and Sidecar feature gates and the VM node placement, so you don’t set these yourself.
DNS
VM guests reach in-cluster services through a platform DNS forwarder. The nodeLocalDns add-on (a per-node CoreDNS cache) is recommended alongside KubeVirt for reliable VM DNS. It is not strictly required, because VM cloud-init falls back to TCP DNS (options use-vc), but enabling it avoids edge cases.
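For a Linux guest where the TCP fallback is not already in place, the option can be added by hand. The sketch below assumes a glibc-based guest whose /etc/resolv.conf is a plain file (not regenerated by a resolver service); force_tcp_dns is an illustrative helper name.

```shell
# Append "options use-vc" to a resolv.conf-style file unless it is already set.
# use-vc tells the glibc resolver to send DNS queries over TCP instead of UDP.
# The optional argument is the file path and defaults to /etc/resolv.conf.
force_tcp_dns() {
  conf="${1:-/etc/resolv.conf}"
  # Skip the append when an options line already lists use-vc.
  if ! grep -Eq '^options[[:space:]]+(.*[[:space:]])?use-vc([[:space:]]|$)' "$conf" 2>/dev/null; then
    echo 'options use-vc' >> "$conf"
  fi
}

# Usage (as root inside the guest, for example from a cloud-init runcmd entry):
#   force_tcp_dns
```

The check makes the helper safe to run on every boot, since a second call leaves the file unchanged.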
Verify
After the add-on reconciles and a VM node is ready:
- The cluster’s location reports vm under its workload-type capabilities.
- A VM workload targeting this location schedules, imports its boot disk (watch Importing boot disk: NN% in the deployment status), and boots.
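If you have kubectl access to the cluster, the node side of these checks can be sketched as follows. count_ready_vm_nodes is an illustrative helper; it relies only on the cpln.io/nodeType=vm label described above and on the default column layout of kubectl get nodes.

```shell
# Count Ready nodes that carry the VM node-pool label.
# Cordoned nodes report "Ready,SchedulingDisabled" and are not counted.
count_ready_vm_nodes() {
  kubectl get nodes -l cpln.io/nodeType=vm --no-headers 2>/dev/null |
    awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

# Usage:
#   count_ready_vm_nodes        # prints 0 when no labelled node is Ready
#   kubectl get kubevirt -A     # lists the KubeVirt custom resource and its phase
```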
Console access
For day-to-day access, use the platform paths documented on the VM workload page. They work the same on your own cluster and require no kubectl:
- SSH (Linux): cpln workload connect / cpln workload exec.
- RDP (Windows) or any TCP service: cpln port-forward to the guest port, then connect to localhost.
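The port-forward path can be sketched as below. The positional form (workload name followed by LOCAL:REMOTE) and the --gvc flag reflect the author's understanding of the cpln CLI and should be confirmed with cpln port-forward --help; the workload and GVC names are placeholders.

```shell
# Forward a local port to a guest port on a VM workload.
# Arguments: workload name, GVC name, local port, guest port.
forward_guest_port() {
  cpln port-forward "$1" "$3:$4" --gvc "$2"
}

# RDP example: expose guest port 3389 on localhost:13389, then point the
# RDP client at localhost:13389.
#   forward_guest_port my-windows-vm my-gvc 13389 3389
```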
For a graphical (VNC) console, use virtctl against the cluster:
Find the VM instance and its namespace
VM instances are labelled with their workload name and run in the GVC namespace:
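The lookup and the console commands can be sketched as follows. The cpln/workload label selector is the one used elsewhere on this page; find_vmi is an illustrative helper, and virtctl vnc needs a VNC viewer installed on the local machine.

```shell
# Print "<namespace> <name>" for each VM instance of a workload.
# Argument: workload name.
find_vmi() {
  kubectl get vmi -A -l "cpln/workload=$1" --no-headers \
    -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name
}

# Usage (my-vm is a placeholder workload name):
#   find_vmi my-vm
#   virtctl vnc <vmi-name> -n <gvc-namespace>       # graphical console
#   virtctl console <vmi-name> -n <gvc-namespace>   # serial console
```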
virtctl must match the cluster’s KubeVirt version, and your kubeconfig user needs access to the virtualmachineinstances/vnc subresource. The serial console (boot log, login banner) is also available as virtctl console <vmi-name> -n <gvc-namespace>.
Troubleshooting
A VM workload never starts and no instance appears
Symptom: The workload deploys but stays not-ready, and kubectl get vmi -A -l cpln/workload=<workload-name> returns nothing, or the VM instance is stuck Pending/Scheduling. The location may not report vm in its capabilities.
Cause: No VM-capable node, or KubeVirt’s components have nowhere to run. VMs and the KubeVirt control plane are scoped to nodes labelled cpln.io/nodeType: vm.
Fix: Confirm at least one ready, labelled node exists, for example by listing nodes with the selector cpln.io/nodeType=vm. If the list is empty, add the label (and the matching taint) to a node pool on KVM-capable instances as shown in How to enable, and verify the kubevirt add-on has reconciled.
Boot disk import is stuck
Symptom: The deployment status sits at Importing boot disk: NN% and never reaches running.
Cause: CDI imports the boot disk into a PVC before the VM can boot. Imports stall when there is no working default StorageClass, or when the default StorageClass is block-mode and CDI has no filesystem-mode scratch space.
Fix: Ensure a default StorageClass exists. If it is block-mode, set scratchSpaceStorageClass to a filesystem-mode class. Inspect the importer pod for the underlying error.
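Inspecting a stalled import with kubectl can be sketched as below. The cdi.kubevirt.io=importer label is what upstream CDI puts on importer pods and is assumed to hold on this platform; the namespace argument is assumed to be the GVC namespace where the VM instance runs. Both helper names are illustrative.

```shell
# Show DataVolume phase and progress, then the importer pods, in a namespace.
# Argument: namespace.
show_import_state() {
  ns="$1"
  kubectl get datavolumes -n "$ns"
  kubectl get pods -n "$ns" -l cdi.kubevirt.io=importer
}

# Print recent log lines from the importer pods in a namespace.
# Argument: namespace.
importer_logs() {
  kubectl logs -n "$1" -l cdi.kubevirt.io=importer --tail=50
}

# Usage:
#   show_import_state <gvc-namespace>
#   importer_logs <gvc-namespace>
#   kubectl get storageclass    # the default class is marked "(default)"
```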
Windows VM fails to boot (UEFI)
Symptom: The VM never reaches the OS; the VNC/serial console shows a UEFI error such as failed to load Boot0001 ... from PciRoot(0x0).
Cause: Control Plane runs these VMs with non-persistent EFI NVRAM, so the firmware boots with an empty variable store. The image must place the Windows boot manager at the UEFI fallback path (\EFI\BOOT\BOOTX64.EFI); without it the firmware has nothing to boot. Enabling Secure Boot causes the same class of failure because it requires persistent NVRAM.
Fix: Rebuild the Windows image with the fallback bootloader (see Publishing & converting VM images). Do not set firmware.secureBoot; it is not yet available for this reason.
Raw TCP to another workload over the mesh resets (HTTP works)
Symptom: A VM (often Windows) reaches another workload’s internal endpoint fine over HTTP, but a plain TCP protocol (SQL, RDP, custom TCP) connects only intermittently and otherwise resets.
Cause: Istio’s DNS proxy auto-allocates a virtual IP per ServiceEntry that can resolve to a non-serving endpoint; HTTP re-routes by Host header, raw TCP does not.
Fix: Enable the MESH_DISABLE_IP_AUTOALLOCATE actuator setting for that location, then reconcile the workload and re-establish the connection.
Can't connect to the VM from my machine
Symptom: SSH or RDP to the VM times out from outside the cluster.
Cause: A VM has no public endpoint by default; its ports only govern in-cluster service-mesh traffic.
Fix: Reach the guest with cpln port-forward. It tunnels directly to a replica, and the target port does not need to be listed in the workload’s ports. Forward a local port (13389 in this example) to the guest port, then connect your client to localhost:13389. For public HTTP access, attach a domain instead.
VM can't resolve cpln.local or reach other workloads
Symptom: Inside the guest, *.cpln.local names or cross-location peers fail to resolve.
Cause: VM guests resolve in-cluster names through a platform DNS forwarder; large answers can be truncated over UDP on some guests.
Fix: Enable the nodeLocalDns add-on (recommended alongside KubeVirt). On Linux guests you can also force TCP DNS by adding options use-vc to /etc/resolv.conf from cloud-init.
Next steps
- VM Workloads: Configure and run a VM
- Publishing VM Images: Build boot images
- Volume Sets: Storage for boot and data disks
- mk8s Overview: Managed Kubernetes basics