
Overview

The KubeVirt add-on lets an mk8s cluster run VM workloads (spec.type: vm). It installs the KubeVirt and CDI (Containerized Data Importer) operators and configures them for Control Plane: VM components are scoped to a dedicated VM node pool, the feature gates the platform needs are enabled, and boot-disk imports are wired through CDI. Once the add-on is enabled and at least one VM-capable node is present, the location reports vm in its capabilities and VM workloads can be scheduled there.

Supported providers

Any provider whose nodes can expose hardware virtualization (/dev/kvm) is supported. In practice that means bare-metal nodes or instances with nested virtualization enabled. On AWS, nested virtualization is available on 8th-generation Intel instance types (c8i, m8i, r8i and variants); bare-metal (.metal) instances also work. On generic / BYOK clusters, use bare-metal hosts or VMs that pass through KVM.
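As a rough pre-flight check on a candidate host, the sketch below tests for the KVM device and for the CPU virtualization flags. The function names `has_kvm_device` and `cpu_has_virt_flags` are illustrative, not part of any platform tooling, and the flag check assumes a Linux host with a cpuinfo-style file.

```shell
# Returns 0 when a KVM character device exists at the given path
# (defaults to /dev/kvm).
has_kvm_device() {
  [ -c "${1:-/dev/kvm}" ]
}

# Returns 0 when the CPU advertises Intel VT-x (vmx) or AMD-V (svm).
# Reads a cpuinfo-style file; defaults to /proc/cpuinfo on Linux.
cpu_has_virt_flags() {
  grep -Eqw 'vmx|svm' "${1:-/proc/cpuinfo}"
}
```

Run on the node itself, `has_kvm_device && cpu_has_virt_flags && echo "KVM available"` prints the message only when both checks pass.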

Requirements

Setting up a cluster for VMs has four parts:
1. A node pool with hardware virtualization

VM nodes must be able to run KVM. On AWS, enable nested virtualization on a supported instance type via cpuOptions.nestedVirtualization; elsewhere use bare-metal / KVM-capable hosts.

2. Label (and ideally taint) the VM node pool

KubeVirt’s components and the VMs themselves are scoped to nodes labelled cpln.io/nodeType: vm. Add that label to the node pool. A matching taint cpln.io/nodeType=vm:NoSchedule is recommended to keep ordinary pods off the VM nodes; KubeVirt’s VM placement tolerates it.

3. Enable the kubevirt add-on

Add kubevirt to addOns. This installs the KubeVirt + CDI operators and their custom resources.

4. Provide storage for boot disks

VM boot disks are imported by CDI into a PVC (backed by a volume set), so the cluster needs a working CSI / default StorageClass. If your default StorageClass is block-mode, set scratchSpaceStorageClass to a filesystem-mode class so CDI has scratch space for imports.

The dedicated cpln.io/nodeType: vm node pool is what makes a node VM-capable. Without a labelled node, the KubeVirt components have nowhere to run and no VM workload can schedule.
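To confirm that labelled nodes also carry the recommended taint, a filter along these lines can help. It is a sketch: `untainted_vm_nodes` is a hypothetical helper name, and it expects one node per line in the form `<name><TAB><taints>`, which the kubectl jsonpath query in the comment is intended to produce.

```shell
# Prints the names of nodes whose taint list lacks
# cpln.io/nodeType=vm:NoSchedule.
# Input (stdin): "<name><TAB><key>=<value>:<effect> ..." per line.
untainted_vm_nodes() {
  awk -F'\t' '$2 !~ /cpln\.io\/nodeType=vm:NoSchedule/ { print $1 }'
}

# Example invocation against a live cluster (not run here):
# kubectl get nodes -l cpln.io/nodeType=vm \
#   -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{range .spec.taints[*]}{.key}={.value}:{.effect} {end}{"\n"}{end}' \
#   | untainted_vm_nodes
```

An empty result means every labelled node is tainted. Any names printed are VM nodes that ordinary pods can still land on.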

How to enable

Cluster manifest (AWS example)

A dedicated VM node pool with nested virtualization, the required label and taint, plus the add-on:
YAML
spec:
  provider:
    aws:
      # ...existing provider config...
      nodePools:
        - name: vm
          instanceTypes:
            - c8i.xlarge          # 8th-gen Intel — supports nested virtualization
          cpuOptions:
            nestedVirtualization: true
          labels:
            cpln.io/nodeType: vm
          taints:
            - key: cpln.io/nodeType
              value: vm
              effect: NoSchedule
          minSize: 1
          maxSize: 4
          bootDiskSize: 100
          subnetIds:
            - ${SUBNET_1}
  addOns:
    kubevirt: {}
    nodeLocalDns: {}            # recommended (see DNS note below)
On generic / BYOK clusters, label and taint the node pool the same way and ensure the hosts expose /dev/kvm; nested-virtualization flags are provider-specific.

Console

When creating or editing the cluster, open Add-ons, toggle on KubeVirt, and make sure you have a node pool labelled cpln.io/nodeType: vm on VM-capable instances.

Configuration

Option | Description
scratchSpaceStorageClass | Filesystem-mode StorageClass CDI uses for import scratch space. Required when the cluster default StorageClass is block-mode; otherwise optional.
YAML
addOns:
  kubevirt:
    scratchSpaceStorageClass: my-filesystem-sc
The add-on also configures the KubeVirt CR automatically — the DataVolumes, BlockVolume, and Sidecar feature gates, and the VM node placement — so you don’t set these yourself.
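If you want to confirm that the feature gates were applied, the following sketch checks a gate list for the three names mentioned above. `missing_feature_gates` is an illustrative helper, and the jsonpath in the comment assumes the gates live under `spec.configuration.developerConfiguration.featureGates` of the KubeVirt CR, as in upstream KubeVirt.

```shell
# Reads a feature-gate list on stdin (JSON array or space-separated)
# and prints each expected gate that is absent, one per line.
missing_feature_gates() {
  gates_input=$(cat)
  for gate_name in DataVolumes BlockVolume Sidecar; do
    printf '%s' "$gates_input" | grep -qw -- "$gate_name" \
      || printf '%s\n' "$gate_name"
  done
}

# Example invocation against a live cluster (not run here):
# kubectl get kubevirt -A \
#   -o jsonpath='{.items[*].spec.configuration.developerConfiguration.featureGates}' \
#   | missing_feature_gates
```

No output means all three gates are present. Any printed name suggests the add-on has not finished reconciling.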

DNS

VM guests reach in-cluster services through a platform DNS forwarder. The nodeLocalDns add-on (a per-node CoreDNS cache) is recommended alongside KubeVirt for reliable VM DNS. It is not strictly required — VM cloud-init falls back to TCP DNS (options use-vc) — but enabling it avoids edge cases.

Verify

After the add-on reconciles and a VM node is ready:
  • The cluster’s location reports vm under its workload-type capabilities.
  • A VM workload targeting this location schedules, imports its boot disk (watch Importing boot disk: NN% in the deployment status), and boots.
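When scripting around the import step, it can be handy to pull the percentage out of the status message. The helper below is a sketch: `import_progress` is a hypothetical name, and it assumes the status text contains the `Importing boot disk: NN%` string exactly as shown above. How you fetch the deployment status text is left to your own tooling.

```shell
# Extracts NN from a line containing "Importing boot disk: NN%".
# Prints nothing when the line does not contain an import status.
import_progress() {
  sed -n 's/.*Importing boot disk: \([0-9][0-9]*\)%.*/\1/p'
}
```

Sampling the value twice, a few minutes apart, is a simple way to tell a slow import from a stalled one.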

Console access

For day-to-day access, use the platform paths documented on the VM workload page — they work the same on your own cluster and require no kubectl:
  • SSH (Linux): cpln workload connect / cpln workload exec.
  • RDP (Windows) or any TCP service: cpln port-forward to the guest port, then connect to localhost.
When a guest will not boot far enough to accept SSH or RDP — a kernel panic, a UEFI failure, or a stalled first boot — you need the raw graphical console. On a cluster you operate (mk8s or BYOK) you have direct access, so use KubeVirt’s virtctl against the cluster:
1. Find the VM instance and its namespace

VM instances are labelled with their workload name and run in the GVC namespace:
kubectl get vmi -A -l cpln/workload=<workload-name>

2. Open the VNC console

Point virtctl at the instance (use the NAME and NAMESPACE from the previous step). virtctl launches your local VNC viewer:
virtctl vnc <vmi-name> -n <gvc-namespace> --kubeconfig <cluster-kubeconfig>
If you have no local VNC client, run with --proxy-only; virtctl prints a local 127.0.0.1:<port> address you can point any VNC client at.

virtctl must match the cluster’s KubeVirt version, and your kubeconfig user needs access to the virtualmachineinstances/vnc subresource. The serial console (boot log, login banner) is also available as virtctl console <vmi-name> -n <gvc-namespace>.
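One way to check the version requirement is to compare the client and server lines that `virtctl version` prints. The sketch below assumes the output has `Client Version:` and `Server Version:` lines, each containing `GitVersion:"vX.Y.Z"`. That format is an assumption and may differ between releases; `virtctl_versions_match` is an illustrative name.

```shell
# Reads `virtctl version` output on stdin and succeeds (exit 0) only
# when the client and server GitVersion strings are identical.
virtctl_versions_match() {
  awk -F'"' '
    /^Client Version:/ { client = $2 }
    /^Server Version:/ { server = $2 }
    END { exit !(client != "" && client == server) }
  '
}

# Example invocation against a live cluster (not run here):
# virtctl version --kubeconfig <cluster-kubeconfig> | virtctl_versions_match \
#   || echo "virtctl and cluster KubeVirt versions differ"
```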

Troubleshooting

Symptom: The workload deploys but stays not-ready, and kubectl get vmi -A -l cpln/workload=<workload-name> returns nothing, or the VM instance is stuck Pending/Scheduling. The location may not report vm in its capabilities.
Cause: No VM-capable node, or KubeVirt’s components have nowhere to run. VMs and the KubeVirt control plane are scoped to nodes labelled cpln.io/nodeType: vm.
Fix: Confirm at least one ready, labelled node exists:
kubectl get nodes -l cpln.io/nodeType=vm
If the list is empty, add the label (and the matching taint) to a node pool on KVM-capable instances as shown in How to enable, and verify the kubevirt add-on has reconciled.

Symptom: The deployment status sits at Importing boot disk: NN% and never reaches running.
Cause: CDI imports the boot disk into a PVC before the VM can boot. Imports stall when there is no working default StorageClass, or when the default StorageClass is block-mode and CDI has no filesystem-mode scratch space.
Fix: Ensure a default StorageClass exists. If it is block-mode, set scratchSpaceStorageClass to a filesystem-mode class:
YAML
addOns:
  kubevirt:
    scratchSpaceStorageClass: my-filesystem-sc
Inspect the importer pod for the underlying error:
kubectl logs -n <gvc-namespace> -l app=containerized-data-importer
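To see which StorageClass, if any, is marked as the cluster default, you can filter the standard `kubectl get storageclass` listing, where kubectl appends `(default)` to the default class name. `default_storage_classes` is an illustrative helper name, not a platform command.

```shell
# Reads `kubectl get storageclass --no-headers` output on stdin and
# prints the name of every class flagged "(default)".
default_storage_classes() {
  awk '$2 == "(default)" { print $1 }'
}

# Example invocation against a live cluster (not run here):
# kubectl get storageclass --no-headers | default_storage_classes
```

No output means there is no default StorageClass, which matches the stalled-import cause described above.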

Symptom: The VM never reaches the OS; the VNC/serial console shows a UEFI error such as failed to load Boot0001 ... from PciRoot(0x0).
Cause: Control Plane runs these VMs with non-persistent EFI NVRAM, so the firmware boots with an empty variable store. The image must place the Windows boot manager at the UEFI fallback path (\EFI\BOOT\BOOTX64.EFI); without it the firmware has nothing to boot. Enabling Secure Boot causes the same class of failure because it requires persistent NVRAM.
Fix: Rebuild the Windows image with the fallback bootloader (see Publishing & converting VM images). Do not set firmware.secureBoot; it is not yet available for this reason.

Symptom: A VM (often Windows) reaches another workload’s internal endpoint fine over HTTP, but a plain TCP protocol (SQL, RDP, custom TCP) connects only intermittently and otherwise resets.
Cause: Istio’s DNS proxy auto-allocates a virtual IP per ServiceEntry that can resolve to a non-serving endpoint; HTTP re-routes by Host header, raw TCP does not.
Fix: Enable the MESH_DISABLE_IP_AUTOALLOCATE actuator setting for that location, then reconcile the workload and re-establish the connection.

Symptom: SSH or RDP to the VM times out from outside the cluster.
Cause: A VM has no public endpoint by default; its ports only govern in-cluster service-mesh traffic.
Fix: Reach the guest with cpln port-forward, which tunnels directly to a replica; the target port does not need to be listed in the workload’s ports:
cpln port-forward <workload-name> 13389:3389 --gvc <gvc> --location <location>
Then connect your client to localhost:13389. For public HTTP access, attach a domain instead.

Symptom: Inside the guest, *.cpln.local names or cross-location peers fail to resolve.
Cause: VM guests resolve in-cluster names through a platform DNS forwarder; large answers can be truncated over UDP on some guests.
Fix: Enable the nodeLocalDns add-on (recommended alongside KubeVirt). On Linux guests you can also force TCP DNS by adding options use-vc to /etc/resolv.conf from cloud-init.
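For the Linux guest fix, a small idempotent helper keeps the option from being appended more than once if the script reruns. This is a sketch: `ensure_use_vc` is a hypothetical name, and it assumes the guest's resolv.conf is a plain file that no other resolver service regenerates.

```shell
# Appends "options use-vc" to a resolv.conf file unless an options
# line already contains it. Defaults to /etc/resolv.conf.
ensure_use_vc() {
  resolv_conf="${1:-/etc/resolv.conf}"
  grep -Eq '^options[[:space:]](.*[[:space:]])?use-vc([[:space:]]|$)' \
    "$resolv_conf" 2>/dev/null \
    || printf 'options use-vc\n' >> "$resolv_conf"
}
```

From cloud-init, the function could be written to a script and called from a boot-time command. The exact wiring depends on the image.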

Next steps

  • VM Workloads: Configure and run a VM
  • Publishing VM Images: Build boot images
  • Volume Sets: Storage for boot and data disks
  • mk8s Overview: Managed Kubernetes basics