
Is nested VMX virtualization in the Linux kernel really that stable?

The technical details are a lot more complex than most realize.

Single-level VMX virtualization is relatively straightforward, even if there are a lot of details to juggle in VMCS setup and handling exits.

Nested virtualization is a whole other animal: you now have to handle not just the extra level but many things the hardware normally does for you, plus juggle internal state during transitions between levels.

The LKML is filled with discussions and debates where very sharp contributors are trying to make sense of how it would work.

Amazon turning the feature on is one thing. It working 100% perfectly is quite another…



Fair concern, but this has been quietly production-stable on GCP and Azure since 2017 — that's 8+ years at cloud scale. The LKML debates you're referencing are mostly about edge cases in exotic VMX features (nested APIC virtualization, SGX passthrough), not the core nesting path that workloads like Firecracker and Kata actually exercise.

The more interesting signal is that AWS is restricting this to 8th-gen Intel instances only (c8i/m8i/r8i). They're likely leveraging specific microarchitectural improvements in those chips for VMCS shadowing — picking the hardware generation where they can guarantee their reliability bar rather than enabling it broadly and dealing with errata on older silicon. That's actually the careful engineering approach you'd want from a cloud provider.


Curiously, 8th-gen Intel is also about the minimum for Windows 11… (which can virtualize most of the kernel)


It's been around for almost 15 years, and it's been stable enough that several providers have run it in production for most of the past decade (GCP and Azure since 2017).

AWS is just late to the game because they've rolled so much of their own stack instead of adapting open source solutions and contributing back to them.


> AWS is just late to the game because they've rolled so much of their own stack instead of adapting open source solutions and contributing back to them.

This is emphatically not true. Contributing to KVM and the kernel (which AWS does anyway) would not have accelerated the availability.

EC2 is not just a data center with commodity equipment. They have customer demands for security and performance that far exceed what one can build with a pile of OSS, to the extent that they build their own compute and networking hardware. They even have CPU and other hardware SKUs not available to the general public.


As do all the other cloud providers, which have still had this for years: GCP and Azure, going on 9 years now.


Architecturally they’re all quite different.

If my sources are correct, GCP did not launch on dedicated hardware like EC2 did, which raised customer concerns about isolation guarantees. (Not sure if that’s still the case.) And Azure didn’t have hardware-assisted I/O virtualization ("Azure Boost") until just a few years ago and it's not as mature as Nitro.

Even today, Azure doesn’t support nested virtualization the way one might ordinarily expect them to. It's only supported with Hyper-V on the guest, i.e., Windows.


Nested virtualisation with KVM works on the Linux GitHub Actions runners which I believe run on Azure.


GitHub says:

> While nested virtualization is technically possible while using runners, it is not officially supported. Any use of nested VMs is experimental and done at your own risk, we offer no guarantees regarding stability, performance, or compatibility.

https://docs.github.com/en/actions/concepts/runners/github-h...


It seems to work for my https://github.com/libriscv/kvmserver tests at least.



