Vmscape and Why Xen Dodged It
Posted 3 months ago · Active 3 months ago
virtualize.sh · Tech · story
Key topics: Xen, Virtualization, Security, Vmscape
The article discusses how Xen dodged the VMScape vulnerability, sparking a discussion on the technical details and implications of the vulnerability, as well as the architecture of Xen and other virtualization technologies.
Snapshot generated from the HN discussion
Discussion activity: 38 loaded comments, averaging 4.8 per period.
Key moments
- Story posted: Sep 28, 2025 at 2:19 PM EDT
- First comment: Sep 28, 2025 at 3:24 PM EDT (1h after posting)
- Peak activity: 13 comments in 0-6h
- Latest activity: Sep 30, 2025 at 5:41 PM EDT
ID: 45406573 · Type: story · Last synced: 11/20/2025, 5:11:42 PM
https://tornadovps.com/about
The founders literally wrote the book on Xen:
https://nostarch.com/releases/xen.html
Citrix involvement has subsided in the meantime and the ecosystem is much healthier (governance is actually under the Linux Foundation), but the damage was done.
Xen also still lacks features to this day.
If you have a spare machine or feel like picking up a tiny form factor i5 PC, you can play with Xen and Xen Orchestra fairly easily.
I once ran a three-node cluster of tiny form factor (TFF) PCs running Xen, and it was a good framework for learning. The only reason I moved away from it is that the TFF PCs only had one gigabit Ethernet port and were limited to 32GB of RAM. I moved to more traditional small desktop PCs so I could add multiple 10-gigabit Ethernet interfaces and more RAM.
Someday, I'll write up how I did an easy DMZ with XCP-ng.
* management stuff mostly lives in Dom0
* Xen does the flushes to protect VMs from each other
If you didn't do the first, then attacks on the host might work, and if you didn't do the second then attacks on Dom0 might work, but the combination blocks both vectors. Is that about right?
I guess it’s sort of off topic, but I was enjoying reading this until I got to the “That’s not just elegant — it’s a big deal for security” line that smelled like LLM-generated content.
Maybe that reaction is hypocritical. I like LLMs; I use them every day for coding and writing. I just can’t shake the feeling that I’ve somehow been swindled if the author didn’t care enough to edit out the “obvious” LLM tells.
[0]: https://comsec-files.ethz.ch/papers/vmscape_sp26.pdf
(The irony of opening with this pattern is not lost on me.)
As an aside, Wikipedia has a fascinating document identifying common "tells" for LLM-generated content:
https://en.wikipedia.org/wiki/Wikipedia:Signs_of_AI_writing
I also tend to use a lot of em dashes. If I posted something I wrote in, say, 2010, I'd likely get a lot of comments about my writing absolutely, 100% being AI-written. I have posted old writing snippets in the past year and gotten this exact reaction.
I originally (two decades ago) started using em dashes, I think, because I also tend to go off on frequent tangents or want to add additional context, and at the beginning of the tangent, I'm not entirely sure how I'll phrase it. So, instead of figuring out the best punctuation at that moment (be that a parenthesis, a comma, or a semicolon for a list), I'll just type an em dash (easy on a Mac).
Then I don't go back and fix it afterward because I have too many thoughts and not enough time to express them. There are popular quotes about exactly this issue.
It's a kind of laziness in the form of my expression to give me more mental capacity to focus on the content. Alt 0151 and Alt 0150 are still burned into my memory from typing em dashes and en dashes so often on Windows.
I suppose I'll have to consider this my own punctuation mode collapse that RLHF is now forcing me to correct.
I use Grammarly because it helps fix speech recognition errors. One of the challenges of speech recognition use is that it is a bit difficult at times to construct grammatically correct sentences in your head, then speak those sentences, and then proofread them before you start the next bit of writing.
While microkernels are great for overall security, it's also not obvious to me how that design helped in this case.
Agreed on the point about hw-level mitigation. The leakage still exists. Containing it in a watertight box is quick and effective, and it does avoid extra overhead. But it doesn't patch the hole.
Is it just because it’s another VM switch to get to dom0? Seems a bit unlikely…
Xen has a hypervisor for dealing with the low level details of virtualization and uses dom0 for management and some HW emulation.
QEMU/KVM uses the host kernel for the low level details of virtualization and the QEMU userspace portion to do the actual HW emulation.
They’re actually remarkably similar aside from the detail that the Xen hypervisor only juggles VMs but the KVM design involves it juggling other normal processes…
The people praising Firecracker are just turning a blind eye to the 10000+ lines of (really hairy) C code in the kernel doing x86 instruction emulation and the actual hypervisor part.
It's weird to me that cloud hosts aren't absolutely swimming in cores now, but with Intel struggling and AMD somewhat resting on its laurels, which it stupidly did in the Hector Ruiz days, nothing is pushing the envelope. In 2010, fifteen years ago, we had 12-core CPUs.
In 2010 we had a billion or so transistors. In 2020, we had 50 billion. In 2010 we were at 28nm, now we're at 3nm.
We should have 100x the CPUs on die now or more: a thousand x86 cores, god knows how many Arm cores, and god knows how much you could do with high-low core mixes.
Anyway, what I'm getting at is all of these vulnerabilities across process execution or VM execution could be moot: if the processes were isolated to a core or set of cores, and the VM isolated to its own dedicated branch predictors in its own cores. Then go ahead and do whatever tricks you want. Obviously you don't want hyper-threading.
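A minimal sketch of the partitioning idea in this comment, assuming a static allocator that never lets two VMs share a physical core (and therefore never share that core's branch predictor). The function name and the VM names are hypothetical, and a real scheduler would also have to deal with SMT siblings, NUMA, and interrupts:

```python
from typing import Dict, List


def assign_dedicated_cores(vm_core_counts: Dict[str, int],
                           total_cores: int) -> Dict[str, List[int]]:
    """Give every VM its own disjoint, contiguous set of physical core IDs."""
    needed = sum(vm_core_counts.values())
    if needed > total_cores:
        # Refuse to oversubscribe: sharing a core would mean sharing a predictor.
        raise ValueError(f"need {needed} cores but only {total_cores} are available")

    assignment: Dict[str, List[int]] = {}
    next_core = 0
    for vm, count in vm_core_counts.items():
        assignment[vm] = list(range(next_core, next_core + count))
        next_core += count
    return assignment


if __name__ == "__main__":
    # Hypothetical VM names; any leftover cores stay unassigned.
    print(assign_dedicated_cores({"vm-a": 2, "vm-b": 3}, total_cores=8))
```

The trade-off is the one the comment implies: isolation comes from never time-sharing a core, so capacity is bounded by the core count rather than by how idle the VMs are.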
Hyper-V is a type 1 hypervisor. When it is enabled, which is required for many security measures in modern Windows, the first Windows instance is a privileged guest, just like with Xen.
Additionally, anyone using WSL 2.0 is running another set of VMs alongside Windows, depending on how many flavours of Linux and containers are configured.
Imminent unification of Android and ChromeOS will likely use a similar h/w nested-virt architecture based on L0 pKVM + L1 KVM hypervisors on Arm devices.
Honda is using Xen, "How to accelerate Software Defined Vehicle" (2025), https://static.sched.com/hosted_files/xensummit2025/93/HowTo...
VM exceptions are all handled by VMM. A VM escape would still be confined in VMM, which has no higher capabilities than the VM itself. Capabilities are enforced by the formally verified seL4.
It's great to see an article highlighting the impact of VMScape on Xen, especially since our paper [1] does not discuss Xen in detail (we only briefly mention it in the blog post [2]).
That said, the article unfortunately lacks technical precision. Some statements are vague, and "our quote" ("According to the ETH team") is misleading, as those are not our words. To be clear: VMScape is not a cross-VM attack. So please treat such summaries with caution.
Here are some clarifications:
The core issue lies in the hardware. On all AMD Zen CPUs, the branch prediction unit cannot natively distinguish between host user, guest-1 user, and guest-2 user domains (newer Intel CPUs can do so to some extent). Supervisor domains (host or guest kernel) are protected by the CPU effectively disabling speculative execution in those domains. But because user domains share branch predictor state, execution in one can control speculation in another - the fundamental root of Spectre-BTI. To enforce isolation, predictors must be flushed (IBPB) whenever transitioning between such domains.
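To make the shared-state problem concrete, here is a toy model, not a description of any real microarchitecture: the predictor is a table keyed by branch address only, with no domain tag, so whatever one user domain trains is what the next one speculates on unless a barrier clears it. The class, the addresses, and the `0xBAD` target are all made up for illustration:

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ToyBranchPredictor:
    """Toy branch target table whose entries carry no domain tag."""
    targets: Dict[int, int] = field(default_factory=dict)

    def train(self, branch_addr: int, target: int) -> None:
        # Keyed by address only, so any domain can install an entry.
        self.targets[branch_addr] = target

    def predict(self, branch_addr: int) -> Optional[int]:
        # Speculative target, or None if nothing has been learned.
        return self.targets.get(branch_addr)

    def ibpb(self) -> None:
        # Models the barrier: forget every learned target.
        self.targets.clear()


def speculative_target(predictor: ToyBranchPredictor,
                       flush_on_switch: bool) -> Optional[int]:
    # Domain A (e.g. guest user) trains an entry that aliases a branch in domain B.
    predictor.train(branch_addr=0x4000, target=0xBAD)
    if flush_on_switch:
        predictor.ibpb()  # barrier on the A -> B transition
    # Domain B (e.g. host user) now reaches the aliasing indirect branch.
    return predictor.predict(0x4000)
```

Without the flush, domain B speculates to the target chosen by domain A; with it, the prediction is gone. That is the whole point of issuing an IBPB on the transition.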
On Linux KVM, an IBPB is issued on guest-1 to guest-2 switches and on process switches. However, because a guest runs in the same process as its userspace hypervisor (e.g. QEMU, Firecracker, etc.), there is no isolation mechanism in place for this transition. VMScape exploits exactly this gap. The mitigation is to add an IBPB on guest to host userspace transitions.
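For readers who want to see what their own KVM host reports, a minimal sketch follows. It assumes a Linux kernel recent enough to expose a `vmscape` entry in the usual sysfs vulnerabilities directory; on older kernels or other operating systems the file simply will not exist, and the exact status strings are whatever the kernel chooses to print. The function name is hypothetical:

```python
from pathlib import Path
from typing import Optional

# Assumed location; the "vmscape" entry is only present on kernels carrying the mitigation.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def read_vmscape_status(vuln_dir: Path = VULN_DIR) -> Optional[str]:
    """Return the kernel's reported VMScape status, or None if it is not exposed."""
    entry = vuln_dir / "vmscape"
    try:
        return entry.read_text().strip()
    except OSError:
        # Missing file (older kernel, non-Linux host) or unreadable entry.
        return None


if __name__ == "__main__":
    print(read_vmscape_status() or "no vmscape entry exposed by this kernel")
```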
Xen, while also running on the same flawed hardware, is not vulnerable to VMScape. But the reason is not (just) asynchronism. Asynchronism makes exploitation only harder. Instead, the key reason is that the equivalent of Linux's userspace hypervisor runs inside Dom0 on Xen, which is itself "treated like a guest". Because Xen already issues IBPBs between guest transitions, Dom0 is protected from DomU.
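The difference between the two designs can be sketched as a toy flush policy. This only encodes the rules as described in this comment; it is not how the KVM or Xen code is actually structured, and the `Context` type, domain names, and process names are hypothetical:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Context:
    label: str
    domain: str                          # "host", "guest-1", "dom0", "domU", ...
    host_process: Optional[str] = None   # KVM only: the owning host process


def kvm_issues_ibpb(src: Context, dst: Context, mitigated: bool = False) -> bool:
    """Barrier on guest-to-guest and process switches, plus the added mitigation."""
    guest_to_guest = "host" not in (src.domain, dst.domain) and src.domain != dst.domain
    process_switch = src.host_process != dst.host_process
    guest_to_host_user = src.domain != "host" and dst.domain == "host"
    return guest_to_guest or process_switch or (mitigated and guest_to_host_user)


def xen_issues_ibpb(src: Context, dst: Context) -> bool:
    """Dom0 is scheduled like any other domain, so DomU -> Dom0 is a guest switch."""
    return src.domain != dst.domain


if __name__ == "__main__":
    guest = Context("guest-1 user", "guest-1", host_process="qemu-1")
    qemu = Context("QEMU for guest-1", "host", host_process="qemu-1")
    dom_u = Context("DomU user", "domU")
    device_model = Context("device model in Dom0", "dom0")

    print("KVM, unmitigated:", kvm_issues_ibpb(guest, qemu))
    print("KVM, mitigated:  ", kvm_issues_ibpb(guest, qemu, mitigated=True))
    print("Xen:             ", xen_issues_ibpb(dom_u, device_model))
```

In this model the guest and its QEMU share a host process, so neither existing KVM rule fires until the extra guest to host userspace barrier is added, while on Xen the same pair of contexts sits in two different domains and is already covered.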
Assigning responsibility for vulnerabilities at the hardware–software boundary is inherently challenging and often depends on implicit assumptions about the threat model. VMScape introduces a novel threat model that had not been considered before. Consequently, the responsible entities concluded that the lack of host/guest branch predictor state isolation does not qualify as a hardware issue, since adequate mitigations, such as IBPB, are readily available, but insufficiently used by software.
[1] https://comsec-files.ethz.ch/papers/vmscape_sp26.pdf
[2] https://comsec.ethz.ch/research/microarch/vmscape-exposing-a...
The takeaway from that paper (imo, afaict) is that guest userspace can influence indirect predictor entries in KVM host userspace. I don't really know anything about Xen, but presumably it is unaffected because there is no Xen host userspace, just a tiny hypervisor running privileged code in the host context. With KVM, Linux userspace is still functional in the host context.
Presumably, the analogy to host kernel/userspace in KVM is dom0, but in Xen this is a guest VM. If cross-guest cases are mitigated in Xen (like in the case of KVM, see Table 2 in the paper), you'd expect that this attack just doesn't apply to Xen. Apart from there being no interesting host userspace, IBPB/STIBP might be enough to insulate other guests from influencing dom0. If you're already taking the hit of resetting the predictors when entering dom0, presumably you are not worried about this particular bug.
edit: Additional reading, see https://github.com/xen-project/xen/blob/master/xen/arch/x86/...