Simd City: Auto-Vectorisation
Key topics
The unpredictable nature of auto-vectorization has sparked a lively debate, with some commenters praising the compiler's occasional "magical" abilities to deduce vectorization, while others lament its unreliability, citing significant performance hits when it fails to trigger. As one commenter noted, the issue is particularly pronounced with SIMD and floating-point numbers, a topic recently explored by Matt Godbolt. Meanwhile, others are looking to the future, speculating about the potential for large language models to revolutionize compiler optimization, with some even envisioning LLMs as disassemblers. The discussion highlights the complex trade-offs between performance, predictability, and semantic correctness in compiler design.
Snapshot generated from the HN discussion
Discussion Activity
Moderate engagement
- First comment: 7d after posting
- Peak period: 10 comments in 156-168h
- Avg / period: 9.5
Based on 19 loaded comments
Key moments
- Story posted: Dec 20, 2025 at 8:25 AM EST (15 days ago)
- First comment: Dec 27, 2025 at 12:03 AM EST (7d after posting)
- Peak activity: 10 comments in 156-168h (hottest window of the conversation)
- Latest activity: Dec 27, 2025 at 4:59 PM EST (7 days ago)
It's just really hard to detect and exploit profitable and safe vectorization opportunities. The theory behind some of the optimizers is beautiful, though: https://en.wikipedia.org/wiki/Polytope_model
A way to express the operations you want, without unintentionally expressing operations you don't want, would be much easier to auto-vectorise. I'm not familiar enough with SIMD to give examples, but if a transformation preserves the operations you want yet is observably different from what you coded, I assume it's not eligible (unless you enable flags that let the compiler perform optimisations producing code that's not quite what you wrote).
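One classic case of unintentionally expressing operations you don't want is pointer aliasing: with plain C pointers the compiler has to assume the output may overlap the inputs. A minimal sketch (function and parameter names are illustrative, not from the discussion):

```c
#include <stddef.h>

/* With plain pointers, C allows `out` to overlap `a` or `b`, so the loop's
   semantics include overlap cases the author almost certainly never intended;
   the compiler must prove non-overlap or emit a runtime check before vectorising. */
void add_may_alias(const float *a, const float *b, float *out, size_t n) {
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}

/* `restrict` promises no overlap, leaving only the intended operations,
   so the compiler can vectorise this loop unconditionally. */
void add_no_alias(const float *restrict a, const float *restrict b,
                  float *restrict out, size_t n) {
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}
```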
Matt Godbolt wrote about it recently.
https://xania.org/202512/21-vectorising-floats
TL;DR: mathematical notation and the language both specify a particular order in which floating-point operations happen, and the precision limits of the IEEE float representation mean that order has to be honoured by default.
Allowing compilers to reorder things in breach of that contract is an option, but it comes with risks.
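As a concrete illustration (a minimal sketch; the function name is mine): the C semantics of a simple summation loop are a strictly sequential chain of additions, and because float addition rounds differently when reassociated, the compiler won't turn it into a set of partial sums unless you opt in, e.g. with -ffast-math / -fassociative-math in gcc and clang, or an OpenMP `#pragma omp simd reduction(+:sum)` on the loop.

```c
#include <stddef.h>

/* The language pins this down as (((0 + x[0]) + x[1]) + ...), in that exact order.
   Vectorising it means keeping several partial sums and combining them at the end,
   which rounds differently, so it is not a legal transformation by default. */
float sum_f32(const float *x, size_t n) {
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += x[i];
    return sum;
}
```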
the only thing that might stand in the way is a dependence on reproducibility, but it seems like a weak argument: We already have a long history of people trying to push build reproducibility, and for better or worse they never got traction.
Same story with LTO and PGO: I can't think of anyone other than browser and compiler people who are using either (and even they took a long time before they started using them). Judged to be more effort than it's worth, I guess.
Alas, the standards committee is always asking for people like us to join, but few of our billion-dollar companies will pony up any money. This is despite many of them maintaining custom forks of clang.
There is a large presence from the trading industry, less from gaming, but you still see a lot of those guys.
An ML model can fit into existing compiler pipelines anywhere that heuristics are used though, as an alternative to PGO.
I tried it a year or so back and was sorta disappointed at the results beyond simple cases, but it feels like an area that could improve rapidly.
I remember Intel had something like it, but it went nowhere.
You don't want "vectorization", though; you want either
a) a code generation tool that generates exactly the platform-specific code you want and can't silently fail (see the intrinsics sketch below), or
b) at the very least a fundamentally vectorized language that does "scalarization" instead of the other way round.
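For option (a), the closest thing widely available today is writing the SIMD directly with intrinsics, so nothing can silently fall back to scalar code. A minimal sketch assuming an AVX-capable x86-64 target and compilation with -mavx (function and array names are illustrative):

```c
#include <immintrin.h>
#include <stddef.h>

/* Adds 8 floats per iteration with explicit AVX instructions; n is assumed to be
   a multiple of 8 to keep the sketch short (a real version needs a scalar tail). */
void add_f32_avx(const float *a, const float *b, float *out, size_t n) {
    for (size_t i = 0; i < n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);   /* unaligned 8-wide load */
        __m256 vb = _mm256_loadu_ps(b + i);
        _mm256_storeu_ps(out + i, _mm256_add_ps(va, vb));  /* 8 adds in one instruction */
    }
}
```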