Zmij: Faster Floating Point Double-to-String Conversion
Key topics
The quest for faster double-to-string conversion just got a significant boost with the introduction of Zmij, a new algorithm that outpaces its predecessors. Commenters were impressed, with the creator of Grisu, an earlier breakthrough, praising Zmij's performance, and the author revealing that Zmij borrowed an idea from Cassio Neri's work on Teju Jaguá. As contributors chimed in, discussing potential comparisons with Teju Jaguá and sharing their own implementations, a broader question emerged: why do most research efforts focus on double-to-string conversion, leaving string-to-double algorithms in the shadows?
Snapshot generated from the HN discussion
Discussion Activity
Story posted Dec 14, 2025 at 10:42 AM EST; first comment 3 days later (Dec 17, 2025 at 3:45 PM EST); peak of 18 comments in the 84-96h window, averaging 7.2 per period; latest activity Dec 20, 2025 at 12:00 AM EST. Based on 43 loaded comments.
When I published Grisu (Google double-conversion), it was multiple times faster than the existing algorithms. I knew that there was still room for improvement, but I was at most expecting a factor of 2 or so. Six times faster is really impressive.
I wonder how Teju Jaguá compares. I don't see it in the C++ benchmark repo you linked, whose graph you included.
I have contributed an implementation in Rust :) https://crates.io/crates/teju. It includes benchmarks that compare it against Ryu and against Rust's std lib. It's quite easy to run if you're interested!
> A more interesting improvement comes from a talk by Cassio Neri, "Fast Conversion From Floating Point Numbers". In Schubfach, we look at four candidate numbers. The first two, of which at most one is in the rounding interval, correspond to a larger decimal exponent. The other two, of which at least one is in the rounding interval, correspond to the smaller exponent. Cassio’s insight is that we can directly construct a single candidate from the upper bound in the first case.
Another nice thing about your post is mentioning the "shell" of the algorithm, that is, actually translating the decimal significand and exponent to a string (as opposed to the "core", turning f * 2^e into f' * 10^e'). A decent chunk of the overall time is spent there, so it's worth optimising it as well.
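For illustration, here is a minimal sketch of such a shell, printing a decimal significand and exponent in scientific notation. The function and variable names are made up, not taken from Zmij, and fast implementations typically replace the digit loop with two-digit lookup tables:

    // Naive "shell": print significand * 10^exponent10 in scientific notation.
    #include <cstdint>
    #include <cstdio>
    #include <string>

    std::string write_scientific(uint64_t significand, int exponent10, bool negative) {
        std::string digits = std::to_string(significand);
        // The printed exponent refers to the leading digit, so account for
        // the digits that end up after the decimal point.
        int exp = exponent10 + static_cast<int>(digits.size()) - 1;
        std::string out;
        if (negative) out += '-';
        out += digits[0];
        if (digits.size() > 1) {
            out += '.';
            out.append(digits, 1, std::string::npos);
        }
        out += 'e';
        out += std::to_string(exp);
        return out;
    }

    int main() {
        // 17976931348623157 * 10^292 is the decimal form of DBL_MAX.
        std::printf("%s\n", write_scientific(17976931348623157ull, 292, false).c_str());
    }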
Error: roundtrip fail 4.9406564584124654e-324 -> '5.e-309' -> 4.9999999999999995e-309
Error: roundtrip fail 6.6302941479442929e-310 -> '6.6302941479443e-309' -> 6.6302941479442979e-309
Error: roundtrip fail -1.9153028533493997e-310 -> '-1.9153028533494e-309' -> -1.9153028533493997e-309
Error: roundtrip fail -2.5783653320086361e-312 -> '-2.57836533201e-309' -> -2.5783653320099997e-309
Is it, though? It's genuinely hard for me to tell.
Data sets get both serialized and deserialized, e.g. as JSON containing floating point numbers, which implies formatting and parsing, respectively.
Source code (including unit tests etc.) with hard-coded floating point values is compiled, linted, automatically formatted again and again, implying lots of parsing.
Code I usually work with ingests a lot of floating point numbers, but whatever is calculated is seldom displayed as formatted strings and more often gets plotted on graphs.
The conversion to string should produce a hexadecimal number, not a decimal number, so that both serialization and deserialization are trivial and they cannot introduce any errors.
Even if a human inspects the strings produced in this way, comparing numbers to see which is greater or less and examining the order of magnitude can be done as easily as with decimal numbers. Nobody will want to do arithmetic computations mentally with such numbers.
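For what it's worth, standard C and C++ already provide this: the "%a" conversion (and std::hexfloat) emits hexadecimal floating point, and strtod parses it back exactly:

    // Hexadecimal output encodes the bits directly, so the round trip is exact
    // and needs no shortest-digit search or high-precision parsing.
    #include <cassert>
    #include <cstdio>
    #include <cstdlib>

    int main() {
        double x = 0.1;
        char buf[64];
        std::snprintf(buf, sizeof buf, "%a", x);   // e.g. "0x1.999999999999ap-4"
        std::printf("%s\n", buf);
        double y = std::strtod(buf, nullptr);
        assert(x == y);                            // exact round trip
    }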
Unlike formatting, correct parsing involves high precision arithmetic.
Example: the IEEE 754 double closest to the exact value "0.1" is 7205759403792794*2^-56, which has an exact value of A (see below). The next higher IEEE 754 double has an exact value of C (see below). Exactly halfway between these values is B=(A+C)/2.
A=0.1000000000000000055511151231257827021181583404541015625
B=0.100000000000000012490009027033011079765856266021728515625
C=0.10000000000000001942890293094023945741355419158935546875
So for correctness the algorithm needs the ability to distinguish the following extremely close values, because the first is closer to A (must parse to A) whereas the second is closer to C:
0.1000000000000000124900090270330110797658562660217285156249
0.1000000000000000124900090270330110797658562660217285156251
The problem of "string-to-double for the special case of strings produced by a good double-to-string algorithm" might be relatively easy compared to double-to-string, but correct string-to-double for arbitrarily big inputs is harder.
Parsing to binary is often undesirable to begin with.
https://old.reddit.com/r/rust/comments/omelz4/making_rust_fl...
Formatting also requires high precision arithmetic unless you disallow user-specified precision. That's why {fmt} still has an implementation of Dragon4 as a fallback for such silly cases.
https://vitaut.net/posts/2025/smallest-dtoa/
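A two-line illustration of why such a fallback is needed, using plain printf rather than {fmt}: nothing stops a caller from asking for far more digits than the shortest round-trip form provides, and those extra digits still have to be the exact ones.

    #include <cstdio>

    int main() {
        std::printf("%.17g\n", 0.1);  // 0.10000000000000001 (enough to round-trip)
        std::printf("%.60f\n", 0.1);  // the exact decimal expansion of the stored double, zero-padded to 60 places
    }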
And there’s one detail I found confusing. Suppose I go through the steps to find the rounding interval and determine that k=-3, so there is at most one integer multiple of 10^-3 in the interval (and at least one multiple of 10^-4). For the sake of argument, let’s say that -3 worked: m·10^-3 is in the interval.
Then, if m is not a multiple of 10, I believe that m·10^-3 is the right answer. But what if m is a multiple of 10? Then the result will be exactly equal, numerically, to the correct answer, but it will have trailing zeros. So maybe I get 7.460 instead of 7.46 (I made up this number and have no idea whether any double exists that gives this output). Even though that 6 is definitely necessary (there is no numerically different value with decimal exponent greater than -3 that rounds correctly), I still want my formatter library to give me the shortest decimal representation of the result.
Is this impossible for some reason? Is there logic hiding in the write function to simplify the answer? Am I missing something?
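One possible answer, as a sketch with illustrative names (not taken from the post): the writer can strip factors of 10 from the chosen significand before printing, which is roughly how Ryu-style implementations avoid trailing zeros.

    // Strip trailing zeros from the decimal significand so 7460 * 10^-3
    // becomes 746 * 10^-2, i.e. "7.46" rather than "7.460".
    #include <cstdint>
    #include <cstdio>

    void strip_trailing_zeros(uint64_t& significand, int& exponent10) {
        while (significand != 0 && significand % 10 == 0) {
            significand /= 10;
            ++exponent10;
        }
    }

    int main() {
        uint64_t m = 7460;   // the hypothetical value from the question
        int k = -3;
        strip_trailing_zeros(m, k);
        std::printf("%llu * 10^%d\n", (unsigned long long)m, k);  // 746 * 10^-2
    }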
Congratulations, can't wait to have some time to study this further.
C++ also provides countl_zero: https://en.cppreference.com/w/cpp/numeric/countl_zero.html. We currently use our own for maximum portability.
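For reference, the standard facility looks like this (C++20, <bit>); pre-C++20 code typically wraps a compiler builtin such as __builtin_clzll instead:

    #include <bit>
    #include <cstdint>
    #include <cstdio>

    int main() {
        uint64_t significand = 0x001fffffffffffffULL;             // a 53-bit double significand
        std::printf("%d\n", std::countl_zero(significand));       // 11 leading zeros
        std::printf("%d\n", 64 - std::countl_zero(significand));  // bit width: 53
    }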
I considered computing the table at compile time (you can do it in C++ using constexpr) but decided against it to avoid adding compile-time overhead, however small. The table never changes, so I'd rather not have users pay for recomputing it every time.
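For comparison, a constexpr table along the lines described could look like the sketch below (a stand-in powers-of-ten table, not the actual Zmij table); the alternative the author chose is to spell the entries out as literals so nothing is recomputed at build time.

    // Build a 10^0..10^18 table at compile time (C++17 constexpr).
    #include <array>
    #include <cstdint>
    #include <cstdio>

    constexpr std::array<uint64_t, 19> make_pow10() {
        std::array<uint64_t, 19> t{};
        t[0] = 1;
        for (int i = 1; i < 19; ++i) t[i] = t[i - 1] * 10;
        return t;
    }

    constexpr auto pow10 = make_pow10();   // computed by the compiler, stored in the binary

    int main() {
        std::printf("%llu\n", (unsigned long long)pow10[18]);  // 1000000000000000000
    }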
I have been doing some tests. Is it correct to assume that it converts 1.0 to "0.000000000000001e+15"?
Is there a test suite it is passing?
That is what Schubfach does.
The bottleneck is the 3 conditionals:
- positive or negative
- positive or negative exponent, x > 10.0
- correction for 1.xxxxx * 2^Y => fract(log10(2^Y)), 1.xxxxxxxx > 10.0