Correctness and Composability Bugs in the Julia Ecosystem (2022)
Source: yuri.is
Key topics: Julia Programming Language, Software Correctness, Programming Language Design
The post discusses correctness and composability bugs in the Julia ecosystem, sparking debate among commenters about the language's reliability and the significance of the issues raised.
Snapshot generated from the HN discussion
- Story posted: Sep 30, 2025 at 11:46 AM EDT
- First comment: Sep 30, 2025 at 12:02 PM EDT (16m after posting)
- Peak activity: 49 comments in 0-3h
- Latest activity: Oct 2, 2025 at 1:13 PM EDT
ID: 45427021 · Type: story · Last synced: 11/20/2025, 4:56:36 PM
I mention this because this is definitely the sort of content that can age poorly. I have no direct experience, I've never so much as touched Julia.
The previous HN thread:
Correctness and composability bugs in the Julia ecosystem - https://news.ycombinator.com/item?id=31396861 - May 2022 (407 comments)
Edit: since there are (again? I seem to remember this last time) complaints about the title being a bit too baity, I've pilfered that previous title for this thread as well
When I posted the OP, I considered changing the title, but decided not to editorialize it, per the guidelines.
As an experiment, I would be interested to see somebody make a 1-based Python list-like data structure (or a 0-based R array), to check how many third-party (or standard library) functions would no longer work.
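A rough sketch of what that experiment could look like in Python. `OneBasedList`, its `checked` flag, and `first_and_last` are all hypothetical names made up for illustration, not an existing library:

```python
class OneBasedList:
    """Hypothetical list-like container whose valid indices are 1..len."""

    def __init__(self, items, checked=True):
        self._items = list(items)
        self._checked = checked  # assumption: bounds checks can be switched off

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        if self._checked and not 1 <= index <= len(self._items):
            raise IndexError(f"index {index} out of range 1..{len(self._items)}")
        # Without the check, index 0 becomes -1 and silently wraps to the end.
        return self._items[index - 1]


def first_and_last(seq):
    """Generic helper written with the usual 0-based assumption."""
    return seq[0], seq[len(seq) - 1]


print(first_and_last([10, 20, 30]))                               # (10, 30)
print(first_and_last(OneBasedList([10, 20, 30], checked=False)))  # (30, 20)
```

With bounds checks on, the same call raises `IndexError`, which is the friendly outcome. With checks off, the helper quietly returns the wrong pair. That is the shape of the problem: the helper was only ever correct for containers that happen to start at 0.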
I've seen this article several times, and I'm sure Jerf has as well.
Our instinct, after years of being here, is that it isn't a good fit for HN, at least at its current age and as labelled.
It is not conducive to healthy discussion to have an aged[1] blanket dismissal of a language coupled to an assertion that saying "the issues looked fixed?" is denial of people's lived experience.
[1] We can infer it was written in 2021, as the newest issue they created is from then, and they avowed never to use the language again.
https://github.com/scipy/scipy/issues?q=is%3Aissue%20state%3...
Scrolling through this list, it’s clear that many are “correctness issues.”
I do not link this to argue that scipy bugs are more serious or more frequent. I don’t think that kind of statistical comparison is meaningful.
However, I think a motivated reasoner could write a very similar blog post to the OP, but about $arb_python_lib instead of $arb_julia_lib.
I suppose my position is closer to “only Kahan, Boyd and Higham write correct numerical algorithms” (a hyperbole, but illustrative of the difficulty level).
Regardless, overall, these are grossly of another complexity and seriousness than the base sum function being just wrong, or cultural issues among developers with not verifying inputs "for performance", or things of that nature. The scientific Python community has, in my experience, a much higher adherence to good standards than that.
Yes, of course. I am not conflating the two.
> The scientific Python community has, in my experience, a much higher adherence to good standards than that.
Not in my experience. Nor am I defending Julia.
I think a lot of them used “rstudio” to browse the data.
https://www.tidyverse.org/
Does this have any impact on the cosmological emulator written in Julia?
https://news.ycombinator.com/item?id=45346538
[1] https://news.ycombinator.com/item?id=45259623
Again I think this approach is too radical for most ecosystems, but Julia is pursuing a similarly radical level of composability/reusability and evidently encountering difficulties with it, so I think there may be a compatibility there.
There are some proposals to forbid the registration of a package release which trespasses on the internals of another package, though.
I hope someone tackles the above sooner or later, but another issue is the approach of testing every known dependent package might be very costly, both in terms of compute and manual labor, the latter because someone would have to do the work of maintaining a blacklist for packages with flaky unit tests. The good news is that this work might considerably overlap with the already existing PkgEval infrastructure. We'll see.
Any recommended libraries (or languages) that have thoroughly verified libraries?
This is true, and it's a powerful part of the language - but you can implement it incorrectly when you compose elements together that expect some attributes from the custom data type. There is no way to formally enforce that, so you can end up with correctness bugs.
With regard to power and flexibility, homoiconicity and getting to hook into compiler passes does make Julia powerful and flexible in a way that most other languages aren't. But I'm not sure if that power is what results in bugs — more likely it's the function overloading/genericness, whose power and flexibility I think is a bit overstated.
However, the spirit of the original post was about the lack of safeguards and cohesive direction by the community to find ways to preempt such errors. It's not an easy problem to solve, since Julia's composability and flexibility add complexity not encountered in other languages. The current solution is "users beware", while a few people work on ways to enforce correct composability. I think it's best to acknowledge that this is an ongoing issue, rather than to say it's not a problem anymore because the specific ones pointed out are fixed.
Using this approach since 2017, I've never really encountered the types of issues mentioned in Yuri's blog post. The biggest issue I've had is when some user package makes a change that is effectively breaking but doesn't flag the associated release as breaking. But this isn't really a Julia issue so much as a user-space issue, and it can happen in any language when relying on others' libraries.
But Julia fucked it up to where it's not clear what you're using, and library writers don't know which one has been passed! It's insane. They chose style over consistency and correctness and it's caused years of suffering.
My understanding is that it's a difficult problem to solve, and there are people working on traits/interfaces, but to my knowledge these are still peripheral projects and not part of the core mission. In practice, composability problems seldom arise, but there is no formal way to guard against them yet. I believe there was some work done at Northeastern U. [1] toward this goal, but it's still up to the user to "be careful", essentially.
[1] https://repository.library.northeastern.edu/files/neu:4f20cn...
In a language with pervasive use of generic methods, I don't know what "correct" actually means. Say I write a function like `add3()` whose body is nothing but calls to `+`.
Is it correct or not? What does "correct" even mean here? If you call it with values where `+` is defined on them and does what you expect, then my function probably does what you expect too. But if you pass values where `+` does something weird, then `add3()` does something weird too. Is that correct? What expectations should someone have about a function whose behavior is defined in terms of calls to other open-ended generic functions?
My point is that there is an implied contract that `add3()` only does what you expect if you pass it values where `+` happens to do what you expect. When you have a language with fully open generic methods like Julia, it's very powerful, but the trade-off is that every function is effectively like middleware where all it can really say is "if you give me things that delegate to the right things, I'll do the right thing too".
When I'm writing `add3()`, I don't know what `+` does. I'm writing a function in terms of open-ended abstractions that I don't control, so it's very hard to make any promises about the semantics of my function.
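To make the `add3()` point concrete, here is a minimal sketch in Python rather than Julia. Operator overloading stands in for open generic methods, and the `Saturating` type is a made-up example of a `+` that does "something weird":

```python
from fractions import Fraction


def add3(a, b, c):
    """Defined purely in terms of whatever `+` means for the arguments."""
    return a + b + c


class Saturating:
    """Hypothetical integer-like type whose `+` clamps at a ceiling of 10."""

    CEILING = 10

    def __init__(self, value):
        self.value = min(value, self.CEILING)

    def __add__(self, other):
        return Saturating(self.value + other.value)


print(add3(1, 2, 3))                                          # 6
print(add3("a", "b", "c"))                                    # abc
print(add3(Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)))   # 1
print(add3(Saturating(6), Saturating(6), Saturating(6)).value)  # 10, not 18
```

None of these calls is a bug in `add3()` itself. Whether the last result counts as "correct" depends entirely on a contract for `+` that `add3()` never gets to state.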
In Julia it's almost as if every function is an interface, with (usually quite terse) documentation as its only semantic constraint. For example, here is the full documentation for `+`: https://docs.julialang.org/en/v1/base/math/#Base.:+
I love Game Programming Patterns, by the way! Laughed out loud when I first saw the back cover.
Right. I think a big part of this is expectation management. Julia lets you compose unrelated libraries much more freely than most other languages do. That's very powerful, but if you come into it expecting all of those compositions to magically work, I think you just have an unrealistic expectation.
There's no silver bullet when it comes to code reuse and Conway's Law can't be entirely avoided.
> I love Game Programming Patterns, by the way! Laughed out loud when I first saw the back cover.
:D
Yep, and it is unfortunate that this unrealistic expectation is explicitly encouraged by the creators of the language:
> It is actually the case in Julia that you can take generic algorithms that were written by one person and custom types that were written by other people and just use them together efficiently and effectively.
It seems worth reiterating that on a personal level I really like and appreciate the vast majority of the folks I’ve met in the Julia community. I’m glad I got to hang out with them and learn from them. But in my opinion setting expectations like this fosters bad science.
On the contrary, it is my impression that experienced Julia programmers, including those involved in JuliaLang/julia, take correctness seriously. More so than in many other PL communities.
> there are people working on traits/interfaces - but these are still peripheral projects and not part of the core mission to my knowledge
What exactly do you mean by "traits" or "interfaces"? Why do you think these "traits" would help with the issues that bug you?
I think you're actually even more active in the Julia community, so maybe I don't have to summarize the debate, but these are the kinds of trait and interface packages being developed to formalize how modules can be used and extended by others.
https://github.com/rafaqz/Interfaces.jl
https://discourse.julialang.org/t/interfaces-traits-in-julia...
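For readers unfamiliar with the idea, here is a loose Python analogue of an interface that is checked rather than merely implied. It does not show how the linked Julia packages work. The `Addable` protocol and `check_addable_contract` are hypothetical names, and the "laws" tested are just one possible choice:

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Addable(Protocol):
    """Structural interface: anything that defines `+`."""

    def __add__(self, other): ...


def check_addable_contract(x, y, z):
    """Return a list of violated expectations (empty means all passed)."""
    problems = [f"{v!r} has no __add__" for v in (x, y, z)
                if not isinstance(v, Addable)]
    if problems:
        return problems
    # Behavioural expectations, checked only on the given samples.
    if (x + y) + z != x + (y + z):
        problems.append("+ is not associative on the samples")
    if x + y != y + x:
        problems.append("+ is not commutative on the samples")
    return problems


print(check_addable_contract(1, 2, 3))        # []
print(check_addable_contract("a", "b", "c"))  # ['+ is not commutative on the samples']
```

The structural check only says that `+` exists. The sample-based checks are what catch a `+` that exists but behaves differently from what a generic algorithm assumes, and they only cover the samples they are given.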
• The documentation (currently) of the function warns not to use it this way;
• This is a rather perverse use of the function(s) that would be unlikely unless you’re trying to break things;
• The discussion on the issue page demonstrates the exact opposite of a culture not caring about correctness;
• This kind of stuff doesn’t matter to all the scientists who are actually using Julia to do real work.
Nevertheless, sum!() and friends should be, somehow, made to avoid this problem, certainly.
It is good the documentation is now explicit that the behavior is not guaranteed in this case, but even better would be if aliasing were detected and handled (at least for base Julia arrays, so that the warning would only be needed for non-base types).
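A minimal sketch of the aliasing hazard and one possible guard, in Python with plain lists. This is not Julia's `sum!` implementation. `sum_rows_into` and `sum_rows_into_checked` are hypothetical helpers, and the identity check only catches the case where the output is literally one of the inputs, not partial overlap through views:

```python
def sum_rows_into(out, rows):
    """Accumulate the element-wise sum of `rows` into `out`, in place."""
    for i in range(len(out)):
        out[i] = 0  # reset the output first; this destroys an aliased input
    for row in rows:
        for i, value in enumerate(row):
            out[i] += value
    return out


def sum_rows_into_checked(out, rows):
    """Same operation, but refuse to run when `out` is one of the inputs."""
    if any(out is row for row in rows):
        raise ValueError("output aliases one of the inputs")
    return sum_rows_into(out, rows)


print(sum_rows_into([0, 0], [[1, 2], [3, 4]]))  # [4, 6]

a = [1, 2]
print(sum_rows_into(a, [a, [3, 4]]))            # [3, 4]: a's own values were lost
```

An alternative to raising would be to copy the aliased input before resetting the output, at the cost of an allocation.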
Still, the lesson is that when using generic functions one should look at what they expect of their input, and if this isn't documented one should at least test what they are giving thoroughly and not assume it just works. I've always worked this way, and never run into surprises like the types of issues reported in the blog post.
Currently there is no documentation on what properties an input to `sum!` must support in the doc string, so one needs to test its correctness when using it outside of base Julia data types (I haven't checked the broader docs for an interface specification, but if there is one it really should be linked in the docstring).
It's my view that all the major points in the blog post are true, and the problem persists. It's slightly better now, because Julia has more usage in industry and less usage by hobby hackers.
I'm convinced it's caused by two factors: The first is the duck-typed, dynamic nature of the language, which, like Python, gives the developer no tools to check or enforce correctness.
More fundamentally, the culture of Julia is a cowboy hacking culture where we just start writing, and then we can always kick the code around once bugs appear. There seems to be an almost complete disinterest in careful documentation of behaviour and edge cases, or even actual descriptions of what some abstraction is supposed to do. The natural result is that people read all kinds of meaning into any abstraction and use it in slightly different ways. It's madness.
As an example, consider [the definition of Base.seek](https://docs.julialang.org/en/v1/base/io-network/#Base.seek). There is no description of what the position is, what type it can be, or what operations it's supposed to support. Nor that the seek position is typically zero-indexed. There is no description of what should happen for out-of-bounds seeks, or how it differs for files open in reading and writing mode. Nor any description of the errors it can throw.
I must emphasize that this kind of documentation is the norm, not the exception.
This kind of indifference towards actually specifying behaviour is not a foundation you can build a language on. And it's very hard to change in retrospect, because by now, seek means a bunch of different things in Base Julia and the ecosystem, and it would be breaking to change.
I've several times seen a core dev change some behaviour of some code because they clearly thought the behaviour was always meant to be X, even though it actually did Y, arguing that Y was an implementation detail. No shit - everything is an implementation detail when nothing is documented.
I think Julia needs to grow up and begin taking its documentation and interfaces seriously.
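To show what kind of answers a `seek` specification has to give, here is how one unrelated implementation, Python's in-memory `io.BytesIO`, behaves. It says nothing about what Julia's `seek` does or should do. It only illustrates that each question (indexing base, out-of-bounds positions, errors) has a concrete answer that could be written down:

```python
import io

buf = io.BytesIO(b"abcdef")

# Positions are byte offsets counted from 0.
buf.seek(2)
third = buf.read(1)

# Seeking past the end is allowed here; a later read just returns nothing.
past_end = buf.seek(100)
tail = buf.read()

# A negative absolute position is rejected with ValueError.
try:
    buf.seek(-1)
    negative_ok = True
except ValueError:
    negative_ok = False

print(third, past_end, tail, negative_ok)  # b'c' 100 b'' False
```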
I may refresh the post with more recent information at some point. In the meantime, those curious can find a short story of one newer correctness bug here: https://discourse.julialang.org/t/why-is-it-reliable-to-use-...
The person who eventually fixed the issue, mkitti, had to push through a lot of "institutional" friction to do so, and the eventual fix is the result of his determined efforts.
While his part of the story mostly played out in venues outside of the Discourse forum, some of it is on display in this thread: https://discourse.julialang.org/t/csv-jl-findmax-and-argmax-...
* it happened in the first place
* it took so long to get noticed
* it took so long to merge the fix after being reported
I do feel like I should push back on the term "institutional friction." it was more of a bus factor problem; there were not enough (aka zero) maintainer eyeballs on the proposed fix. but there wasn't exactly anybody saying not to fix it, which is what I think of as friction.
Say you use some package for numerical integration. One day you cook up your own floating point type, and use the same package with success. Then you change your floating point type subtly, and suddenly weird things start to happen. Is it a correctness bug? Whose bug?
Surely, the author of the integration package didn't have your weird floating type in mind, but it still worked. Until you made it even weirder. These are the things some people think are correctness bugs in julia. It's mostly poor coding.
If there was one thing I could change about Julia it most certainly wouldn't be correctness issues in my own experience. Filling in the ecosystem in terms of boring glue type stuff like a production grade gRPC client would be amazing. This was the type of problem that almost got me to give up on the language.
The fact that bugs happen in software should not surprise anyone. Even software of critical importance, such as GCC or LLVM, whose correctness is relied upon by the implementations of many programming languages (including C, C++ and Julia itself), is buggy.
Instead the post could have focused more on actual design issues, such as some of the Base interfaces being underspecified:
> the nature of many common implicit interfaces has not been made precise (for example, there is no agreement in the Julia community on what a number is)
The underspecified nature of Number (or Real, or IO) is an issue, albeit one unrelated to the rest of the blog post. It does not excuse the scaremongering in the blog post, however.
When creating a new type, it should be more clear-cut when subtyping Number (or Real, etc.) is valid. Should unitful quantities be numbers? Should intervals be numbers? Related: I think there are some attempts by Tim Holy and others to create/document "thick numbers".
Furthermore, I believe it might be good to align the Number type hierarchy with math/abstract algebra as much as possible without breaking backwards compatibility, which might mean making Number, or some subtypes of it, actual interfaces.
> Subtyping Number is a way to opt into numeric promotion and a few other useful generic behaviors. That’s it.
OK, but I think that's not documented either.
I'm really surprised by the list of issues, as some of those are pretty recent (2024) and affect pretty important parts of the ecosystem, like ordereddict.
The language is elegant, intuitive and achieves what it promises 99% of the time, but that’s not enough compared to other programming languages.