Default Musl Allocator Considered Harmful to Performance
Posted 4 months ago · Active 4 months ago
Source: nickb.dev
Key topics: Musl Allocator, Performance Optimization, Linux Distributions
The default musl allocator is criticized for its performance issues, particularly in multi-threaded applications, sparking a discussion about the trade-offs between size optimization and performance in Linux distributions.
Snapshot generated from the HN discussion
Key moments
- Story posted: Sep 5, 2025 at 4:42 PM EDT
- First comment: Sep 8, 2025 at 12:16 AM EDT (2 days after posting)
- Peak activity: 81 comments on Day 3
- Latest activity: Sep 15, 2025 at 9:58 AM EDT
ID: 45143347 · Type: story · Last synced: 11/20/2025, 5:39:21 PM
"Harmful" should be reserved for things that affect security or privacy, e.g. features that accidentally encourage bugs, like goto does.
C devs are among the few I've met who seem to actually care.
This is an example of not caring about the software per se, but only about the outcome.
> [C is] in fact famously bug-friendly
Yes, but as a user I like that. I have a game that, from the user experience, seems to have tons of use-after-free bugs. You see it as a user: strings shown in the UI suddenly turn to garbage and then change very fast. Even with such fatal bugs, the program continues to work, which I like as a user, since I just want to play the game; I don't care if the program is correct. When I want to get rid of the garbage text, I simply close the in-game window, reopen it, and everything is fine.
On the other hand, there are games written in Pascal or Java, which might not have as many bugs, but every single null pointer exception is fatal. This led to me not playing those games anymore, because doing well and then having the program crash is so frustrating. I'd rather have it run a bit longer with silent corruption.
C won't help with any of that. Unless the cost of development using it will scare away management which requests those dumb features. Fair enough then :)
Your example is not one of 'dumb' design; it is a deliberate 'dark pattern': pushing you to use OneDrive as much as possible so as to earn more money.
It makes sense to use a tech stack that lowers the cost on the developer side, in the same way that it makes sense to make junk food. Why produce good, tasty food when there is more money to be made by just selling cheap stuff? It does the most important thing: give people calories without poisoning them (short term).
Get rid of those dumb decisions and it could have been pure JS and be 100% fine. C has no value here. The slow performance of JS is not harmful here. Discord is fast enough although it's Electron. VS Code is also fast enough.
But I'd also like to respond to the food analogy, since it's funny.
Let's say that going full untyped scripting language would be the fast food. You get things fast, it does the job, but is unhealthy. You can write only so much bash before throwing up.
Developing in C is like cooking for those equally dumb, expensive, unsustainable restaurants that give you "an experience" instead of a full healthy meal. Sure, the result uses the best ingredients and is incredibly tasty, but there's way too little food for too much cost. It's bad for the economy (the money should've been spent elsewhere), bad for the customer (same thing about money, plus he's going to be hungry!) and bad for the cook (in a different job he'd contribute to society in better ways!) :D
Just go for something in the middle. Eat some C# or something.
Essentially you’re telling me that the software being made is not useful to many people, because a handful of developers will spend more time writing the software than their userbase will spend running it.
Otherwise you’re inflicting something on humanity.
Dumping toxic waste in a river is much cheaper than properly disposing of it too; yet we understand that we are causing harm to the environment and litigate people who do that.
Slow software is fine in low volumes (think: shitting in the woods), but dumping it on huge numbers of users by default is honestly ridiculous (Teams, I’m looking at you, with your expectation to run always and on everyone's machine!)
ROTFL. Is there any security audit? /s
it does the job - mostly.
Linux lucked out: when you're doing tricky wait-free concurrent algorithms, that intrusive linked list you hand-designed was a good choice. But over in userland you'll find another hand-rolled list in somebody's single-threaded file parser, and oh, a growable array would be fifty times faster; shame the C programmer doesn't have one in their toolbox.
https://en.wikipedia.org/wiki/Considered_harmful
C's goto is a housecat to the unrestricted jump's tiger. No doubt an angry housecat is a nuisance, but the tiger is much more dangerous.
C goto won't let you jump straight into the middle of unrelated code, for example, but the jump instruction has no such limit and neither did the feature Dijkstra was discussing.
In 2025 an allocator not cratering multi-threaded programs is the opposite of specialisation.
Too high an access frequency to a shared resource is not a "general case" but simply poorly designed multithreaded code. (Besides, a high allocation frequency through the system allocator is also poor design for single-threaded code; application code simply should not assume any specific performance behaviour from the system allocator.)
> application code simply should not assume any specific performance behaviour from the system allocator
Technically, yes. Practically, no; that's why e.g. the C++ standard mandates the time complexity of its containers. If you can't assume any specific performance from your system, you have to be prepared for every system-provided function to be arbitrarily slow, and obviously you can't do that.
Take, for instance, the JSON parser in GTA V [0]: apparently, sscanf(buffer, "%d", &n) calls strlen(buffer) internally, so using it to parse numbers in a hot loop on 2 MiB-long JSON craters your performance. On one hand, sure, one can argue that glibc/musl developers are within their right to implement sscanf however inefficiently they want, and the application developers should not expect any performance targets from it, and therefore, probably should not use it. On the other hand, what is even the point of the standard library if you're not supposed to use it for anything practical? Or, for that matter, why waste your time writing an implementation that no-one should use for anything practical anyhow, due to its abysmal performance?
[0] https://news.ycombinator.com/item?id=26296339
My question is: why is Rust performance contingent on a C malloc?
Because Rust switched to “system” allocators way back: for compatibility with, well, the system, for introspection and perf tooling, to lower the size of basic programs, and to lower maintenance.
It used to use jemalloc, but that took a lot of space even in the most basic binary, and because jemalloc is not available everywhere, it still had to deal with system allocators anyway.
...it only matters if the threads allocate/free so frequently that they run into contention; the C stdlib allocator is a shared resource, and user code really shouldn't assume that the allocator fixes their poor design decisions for multithreaded code.
If other allocators are able to handle a situation perfectly well, even a general-purpose allocator like the one in glibc, that suggests that musl's is deficient.
A smaller code base also means a smaller attack surface and fewer potential bugs.
The question remains: why does the Rust ecosystem depend so much on a system component they ultimately have no control over?
The new one was drafted here: https://github.com/richfelker/mallocng-draft
Blames it all on app code like Wayland
> “the new ng allocator in MUSL doesn’t make a dime of a difference”
Optimizing for size & stdlib code simplicity is probably not the best fit for your application server! Container size has always struck me as such a Goodhart's Law issue (and worse, it was already a bad measure, since it captures only a very brief part of the software lifecycle). Goodhart's Law:
> When a measure becomes a target, it ceases to be a good measure
This particular musl/Alpine footgun can be worked around. It's not particularly hard to install and use another allocator on Alpine, or anywhere really. Ruby folks in particular seem to have a lot of lore around jemalloc, with various version preferences and MALLOC_CONF settings on top of that. But in general I continue to feel like Alpine base images bring in quite an X factor, even if you knowingly adjust the allocator: the prevalence of Alpine in container images feels unfortunate & eccentric.
Going distroless is always an option, though usually a little too radical for my tastes. I think of musl+busybox+apk as the distinguishing aspects of Alpine, so on that basis I'm excited to see the recent huge strides by uutils, the Rust rewrite of GNU coreutils focused on compatibility, while offering BusyBox-like all-in-one-binary convenience. It should make a nice compact coreutils for containers! The recent 0.2 release has competitive performance, which is awesome to see. https://www.phoronix.com/news/Rust-Coreutils-0.2
Once the container OS forks and runs your binary, I'm curious why it matters. Is it because people run interpreted code (like Python or Node) with runtimes that link musl libc? If you deploy JVM or Go apps this will probably not be a factor.
Go is a rare counterexample: it ignores the system allocator and bundles its own.
It's not so long ago that GNU libc had a very similar allocator too, and that's why you'd pop Hoard into your LD_PRELOAD or whatever.
Not every program is multi-threaded, and so not every program would experience thread contention.
The third exception is programs that should be multithreaded but aren't because they are written in languages where adding more threads is disproportionately hard (C, C++) or impossible (Python, Ruby, etc.).
The difficulty totally lies in the design... actually using parallelism where it matters. Tons of multi-threaded programs are just single-threaded with a lot of 'scheduler' spliced into that one thread -_-
Unless I'm writing Java, I avoid multithreading whenever possible. I hear it's also nice in Go.
Rust is very much best in class here.
In terms of effort or expense, making any C or C++ program multithreaded is at least an order of magnitude harder/more expensive, even when designed for it from the beginning, so lots of programs aren't multithreaded that could be.
If you care about efficiency of a multi-threaded app you should use jemalloc (sadly no longer maintained but still works well), mi-malloc or tcmalloc.
In contrast, mimalloc, a similarly minimalistic allocator, has per-thread heaps, with each thread owning the memory it allocates; cross-thread frees are handled in a deferred manner.
This works very well with Rust's ownership system, where objects rarely move between threads.
Internally, both allocators use size-class based allocation, into predefined chunks, with the key difference being that musl uses bitmaps and mimalloc uses free lists to keep track of memory.
Musl could be fixed if it switched from a single shared heap to per-thread heaps as well.
mimalloc is about 10 kloc, while (assuming I'm looking in the right place) the new musl allocator is 891 lines and the old one 518. I wouldn't call an order-of-magnitude difference in line count 'similar'.
But I think you can tweak musl to perform well, and musl is closer to the spec than glibc, so I would rather use it, even if it's slower in the default case for multithreaded programs.
You cannot: its allocator does thread safety via a big lock, and that’s that.
> musl is closer to the spec than glibc
Is it?
> even if it's slower in the default case for multithreaded programs.
That’s far from the only situation where it’s slower though.
Swapping the system allocator out for jemalloc will net you huge performance wins if you link against musl, but you’ll still have issues with multithreading performance due to the slower implementations of necessary helpers.
Performance in edge-cases by far isn't the only metric that matters for allocators.
This has been my bane at various open source projects, because at some point somebody will say that all currently supported Linux distributions should be supported by a project. This works as a rule of thumb, except for RHEL, which has some truly ancient GCC versions provided in the "extended support" OS versions.
* The oldest supported version in "production" is RHEL 8; in "extended support" it is RHEL 7.
* RHEL 8 (released 2019) provides gcc 8 (released May 2018). RHEL 7 (released 2014) provides gcc 4.8 (released March 2013).
* gcc 8 supports C++17 but not C++20. gcc 4.8 supports most of C++11 (some C++ stdlib implementations weren't added until later) but doesn't support C++14.
So the well-meaning cutoff of "support the compiler provided by supported major OS versions" becomes a royal pain, since it would mean avoiding useful functionality in C++17 until mid-2024 (when RHEL 7 went from "production" to "extended support") or mid-2028 (when RHEL 7 "extended support" will end). It's not as bad at the moment, since C++20 and C++23 were relatively minor changes, but C++26 is shaping up to be a pretty useful change, and that wouldn't be usable until around 2035 when RHEL 10 leaves "production".
I wouldn't mind it as much if RHEL named the support something sensible. By the end of a "production" window, the OS is still absolutely suitable as a deployment platform for existing software. Unlike other "production" OS versions, though, it is no longer reasonable as a target for new development at that point.
Ask for payment for extended support as well.
> The mallocng allocator was designed to favor very low memory overhead, low worst-case fragmentation cost, and strong hardening over performance. This is because it's much easier and safer to opt in to using a performance-oriented allocator for the few applications that are doing ridiculous things with malloc to make it a performance bottleneck than to opt out of trading safety for performance in every basic system utility that doesn't hammer malloc.
[1] https://www.openwall.com/lists/musl/2025/09/05/3
[1] https://github.com/VictoriaMetrics/VictoriaLogs/issues/517
EDIT: Ah, they were mentioned, of course.
On some platforms, Telescope (a gopher/gemini client) used to be a bit crashy until I switched it to jemalloc via LD_PRELOAD.
Also, performance when rendering pages with tons of links improved a lot.