Linux 6.18 Will Fix Lockups When Systemd Units Read Lots of Files
Posted 3 months ago · Active 3 months ago
Source: phoronix.com
Key topics: Linux, Systemd, Performance Optimization
Linux 6.18 will fix a performance issue causing lockups when systemd units read many files, sparking discussion on the root cause and potential workarounds.
Snapshot generated from the HN discussion
Discussion activity: 25 comments; first comment 23m after posting; peak of 15 comments in the 0-2h window (average 5 per period).
Key moments
- Story posted: Sep 27, 2025 at 4:26 PM EDT (3 months ago)
- First comment: Sep 27, 2025 at 4:49 PM EDT (23m after posting)
- Peak activity: 15 comments in the 0-2h window
- Latest activity: Sep 28, 2025 at 10:22 AM EDT
ID: 45399063 · Type: story · Last synced: 11/20/2025, 1:20:52 PM
It's hard for me to imagine using it for anything myself, considering the number of times I do something like run a search (or a backup command) across literally every file I care about.
It's completely reasonable to turn it on. And also, when you're writing applications for Linux, consider using the `O_NOATIME` flag in your file opens.
Maybe if you could taint a process (or perhaps the inode and/or path instead) so that its opens, and its children's opens, get `O_NOATIME` behavior by default, then systemd or whatever could set it for legacy processes (or files/paths) that need it.
Then distros or SREs could put up with it without nagging all the SWEs about Linuxisms, some of whom may not know or care that their code runs on Linux.
As a former sysadmin through the dotcom booms, we regularly depended on atime for identifying which files are actively being used in myriad situations.
Sometimes you're just confirming a config file was actually reloaded in response to your HUP signal. Other times you're trying to find out which data files a customer's cgi-bin mess is making use of.
It's probably less relevant today where multi-user unix hosts are less common, but it was quite valuable information to maintain back then.
You can do that with BPF tooling now; for example, the `opensnoop` BCC program can capture all file opens on demand. You can also write tools which capture all POSIX I/O to specific files/directories. I can see atime occasionally being useful in some super-niche cases, such as heisenbugs you cannot reproduce reliably, but I would be reaching for BPF tools first.
https://www.redhat.com/en/blog/configure-linux-auditing-audi...
https://linux-audit.com/linux-audit-framework/configuring-an...
Most modern applications are not designed to operate on shared files like this so in general 'noatime' is safe for 99.9% of software.
Systemd in a way does. One of the systemd-tmpfiles entry options is to clean up unused files after some time (it ships defaults of 10 days for /tmp/ and 30 days for /var/tmp/), and for this it checks atime, mtime, and ctime to determine whether it should delete the file (I think you can also take a flock on the file to prevent it from being deleted).
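The defaults mentioned come from tmpfiles.d entries; the shipped configuration looks roughly like this (tmpfiles.d(5) syntax, where the final "age" field drives the timestamp-based cleanup):

```
# /usr/lib/tmpfiles.d/tmp.conf (approximate shipped defaults)
q /tmp      1777 root root 10d
q /var/tmp  1777 root root 30d
```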
This indicates to me a very poor design. If anything, it validates the old UNIX sayings "do one thing and do it well" and "keep programs small" (paraphrasing).
This isn't a systemd problem, systemd just makes use of cgroups. The kernel has a degenerate case handling lazy atime updates combined with cgroups.
I’d say it’s both a systemd issue and a kernel issue. The fact that systemd motivates kernel fixes does point to systemd being maybe just a bit overengineered
systemd is basically a victim here, you're quasi engaging in a tech form of victim blaming.
don't blame systemd for making use of kernel features (cgroups)
and without cgroups linux has no sandboxing capabilities, and would be largely irrelevant to today's workloads
Look if I wrote a thing that caused kernel lockups then I’d blame myself even if the kernel dudes fixed the issue
It’s a kernel bug.
https://youtu.be/Au15lSiAkeQ?si=sxxP2ia9vUkWY5qy&t=982
From the YouTube transcript:
"I don't know what systemd is doing to take so long, 'cause this is the rub: systemd essentially takes 100% CPU twice over. So on our two-core machine that we run these things on, I can run top. When I actually got it, I said to you the machine was unresponsive, right? Because all in kernel land, locks are being taken out left, right, and center. You know, we're trying to mount these things in parallel at sensible levels, because we want to try and mount…"