I Wish SSDs Gave You CPU Performance-Style Metrics About Their Activity
Posted 3 months ago · Active 2 months ago
utcc.utoronto.ca · Tech · story
calm / mixed
Debate: 60/100
Key topics
SSDs
NVMe
Performance Metrics
Storage
The author wishes SSDs provided CPU-style performance metrics, sparking a discussion on the feasibility and potential solutions, including NVMe's existing log capabilities and standardized APIs.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
First comment: 30m after posting
Peak period: 30 comments in 0-3h
Avg per period: 5.4
Comment distribution: 54 data points
Based on 54 loaded comments
Key moments
- 01 Story posted: Oct 19, 2025 at 1:13 PM EDT (3 months ago)
- 02 First comment: Oct 19, 2025 at 1:42 PM EDT (30m after posting)
- 03 Peak activity: 30 comments in 0-3h, the hottest window of the conversation
- 04 Latest activity: Oct 21, 2025 at 4:54 AM EDT (2 months ago)
ID: 45635870 · Type: story · Last synced: 11/20/2025, 2:49:46 PM
I feel like maybe some of this info is already available; we just don't commonly look at it. Knowing how deep the queue is and how many commands are outstanding at any given moment is probably a decent start. I haven't spent time digging into blk-mq to see what info is available about the hardware dispatch queues (how the kernel represents the many hardware queues the device offers). https://www.kernel.org/doc/html/v5.16/block/blk-mq.html
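For the outstanding-commands part, the kernel already exposes per-device in-flight counts via sysfs. A minimal sketch (not from the thread; the device name nvme0n1 is an assumption):

```python
# Poll /sys/block/<dev>/inflight, which holds two counters:
# commands currently in flight for reads and for writes.
import time

DEV = "nvme0n1"  # assumption: pick a device from /sys/block/

while True:
    with open(f"/sys/block/{DEV}/inflight") as f:
        reads, writes = (int(x) for x in f.read().split())
    print(f"in-flight: {reads} reads, {writes} writes")
    time.sleep(1)
```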
Every command that you issue to the SSD returns a response. It would be nice to have a bunch of performance counters that tell us where the time went for each command we give it.
GPUs have this already.
For NVMe in particular, you will have a hard time filling the queues. Your perceived performance is mostly latency, as there is hardly an application that can submit enough concurrent requests.
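The latency-bound claim is easy to check yourself. A back-of-the-envelope sketch (assuming /dev/nvme0n1, root privileges, and Python 3.7+) that measures queue-depth-1 random-read latency with O_DIRECT:

```python
# QD1 random reads: one request outstanding at a time, so the numbers
# reflect device latency rather than throughput.
import mmap, os, random, time

DEV = "/dev/nvme0n1"  # assumption: adjust to your device
BLOCK = 4096
N = 1000

fd = os.open(DEV, os.O_RDONLY | os.O_DIRECT)
size = os.lseek(fd, 0, os.SEEK_END)
buf = mmap.mmap(-1, BLOCK)  # anonymous mmap is page-aligned, as O_DIRECT requires

lat = []
for _ in range(N):
    off = random.randrange(size // BLOCK) * BLOCK  # block-aligned offset
    t0 = time.perf_counter_ns()
    os.preadv(fd, [buf], off)
    lat.append(time.perf_counter_ns() - t0)
os.close(fd)

lat.sort()
print(f"p50 {lat[N // 2] / 1e3:.1f} us, p99 {lat[int(N * 0.99)] / 1e3:.1f} us")
```

If the p50 here roughly matches what your application sees per I/O, the workload is latency-bound rather than queue-bound.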
You get many of the same problems these days, but they're a bit harder to diagnose. You have to go looking at system monitors to see what's going on. Whereas, if the computer just communicated to you what it was doing, in an ambient way, this stuff would be immediately obvious.
I've heard stories like this where people worked on older computers that were loud, and then you could actually hear what it was doing. If it got stuck in an infinite loop, you'd literally hear it.
That seems like very much a feature to me.
With training runs it makes a little beat and you can tell when it checkpoints because there’s a little skip. Or a GPU drops off the bus…
I'd hope you hear their fans too...
First world problems.
When doing some AI stuff on my garage PC (4060 Ti; nothing crazy) the overhead lights in the garage slightly but noticeably dim. This doesn't occur when gaming.
It's most easily noticeable with one of Nvidia's demo apps -- "AI Paintbrush" or something like that, I forget. It's a GUI app where you can "paint" with the mouse cursor. When you depress the mouse button, the GPU engages... and the garage lights dim. Release the mouse button, and the lights return to normal.
The drives were numerous (hard, floppy, tape, optical), and the noises were too loud to avoid using diagnostically. Printers clacked and whooshed (and sometimes moved furniture). Scanners sang songs. Monitors produced clicks and pops and buzzes and sizzles, and the flyback transformer would continuously whine at different frequencies depending on mode. Modems made dialing and shrieking noises. Sound cards were anything but silent; a person could hear noises that varied based on the work the system was doing. And for a long while, CPUs and/or front side bus speeds put a lot of noise right in the middle of the FM dial.
Computing is pretty quiet these days.
At least in my world, the sound of computing had changed quite a bit over the span of decades from the 90s to the 2010s.
The only incidentally-noisy computing things I had left at the end of the teens were the hard drives of ever-increasing size that got used for storing Linux ISOs.
They are still noisy when doing real work on them, especially laptops.
I sometimes think about what a modern analogy would be for some of the operations work I do: translate a graph of status codes into a steady hum at 440 Hz for the 200s, then cacophonous jolts as the 500s start to arrive? As you mentioned, there's no perfect analogy as you get farther and farther from moving parts.
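That sonification idea is easy to prototype offline. A toy sketch (hypothetical status-code stream, stdlib only) that renders the 200s as a quiet 440 Hz hum and the 500s as louder, lower jolts in a WAV file:

```python
# Render a stream of HTTP status codes as audio: quiet 440 Hz for success,
# loud 110 Hz bursts for server errors.
import math, struct, wave

RATE = 8000  # samples per second

def tone(freq, dur, vol):
    n = int(RATE * dur)
    return b"".join(
        struct.pack("<h", int(vol * 32767 * math.sin(2 * math.pi * freq * i / RATE)))
        for i in range(n)
    )

codes = [200] * 40 + [500, 200, 500, 500] + [200] * 10  # made-up traffic

with wave.open("statuses.wav", "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(RATE)
    for c in codes:
        ok = c < 500
        w.writeframes(tone(440 if ok else 110, 0.05, 0.2 if ok else 0.9))
```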
They have extremely distinct sounds coming from the GPUs. You can hear the difference between GPT-OSS-20b and Qwen3-30b pretty easily just based on the sounds the GPU is making.
The sound is being produced by the VRMs and power supply to the GPU being switched on and off hundreds of times per second. Each token being produced consumes power, and each attention and MLP layer consumes a different amount of power. No other GPU stress test consumes power in the same way, so you rarely hear that sound otherwise.
(I've also gotten great use out of a $5 AM/FM radio.)
One could use that while half asleep in the bedroom, with a radio tuned to the right frequency and almost muted, and then know whether Portage on Gentoo, or build.sh/pkgsrc on NetBSD, was finished or had been interrupted.
Because no buzzing or humming anymore :-)
https://www.paulgraham.com/popular.html
Luckily, storage has also gotten incredibly cheap, so instead of diagnosing, it's easier to just keep a full backup of your data and swap to it in case something goes wrong.
Graphs and logs provide a proxy to that data at best, and attaching a debugger, tracer, or perf tool is not an option all the time.
Sounds and LEDs provided an overhead-free real time communication channel to the operation of the system.
Possibly. My first 386-DX40 had activity lights; I tried out a CompuServe disk, saw my HD activity going nuts, so I killed the power and trashed the CD.
There are programs that can show a virtual LED for HD and network activity, so all is not lost.
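In the same spirit, a terminal "virtual LED" takes only a few lines. A toy sketch (the device name is an assumption) that polls /proc/diskstats and blinks whenever I/Os complete:

```python
# Flash a character whenever the disk completes I/Os between polls.
import sys, time

DEV = "nvme0n1"  # assumption: pick a name from /proc/diskstats

def ios_completed(dev):
    with open("/proc/diskstats") as f:
        for line in f:
            parts = line.split()
            if parts[2] == dev:
                # field 4: reads completed, field 8: writes completed
                return int(parts[3]) + int(parts[7])
    raise ValueError(f"{dev} not found")

prev = ios_completed(DEV)
while True:
    time.sleep(0.1)
    cur = ios_completed(DEV)
    sys.stdout.write("\r" + ("*" if cur > prev else "."))
    sys.stdout.flush()
    prev = cur
```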
No. Just removal of parts to increase profits.
But then saying "it is too much to ask" is just another way to limit what users can do with the specific resources they paid for.
There's a lot of non-trivial stuff that goes on inside of a modern SSD. And to be sure, none of it is magic; all of it could certainly be implemented in software.
But is that kind of drastic move strictly necessary in order to get meaningful statistics?
(You've heard about apple and orange comparisons, right? Right.)
I'm going to keep referring to the QuickSync video encoding block in my CPU as "hardware," though, because the tiny lump of transistors that is dedicated to performing this specialized task is something that I can kick.
Relatedly, the business of managing raw NAND storage on Apple devices and abstracting it to operating system software as NVMe: That translation happens in hardware. That hardware is also something that I can kick, so I'm going to keep calling it "hardware".
The `nvme id-ctrl -H` (human-readable) option does parse and explain some configuration settings and hardware capabilities in a more standardized, human-readable fashion, but the availability of internal activity counters and events varies greatly across vendors, products, and firmware versions (and even across your currently installed nvme and smartctl software package versions).
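The counters that are reliably present live in the SMART/Health log. A small sketch (assuming nvme-cli is installed and the device is /dev/nvme0; JSON key names can differ across nvme-cli versions, hence the .get() fallbacks):

```python
# Pull the NVMe SMART/Health log as JSON and print the closest things it
# has to activity counters.
import json, subprocess

out = subprocess.run(
    ["nvme", "smart-log", "/dev/nvme0", "--output-format=json"],
    capture_output=True, text=True, check=True,
).stdout
log = json.loads(out)

# Per the NVMe spec, data units are in 1000 x 512-byte units and
# controller busy time is in minutes.
for key in ("data_units_read", "data_units_written",
            "host_read_commands", "host_write_commands",
            "controller_busy_time"):
    print(f"{key}: {log.get(key, 'n/a')}")
```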
Regarding eBPF (for an OS-level view), the `biolatency` tool supports the -F option to additionally break down I/Os by the IORQ flags. I have added the iorq_flags to my eBPF `xcapture` tool as well, so I can break down I/Os (and latencies) by submitter PID, user, program, etc., and see IORQ flags like "WRITE|SYNC|FUA" that help to understand why some write operations are slower than others (especially on commodity SSDs without a power-loss-protected write cache).
An example output of viewing IORQ flags in general is below:
https://tanelpoder.com/posts/xcapture-xtop-beta/#disk-io-wai...
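For readers who want the shape of such a tool rather than the full thing, here is a stripped-down bcc sketch in the same spirit as `biolatency` (not xcapture; assumes the bcc Python bindings and root privileges) that histograms block I/O latency from the block tracepoints:

```python
# Match block_rq_issue to block_rq_complete by (dev, sector) and build a
# log2 histogram of request latency in microseconds.
from time import sleep
from bcc import BPF

bpf_text = r"""
struct key_t {
    u32 dev;
    u64 sector;
};
BPF_HASH(start, struct key_t, u64);
BPF_HISTOGRAM(dist);

TRACEPOINT_PROBE(block, block_rq_issue) {
    struct key_t k = {.dev = args->dev, .sector = args->sector};
    u64 ts = bpf_ktime_get_ns();
    start.update(&k, &ts);
    return 0;
}

TRACEPOINT_PROBE(block, block_rq_complete) {
    struct key_t k = {.dev = args->dev, .sector = args->sector};
    u64 *tsp = start.lookup(&k);
    if (tsp) {
        dist.increment(bpf_log2l((bpf_ktime_get_ns() - *tsp) / 1000));
        start.delete(&k);
    }
    return 0;
}
"""

b = BPF(text=bpf_text)
print("Tracing block I/O latency... hit Ctrl-C to print the histogram.")
try:
    sleep(3600)
except KeyboardInterrupt:
    pass
b["dist"].print_log2_hist("usecs")
```

A per-flag breakdown, as `biolatency -F` does, would additionally fold the tracepoint's rwbs field into the histogram key.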
If you want detailed Ryzen stats, you have to use ryzen_monitor. If you want detailed Seagate HDD stats, you have to use OpenSeaChest. If you want detailed NIC queue stats, there's ethq. I'm sure there are other examples as well.
Most hardware metrics are still really difficult to collect, understand and monitor.
I have had this wish since the days of spinning disks.