UDP Isn't Unreliable, It's a Convertible (2024)
Posted 2 months ago · Active about 2 months ago
proxylity.com · Tech · story
Sentiment: calm, positive
Debate intensity: 60/100
Key topics
UDP
Networking
Protocol Design
The article argues that UDP is not inherently unreliable, but rather a flexible protocol that can be used effectively with proper implementation, sparking a discussion on the trade-offs and use cases of UDP versus TCP.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
First comment: 6d after posting
Peak period: 58 comments (Day 6)
Avg / period: 12.2 comments
Comment distribution: 73 data points (based on 73 loaded comments)
Key moments
- 01 Story posted: Oct 24, 2025 at 7:35 PM EDT (2 months ago)
- 02 First comment: Oct 30, 2025 at 2:01 PM EDT (6d after posting)
- 03 Peak activity: 58 comments on Day 6 (hottest window of the conversation)
- 04 Latest activity: Nov 5, 2025 at 1:44 AM EST (about 2 months ago)
ID: 45700165 · Type: story · Last synced: 11/20/2025, 2:09:11 PM
UDP over the network does not, on its own, guarantee retransmission of lost packets or in-order delivery. How reliable is that?
UDP on Linux to localhost, or hair-pinned between two public IPs on the same host, can result in reordered packets when the kernel is busy with a lot of traffic or the CPU is context switching heavily. It happens when a UDP queue is handled by one core and gets swapped to another. I've had to set up two machines with a cable between them because the ordering is actually more stable than over localhost. A colleague is working on a kernel patch that bypasses part of the UDP stack to restore proper ordering; it will probably need to be maintained in-house, or at least hidden behind a kernel config knob upstream, since this reordering is considered acceptable under UDP's guarantees. How reliable is that?
So, yeah, you can say UDP is reliable but compared to TCP or QUIC it actually really isn't.
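A minimal way to poke at the reordering claim above (a sketch, not a benchmark; whether you actually observe reordering or loss depends heavily on kernel version, core count, and load, and the host/port/count values are arbitrary):

```python
import socket, struct, threading

HOST, PORT, COUNT = "127.0.0.1", 9999, 100_000

recv_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
recv_sock.bind((HOST, PORT))   # bind before the sender starts
recv_sock.settimeout(2.0)      # stop once the sender goes quiet

def receiver():
    last, reordered, received = -1, 0, 0
    try:
        while True:
            data, _ = recv_sock.recvfrom(8)
            (seq,) = struct.unpack("!Q", data)
            received += 1
            if seq < last:
                reordered += 1  # arrived after a higher sequence number
            last = max(last, seq)
    except socket.timeout:
        pass
    print(f"received={received}/{COUNT} reordered={reordered}")

t = threading.Thread(target=receiver)
t.start()

send_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
for seq in range(COUNT):
    send_sock.sendto(struct.pack("!Q", seq), (HOST, PORT))
t.join()
```

On an idle machine this will usually report zero reordering (and some loss from full socket buffers); the point above is that nothing in UDP's contract promises that.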
Again, the point is not to call UDP reliable; it's that the label doesn't make sense, for the reasons I stated above. If you choose the wrong tool for a use case, the communication channel will be unreliable either way. There's a lot of confusion about this -- it is valid to call a communication channel unreliable so that the system can respond accordingly, but not the protocol itself.
I don't call it reliable, nor do I think it is unreliable. Protocols exist in an abstract space, but they are instantiated in systems that work within physical /constraints/ (noise, faults, oxidation, etc.). These constraints vary because different systems operate in different environments.
The systems we make also have to provide utility for the /purposes/ for which they're made. Reliability is necessary to trust the utility of a system.
If I'm playing a multiplayer FPS and, due to retransmission backoff, I see players snap back to the position carried by a lost packet once it finally arrives, I will not think the game is reliable (and I will be especially sad, since this is a known issue with known improvements).
If I don't care about old packets and only care about the latest one, I never work with stale information -- but the information is incomplete. Latency will still be noticeable on a noisy connection, but my brain can at least anticipate -- guess where to look next -- until the next packet arrives.
That incomplete information can be used to extrapolate movements. So the final step is to use a model to predict movement in the (short) moments when a packet is not received, as sketched below. You can't do this when the information isn't relevant anymore. When that's done, I -- the player -- will feel like the game is more reliable than in the TCP case above.
(Games, and real-time interfaces in general, have some interesting aspects that I think are well covered in Game Feel by Steve Swink.)
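A minimal sketch of that "latest packet wins, then extrapolate" pattern (the snapshot fields, single axis, and 250 ms extrapolation cap are illustrative assumptions, not taken from any particular engine):

```python
from dataclasses import dataclass

@dataclass
class Snapshot:
    seq: int     # sender's sequence number
    t: float     # sender timestamp, seconds
    x: float     # position on one axis, for simplicity
    vx: float    # velocity on that axis

class RemotePlayer:
    def __init__(self):
        self.latest = None

    def on_packet(self, snap):
        # Latest packet wins: drop anything older than what we have.
        if self.latest is None or snap.seq > self.latest.seq:
            self.latest = snap

    def render_position(self, now, max_extrapolation=0.25):
        # Dead reckoning: predict forward from the newest snapshot,
        # but cap how far we trust the linear model.
        if self.latest is None:
            return None
        dt = min(now - self.latest.t, max_extrapolation)
        return self.latest.x + self.latest.vx * dt
```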
The only real guarantee that TCP provides is in-order delivery. That is, it really guarantees that if you sent two requests A and B to a server, the server will never see request B unless it saw request A first. Again looking at HTTP, this is why request pipelining theoretically works without including any request ID in the response: the client knows that if it sent reqA first and reqB second, the first response it gets from the server will be respA, and the second will be respB (if either of them ever arrives, of course).
Note that the fact that the ACK is not shown to the application layer is not an implementation limitation; it is a fundamental part of the design of TCP. The ACK is generated by the TCP stack once the packet is received. This doesn't guarantee in any way that the application will ever see the packet: the application could crash, for example, and never receive (and thus act on) the message. The in-order guarantee works instead because the transport layer can simply avoid ever showing reqB to the application layer if it knows there was a reqA before it that hasn't arrived. So the transport can in fact ensure that it's impossible for the application to see out-of-order data, but it can't ensure that the application has actually received any data.
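For the pipelining point, a minimal sketch over a raw TCP socket; it assumes an HTTP/1.1 server that tolerates pipelined requests (many deployments don't), and example.com is only a placeholder host:

```python
import socket

# Two GETs written back to back on one TCP connection. Because TCP
# delivers bytes in order, the first response on the stream must
# answer the first request -- no request ID is needed to pair them up.
HOST = "example.com"  # placeholder; the server must tolerate pipelining
requests = (
    f"GET / HTTP/1.1\r\nHost: {HOST}\r\n\r\n"
    f"GET / HTTP/1.1\r\nHost: {HOST}\r\nConnection: close\r\n\r\n"
).encode()

with socket.create_connection((HOST, 80)) as sock:
    sock.sendall(requests)
    chunks = []
    while data := sock.recv(4096):
        chunks.append(data)

raw = b"".join(chunks)
print(raw.count(b"HTTP/1.1 "), "responses, in request order")
```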
I never said UDP wasn’t sometimes fit for purpose. I said it’s unreliable compared to TCP. Those are not the same statement. Sometimes reliability isn’t part of the spec, so you do without that or build your own reliability around it if the less reliable option is still a better fit.
This is common in the industry, actually. The 'I' in RAID stands for inexpensive, because it's a redundant array. We scale horizontally these days where we can, because more, cheaper servers can sometimes scale further and more reliably than fewer, more premium servers. Heck, a paper cup is unreliable compared to ceramic or stainless steel, but Starbucks and Tim Hortons move a lot of product in them.
Either that or it's slop that somehow made it to the front page.
My understanding is that in Linux, TCP on localhost (including packets sent to a public address on a local interface) bypasses the stack; I don't see why it would be a problem for UDP to do the same.
This is in contrast with FreeBSD, where TCP to localhost is actually packetized and queued on the loopback interface, and you can experience congestion collapse on loopback if you do things right (wrong).
I offer this more as a discussion point than as a definitive answer. There is more research on this tool than on others: https://www.nature.com/articles/d41586-025-02936-6
And I think it's interesting that it flags the article as confidently AI-generated. I also got a whiff of AI, and I'm never sure how to take confirmation bias with AI detectors -- though, that said, I've gotten a whiff of AI before and had this detector say the text was confidently human.
Reading this I kept waiting for the... point? I feel like the whole thing was more like saying "UDP is like a convertible because you can strap a tarp on top when it rains". Like... sure? But that tarp is going to be crappy compared to a real roof. And the idea that that tiny layer is "the best of both worlds" is frankly ridiculous to me.
Well, no, it's unreliable in the sense that if I order a package online, the courier may get lost on the way and never arrive with my package; then I have to order it again if I want it delivered, and hope that this time it actually arrives. In that case, it kind of does fail to do its job if you want things to be delivered.
It's only when you pass through a router that's specifically instructed "drop these first" that you run into drops.
Your chances are better but you can't assume packet loss or data errors can't happen just because the fiber is right in front of you.
It's far more likely if you get a bit flip that the layer 1 hardware will just drop the packet because it has a much better checksum that detected the error first. Even if you know every part of the hardware and software stack you have to plan for an occasional packet drop.
The statement _probably_ works out _most_ of the time in physical setups (especially in server environments with strict ECC memory), but it's not something you can write a protocol around if you need all of that data to _actually_ arrive.
Talking about what UDP provides is a little bit like talking about what IP provides. It's almost a category error. Is "UDP" "reliable"? It's as reliable as whatever transport you build on it is!
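To make the "as reliable as whatever you build on it" point concrete, here is the smallest possible reliability layer, stop-and-wait ARQ, sketched over UDP (the timeout, retry count, and 4-byte header are arbitrary assumptions; real transports add sliding windows, congestion control, and connection state):

```python
import socket, struct

def reliable_send(sock, dest, payload, seq, timeout=0.2, retries=5):
    # Prefix the payload with a sequence number and retransmit until
    # the peer echoes that number back as an ACK.
    packet = struct.pack("!I", seq) + payload
    sock.settimeout(timeout)
    for _ in range(retries):
        sock.sendto(packet, dest)
        try:
            ack, _ = sock.recvfrom(4)
            if struct.unpack("!I", ack)[0] == seq:
                return True   # delivery confirmed for this sequence number
        except socket.timeout:
            continue          # lost packet or lost ACK: try again
    return False              # give up; the channel really is unreliable
```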
In all fairness, though, there are quite a few application protocols which are built directly on top of UDP with no explicit intermediate transport layer. DNS, RTP, and even sometimes SIP come immediately to mind.
The RTP part will carry the "can you repeat that" part, unless you're sending DMs over SIP INFO messages.
TCP is also barely enough to be a real "protocol". It does a bunch of useful stuff, but in the end a raw byte stream is pretty useless as a protocol on its own. That's why everything from IRC to HTTP layers its own protocol on top.
SCTP is a bit of a weird outlier to be placed at the level of TCP and UDP, but there's no need to go all QUIC and reinvent TCP if you want to use UDP. DNS, NTP, and plenty of other protocols don't need to rebuild reliability and streams to be useful.
UDP also isn't the only transport protocol that gets extended, either; it's just far more common because it makes fewer assumptions. E.g. TCP has MPTCP as a bolt-on transport extension.
This isn't just a semantic argument. People getting this conceptually wrong has broken the deployment story for things like SCTP, which for no good reason rides on top of IP directly and thus gets blocked by middleboxes.
A classic practical example of "plain Jane UDP transport" might be a traditional SNMP trap (not one of the newer, fancier flavors). No ACK, no bidirectional session setup, no message ID/sequence number, no retransmission, no congestion control -- just a one-shot blast of info to the given destination, split into datagrams and multiplexed to the receiving service via the transport layer over an arbitrary network layer.
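In code, that one-shot pattern is about as small as networking gets. A sketch only: this is not a real SNMP encoder, and the payload and collector address are placeholders for illustration (192.0.2.10 is a TEST-NET address):

```python
import socket

# Fire-and-forget, trap-style: one datagram, no session, no ACK, no
# retransmission, no sequence number.
COLLECTOR = ("192.0.2.10", 162)   # 162 is the conventional trap port
payload = b"linkDown if=eth0"     # stand-in for a BER-encoded trap PDU

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(payload, COLLECTOR)   # one shot; success is never confirmed
sock.close()
```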
A lot of things broke SCTP... but I'd rather not get into a side debate about what those reasons are. Just that it's not because UDP alone is unusable as a transport layer protocol.
> Consistent with the goal of minimizing complexity of the management agent, the exchange of SNMP messages requires only an unreliable datagram service, and every message is entirely and independently represented by a single transport datagram. While this document specifies the exchange of messages via the UDP protocol [11], the mechanisms of the SNMP are generally suitable for use with a wide variety of transport services.
From this, you can see the authors intentionally kept every message within a single datagram across any unreliable datagram service -- UDP was just an obvious choice to define against, given the needs.
> In the text that follows, the term transport address is used. In the case of the UDP, a transport address consists of an IP address along with a UDP port. Other transport services may be used to support the SNMP. In these cases, the definition of a transport address should be made accordingly.
They went on to account for and allow for other generic transport protocols (of which there weren't many at the time), rather than assuming TCP and UDP were the only options.
> In cases where an unreliable datagram service is being used, the RequestID also provides a simple means of identifying messages duplicated by the network.
This shows that other portions of SNMP did account for which specific features might need to be built on top of a minimal transport protocol, and added those only to the specific PDUs that needed them. E.g. for the request-based PDUs used by Get/GetNext/GetBulk etc., they intentionally added an ID to handle message duplication -- but not to every PDU, like traps, where it was unnecessary.
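That duplicate-detection idea is simple enough to sketch. The bounded window below is my own assumption for illustration, not something the RFC prescribes:

```python
from collections import OrderedDict

class RequestIdFilter:
    # Keep a bounded window of recently seen RequestIDs and drop
    # datagrams the network has duplicated.
    def __init__(self, window=1024):
        self.seen = OrderedDict()
        self.window = window

    def is_duplicate(self, request_id):
        if request_id in self.seen:
            return True
        self.seen[request_id] = True
        if len(self.seen) > self.window:
            self.seen.popitem(last=False)  # evict the oldest entry
        return False
```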
> A limited number of unsolicited messages (traps) guide the timing and focus of the polling. Limiting the number of unsolicited messages is consistent with the goal of simplicity and minimizing the amount of traffic generated by the network management function.
This shows the design of traps was heavily focused on simplicity and minimizing traffic, not on what TCP or UDP could specifically offer. In fact, you won't find a mention of "checksum" or "hash" anywhere in the RFC either -- UDP just had it as extra cruft on top of the generic "unreliable datagram service" they were designing against!
SNMPv3 did eventually add TCP as an option for traps a couple of decades later, and hardly anyone ever opted to use it, since there really isn't much benefit from other transports for the use case. More have used the TLS option, but even more have just relied on the minimal, purpose-defined encryption and HMACs added instead.
Thanks for this discussion by the way, there is nothing more I love working with or talking about than network protocol design and history :).
Not caring about whether every single transmission arrives is a totally legitimate technological choice and it can have some tangible benefits when it comes to throughput, latency etc.
As with every piece of technology, UDP gets problematic when you are ignorant of its limitations and misapply it.
I'd compare it to the mail. UDP is regular first-class mail. You send a letter, it probably gets to the address on the envelope. But it might get delayed, or even simply lost, and you'll never know.
TCP is like certified mail. It is sent by the same physical systems, but it is tracked and you get an acknowledgement that the recipient received it.
(It makes spoofing and amplification attacks easy, too.)
Talking up UDP like it's something special is part of the business strategy.
And, to be fair, a lot of TCP-based or even HTTP-based application protocols could probably have been UDP without any trouble.
* https://en.wikipedia.org/wiki/Datagram_Congestion_Control_Pr...
* https://en.wikipedia.org/wiki/Stream_Control_Transmission_Pr...
Curious to know if things can ever move toward allowing this. I know many firewall 'languages' have nomenclature like "allow tcp/443" or "allow udp/53"; perhaps something like "allow stream/443" and "allow dgram/53" to include (e.g.) TCP/SCTP and UDP/DCCP would allow more flexibility.
* https://datatracker.ietf.org/doc/html/rfc4336#section-2
The only thing I get from that document that answers this question to some extent is the extra overhead of the checksum field in UDP headers. Elsewhere, they generally don't seem to discuss the option of building a QUIC-like transport, that is, a transport-layer protocol that still sits on top of the UDP skeleton.
That said, the old battle.net online play and Quake/Unreal multiplayer systems were widespread implementations of UDP transport with reliability checks. Whenever your UDP stream desynced, the game clock accelerated, replaying the buffered moves back to back until you were back in sync.
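A rough sketch of that catch-up behavior (the tick rate and 2x speedup factor are guesses for illustration, not taken from any of those engines):

```python
from collections import deque

class MoveBuffer:
    # Buffer local moves; after a desync, replay them back to back at
    # an accelerated tick rate until the simulation catches up.
    def __init__(self, tick_hz=30):
        self.pending = deque()
        self.tick = 1.0 / tick_hz

    def record(self, move):
        self.pending.append(move)

    def catch_up(self, apply_move, speedup=2.0):
        step = self.tick / speedup
        elapsed = 0.0
        while self.pending:
            apply_move(self.pending.popleft(), dt=step)
            elapsed += step
        return elapsed  # simulated time consumed while catching up
```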
https://www.youtube.com/watch?v=b8J7fidxC8s
Folks shouldn't necessarily argue that one is better so much as consider all the engineering reasons for choosing one technology over another.
UDP is guaranteed to not be reliable.