ChatGPT Discussion: In Non-Security Applications, Is MD5 Preferable?
Posted 3 months ago, active 3 months ago
chatgpt.com | Tech | story
Key topics
Cryptography
Hash Functions
MD5
The discussion revolves around whether MD5 is preferable in non-security applications, with participants weighing its performance against its known security vulnerabilities.
Snapshot generated from the HN discussion
Discussion activity: light
- First comment: 38m after posting
- Peak period: 2 comments in 2-3h
- Avg / period: 1.3
Key moments
- Story posted: Oct 10, 2025 at 2:41 AM EDT
- First comment: Oct 10, 2025 at 3:20 AM EDT (38m after posting)
- Peak activity: 2 comments in 2-3h
- Latest activity: Oct 10, 2025 at 5:01 AM EDT
ID: 45535965 | Type: story | Last synced: 11/17/2025, 11:13:12 AM
You kind of have to push LLMs to self-evaluate to get good answers, although I am not sure what I would take from this on a technical level. I would use MD5 for non-security applications. I even think you could use it in a security context if you really know how collisions can be created and whether that would interfere with your application of it. Better advice, of course, is to not do that.
Although, thinking about it: what are non-security applications of hashes? Database indexing comes to mind, where collision avoidance is the opposite of what you want. So what remains? For file integrity I would use something from the SHA-2 family, but I don't see how MD5 would perform worse. Are there more obvious applications?
Perhaps the initial answer isn't really technically correct, but I wouldn't say it is bad advice.
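For the file-integrity case mentioned above, here is a minimal Python sketch, assuming only accidental corruption and no attacker. The helper name `file_digest` and the chunk size are illustrative choices, not something from the thread. Swapping the algorithm name is the only change needed to move from MD5 to SHA-256.

```python
import hashlib


def file_digest(path: str, algorithm: str = "md5", chunk_size: int = 65536) -> str:
    """Return the hex digest of a file, reading it in chunks to bound memory use.

    `algorithm` is any name accepted by hashlib.new, e.g. "md5" or "sha256".
    """
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"" (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

Comparing `file_digest(path)` before and after a copy or transfer detects accidental changes with either algorithm; the choice only starts to matter once someone may craft the file contents on purpose.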
For example, in a distributed, event-based LAN chat, I used MD5 for an "integrity chain". Every new event ID is the hash of the old event ID plus some random bytes. This way you can easily find the last matching event two systems have in common. A random ID alone isn't enough when two instances integrate an event from a third system while one of the two added a new event just before that.
No security is needed, and speed doesn't matter much, since it is not designed for high throughput. MD5 seems like a very good choice because it's easy to work with and can be verified on every system.
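The chain described above could look roughly like the following Python sketch. The commenter's actual code is not shown in the thread, so everything here is an assumption made for illustration: the `(event_id, salt)` tuple layout, the all-zero `GENESIS_ID`, the 8-byte salt, and the function names.

```python
import hashlib
import os
from typing import List, Optional, Tuple

# Hypothetical event layout: (event id, random bytes used to derive that id).
Event = Tuple[bytes, bytes]

# Assumed shared starting point for every log (16 zero bytes, the MD5 digest size).
GENESIS_ID = b"\x00" * 16


def append_event(log: List[Event], salt: Optional[bytes] = None) -> Event:
    """Append an event whose id is MD5(previous id + random bytes)."""
    prev_id = log[-1][0] if log else GENESIS_ID
    salt = os.urandom(8) if salt is None else salt
    event = (hashlib.md5(prev_id + salt).digest(), salt)
    log.append(event)
    return event


def verify_chain(log: List[Event]) -> bool:
    """Recompute every id from its predecessor and stored salt."""
    prev_id = GENESIS_ID
    for event_id, salt in log:
        if hashlib.md5(prev_id + salt).digest() != event_id:
            return False
        prev_id = event_id
    return True


def last_common_index(a: List[Event], b: List[Event]) -> int:
    """Index of the last event both logs share, or -1 if they share none."""
    last = -1
    for i, (ea, eb) in enumerate(zip(a, b)):
        if ea[0] != eb[0]:
            break
        last = i
    return last
```

Because each ID depends on the one before it, two logs that agree on the ID at some position also agree on everything before it (barring a collision), which is what makes the last common event easy to locate.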
But this is a security case that requires a hostile actor. If the problem is just checking data integrity, or in this case data identity, with no danger of manipulation, MD5 should perform fine. I don't see a problem with your use case. I am no expert here, and there are probably more optimal hashes, but MD5 has the advantage of being widely implemented in all kinds of systems.
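To put a rough number on "should perform fine" when nobody is crafting inputs, the standard birthday approximation can be sketched in Python. This assumes MD5 outputs behave like uniformly random 128-bit values for non-adversarial data, which is an assumption of the sketch and not a claim made in the thread; the function name is hypothetical.

```python
import math


def accidental_collision_probability(n_items: int, bits: int = 128) -> float:
    """Birthday approximation: p ~= 1 - exp(-n(n-1) / (2 * 2**bits)).

    Assumes hash outputs are uniformly random, i.e. no adversary picks the inputs.
    """
    exponent = n_items * (n_items - 1) / (2 * 2 ** bits)
    # -expm1(-x) computes 1 - exp(-x) without losing precision for tiny x.
    return -math.expm1(-exponent)
```

Under that assumption, even a billion distinct items give a probability on the order of 1e-21, so accidental collisions are not the practical concern. Deliberately constructed collisions are a different matter and are exactly the hostile-actor case.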
Because understanding the intrinsic weaknesses of hashes isn't trivial, many just recommend "MD5 is broken, don't use it". I think this is just to be on the safe side. Many applications would probably be fine, but because erring on the side of caution is safer, people sometimes say that MD5 is the worst hash function ever conceived.