I Miss Using Em Dashes
Key topics
The author laments the decline of the em dash owing to its association with AI-generated content, sparking a debate among commenters about the significance of em dashes and the impact of AI on writing styles.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
First comment: 46m after posting
Peak period: 152 comments (Day 1)
Avg / period: 39.3 comments
Based on 157 loaded comments
Key moments
- Story posted: Sep 1, 2025 at 8:20 PM EDT (4 months ago)
- First comment: Sep 1, 2025 at 9:06 PM EDT (46m after posting)
- Peak activity: 152 comments in Day 1 (the hottest window of the conversation)
- Latest activity: Sep 10, 2025 at 5:52 PM EDT (4 months ago)
But I agree that triple em-dash for pause is not half bad either. I could see it becoming a thing, with how it goes the opposite direction and is so over the top :)
Think text much more likely from robot than first thought
Grug say this change too big from just one em dash
This is necessary nuance that I'll have to take into consideration. Thank you.
In the 20th century, there were two spaces after an end-of-sentence period. (I still do that.)
[1] https://en.m.wikipedia.org/wiki/Non-breaking_space (see specifically the example section)
Only if you used a typewriter. I was using (La)TeX in the twentieth (1990s), and it defaulted to a rough equivalent of 1.5 spaces (see \spacefactor).
Two ('full') space characters were added because of (tele)typewriters and their fixed width fonts, and this was generally not used in 'properly' published works with proportional typefaces; see CMoS:
* https://www.chicagomanualofstyle.org/qanda/data/faq/topics/O...
Bringhurst's The Elements of Typographic Style (§2.1.4) concurs:
* https://readings.design/PDF/the_elements_of_typographic_styl...
* https://webtypography.net/2.1.4
* https://en.wikipedia.org/wiki/The_Elements_of_Typographic_St...
I could never agree with this, because in a monospace font the period already occupies a full-width cell, whereas it is much narrower in proportional fonts. That fact alone makes the visual gap about as wide as it would be in typeset proportional text. Adding a second space makes it much too wide visually (almost three positions wide). It looks like badly typeset justified text.
(I understand why people are doing it, I just don’t agree on aesthetic grounds.)
[1] https://en.m.wikipedia.org/wiki/Swastika
Which is to say, we all compromise.
I'd hate to lose my em- and en-dashes, but the original post seems to misuse en-dashes where hyphens belong (it could just be a font issue, but no matter).
I keep punctuating like it's the 18th Century, myself;—compound points are my favorites:—like the colon-dash compound, AKA the "dog's bollocks."
Eventually, as models and their users both improve, we'll collectively realize that trying to reliably discriminate between AI and human writing is no different than reading tea leaves. We should judge content based on its intrinsic value, not its provenance. We should call each other out for poor writing or inaccurate information — not because if we squint we can pick out some loose correlations with ChatGPT's default output style.
Consciously trying not to "sound like an LLM" while writing is like consciously trying not to think about the fact that you're currently breathing, or consciously trying to sound like a cool guy.
I don't use AI in my writing. If I were still in school would I be tempted? Probably. But in work and personal writing? Never crosses my mind.
The stakes are a bit different for students, unfortunately, who'll have their writing passed through some snake-oil AI detector arbitrarily. This is unfortunate because “learning how not to trigger an AI detector” is a totally useless skill.
Generally, I don’t think we need AI detection. We need dumb bullshit detection. Humans and LLMs can both generate that. If people can use an LLM in a way that doesn’t generate dumb bullshit, I’m happy to read it.
There are zillions of words produced every second, your time is the most valuable resource you have, and actually existing LLM output (as opposed to some theoretical perfect future) is almost always not worth reading. Like it or not (and personally I hate it), the ability to dismiss things that are not worth reading like a chicken sexer who's picked up a male is now one of the most valuable life skills.
Of course there are cases where you can tell that some text is almost certainly LLM output, because it matches what ChatGPT might reply with to a basic prompt. You can also tell when a piece of writing is copied and pasted from Wikipedia, or a copy of a page of Google results. Would any of that somehow be more worth reading if the author posted a video of themselves carefully typing it up by hand?
1: You're assuming a specific type of output in a specific type of context. If LLM output were never worth reading, ChatGPT would have no users.
Having good heuristics to make quick judgements is a valuable life skill. If you don't, you're going to get swamped.
> Would any of that somehow be more worth reading if the author posted a video of themselves carefully typing it up by hand?
No, but the volume of carefully hand-typed junk is more manageable. Compare with spam: Individually written marketing emails might be just as worthless as machine-generated mass mailings, but the latter is what's going to fill up your inbox if you can't filter it out.
> If LLM output were never worth reading, ChatGPT would have no users.
Only if all potential users were wise. Plenty of people waste their time and money in all sorts of ways.
What we're discussing is whether a set of heuristics to determine whether something is the output of a human or an LLM are "stupid", not whether we should put LLMs in charge of critical work. It's not the LLMs we're talking about, but the heuristics to detect them.
The person you initially replied to claimed the heuristics (to detect LLMs) are not stupid if they are shown to work (by detecting LLMs). The person they were replying to claimed such heuristics were useless.
Why do you think it's not a good heuristic to be able to quickly spot the tell-tale signs of LLM involvement, before you've wasted time reading slop?
Yes, there will be false positives. It's a heuristic after all.
If anything, I'd rather that renderers like Markdown just all agree to change " - " to an en dash and " -- " to an em dash. Then we could put the matter to bed once and for all.
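A minimal sketch of that rule as a plain-text preprocessing pass, in Python; the function name and the choice to close up the surrounding spaces are illustrative assumptions, not anything existing renderers have standardized:

    def smart_dashes(text: str) -> str:
        # Replace the longer pattern first so " -- " is not half-consumed
        # by the single-hyphen rule below.
        text = text.replace(" -- ", "\u2014")  # spaced double hyphen -> em dash
        text = text.replace(" - ", "\u2013")   # spaced single hyphen -> en dash
        return text

    print(smart_dashes("pages 3 - 9 -- give or take"))  # pages 3–9—give or take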
Citation needed.
> Who is it helping if we collectively bully ourselves into excising a perfectly good punctuation mark from human language?
Humans can adapt faster than LLM companies, at least for the moment. We need to be willing to play to our strengths.
Who is it helping if we bully ourselves into ignoring a simple, easy "tell"?
https://en.wikipedia.org/wiki/Dash
> Humans can adapt faster than LLM companies
No one said anything about LLM companies. If I were a spammer today, I'd just have my code replace dashes in LLM output with hyphens before posting it. As a human, I'm not going to suddenly stop using dashes because a handful of people are treating a silly meme as if it were a genuinely useful heuristic.
That maybe backs up the claim that it's standard, but not that it's widely used or the false positive rate would be unacceptably high.
> If I were a spammer today, I'd just have my code replace dashes in LLM output with hyphens before posting it.
No you wouldn't, for the same reason spammers don't put more plausible stories in their emails: they want to filter for the most gullible segment before investing any human effort.
I was just curious why you've decided paying attention to them is a bad heuristic. Sure, it can change once people instruct their LLMs not to use them, but still, for now, they sure seem to overuse them!
That and "let's unpack this". I swear, I'll forbid ChatGPT from using "unpack" ever again, in any context!
So the only real purpose of the heuristic is to add a tiny extra vote of confidence when I see a comment that otherwise appears to be lazy ChatGPT copypasta, but in such cases I'll predict that it was probably LLM output either way, and I'll judge that it appears to be poor writing that isn't worth my time regardless of whether or not an LLM was involved.
Fundamentally, the issue I'm seeing here is that we're all talking over each other because we need a better standardized term than "LLM output". I suppose "slop" could work if we universally agreed that it referred only to a subset of LLM output, rather than being synonymous with LLM output in general, but I'm not sure that we do universally agree on that.
If someone types the equivalent of a Google search into ChatGPT, or a spammer has an automated process generically reply to social media posts/comments, that's what qualifies to me as "slop". Most of us here have seen it in the wild by now, and there's obviously a distinctive common style (at least for now), and I think we can all agree that it sucks. That's very different from someone investing time and/or expertise to produce content that just happens to involve an LLM as one of the tools in their arsenal; the attitude that it isn't is just the modern equivalent of considering cellular phone calls or typed letters to be "impersonal".
I'm not suggesting that LLM output doesn't tend to have a higher density of em dashes than human output. I'm just pushing back on the idea that presence of em dashes is sufficient evidence to dismiss something as probably-LLM-generated, which is no better than superstition. I mean, I've used em dashes in a number of comments in this thread, and no one has accused me of using an LLM, so it can't be a pattern that anyone puts too much stock in.
It seems like you’re just wrong here? Em dashes aside, the ‘style’ of llm generated text is pretty distinct, and is something many people are able to distinguish.
If organizations like schools are going to rely on tools that claim to detect AI-generated text with a useful level of reliability, they better have zero false positives. But of course they can't, because unless the tool involves time travel that isn't possible. At best, such tools can detect non-ASCII punctuation marks and overly cliched/formulaic writing, neither of which is academic dishonesty.
Additionally, I do think it is worthwhile to determine whether a piece of text is valuable, or more precisely, whether it is what I’m looking for. As others have said, if I want info from a LLM about a subject, it is trivial for me to get that. Oftentimes I am looking for text written by people though.
I was basing that on a few factors, off the top of my head:
1. Someone might pick up mannerisms while using LLMs to help learn a new language, similarly to how an old friend of mine from Germany spoke English with an Australian accent because of where she learned English.
2. Lonely or asocial people who spend too much time with LLMs might subconsciously pick up habits from them.
3. Generation Beta will never have known a world without LLMs. It's not that difficult to imagine that ChatGPT will be a major formative influence on many of them.
> As others have said, if I want info from a LLM about a subject, it is trivial for me to get that.
Sure, it's trivial for anyone to look up a simple fact. It's not so trivial for you to spend an hour deep-diving into a subject with an LLM and manually fact-checking information it provides before eventually landing on an LLM-generated blurb that provides exactly the information you were looking for. It's also not trivial for you to reproduce the list of detailed hand-written bullet points that someone might have provided as source material for an LLM to generate a first draft.
I think nobody is upset about reading an LLM's output when they are directly interacting with a tool that produces such output, such as ChatGPT or Copilot.
The problem is when they are reading/watching stuff in the wild and it suddenly becomes clear it was generated by AI rather than by another human being. Again, not in a context of "this pull request contains code generated by an LLM" (expected) but "this article or book was partly or completely generated by an LLM" (unexpected and likely unwanted).
I like how this is presented as a given thing that will happen, that models are going to just improve forever. That there isn’t some plateau on “user skill with LLMs” like it’s fucking calculus mixed with rocket science that only the elite users will ever attain full fluency in using.
This is starting to read like religious cult propaganda, which is probably scarier than whatever else ends up happening with this shit.
Google fu, before Google fucked everything up, was not hard to learn, and then plateaued. It’s not like it was hard to do, and it’s not like once you figured it out there was this boundless growth potential to keep learning. You learned algebra one. Congrats, that’s all there was to it.
Tech literacy I’m not even going to address, because again either you don’t understand what that means or you’re being intentionally obtuse.
Assuming prompt engineering is hard, and assuming that LLMs are going to continue to make any kind of substantial leap without _any_ evidence other than blind faith, is as close to believing in a religion as it gets. Having blind “faith” in this house of cards, saying things like “when things continue to advance” without any evidence that there will be any advancement, is absolutely insane.
I’m having a hard time believing you needed me to spell that out.
An 80-year-old who barely uses computers and still types full sentences into Google (or who struggled for years to unlearn that habit) might find LLMs hard to master. Someone with poor written communication skills might find LLMs hard to master. Shockingly, it turns out that different people have different skills and life experiences.
I never used the word "faith". I'm not sure why you feel the need to make up a straw man to attack rather than respond to my comment as written, or why you feel the need to repeatedly insult me and accuse me of bad faith. It sounds like you're more interested in trying to win some perceived argument than engaging in constructive discussion.
> Eventually, as models and their users improve
You’re pushing opinion and assumption as fact. Stop doing that.
Do you honestly believe that the LLM tech landscape and end user competency with them will both look exactly the same in 2050 as they do in September 2025? You don't think the codebases of social media spambots will at least have become sophisticated enough to avoid copying the default writing style of a basic ChatGPT response? This is a very conservative prediction. Based on the vitriol you've been responding with, one would think I'd written that AGI was around the corner and anyone who disagreed with me was an idiot.
But the people you will reach online will be online, and not some random person-off-the-street. The average person on the street will give the same blank stare on the topic of compilers, regular expressions, black-holes, or robotics, but I still want to read about those topics. And if I want an LLM's take on those topics, everyone knows where to turn to get that.
I think there is a very interesting discussion to be had over how LLMs are actively changing the way we write, or even speak.
"delve" was a red flag 650 years ago!
When Adam delved and Eve span, who was then the gentleman? — Fr John Ball's sermon addressing the rebels of the Peasants' Revolt, 1381
I only have a limited amount of time to read. Skipping someone's Internet comment because it looks like spam often means I get to engage with something else.
If someone who typically bills $500/hr spends 30 - 60 minutes on a comment or blog post, that's still $250 - 500 worth of their time invested regardless of whether or not an LLM was involved. An LLM is comparatively cheaper than hiring a human editor or research assistant, but it's not negative cost.
Likewise, prompting ChatGPT with "write a blog post about bees" may be cheaper than hiring someone off Fiverr to respond to the exact same prompt, but in either case the resulting content will be low-value (yet still higher-value than the string "write a blog post about bees") because its source material was cheap. The fact that the latter version would have been written by a human is incidental.
Compose --- should produce —
For en dash it's
Compose --. produces –
Not all fonts show the difference though.
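For reference, on most Linux systems these sequences come from the default X11 Compose table, and the same entries can be copied into a per-user ~/.XCompose file (paths and exact entries can vary by distribution, so treat this as a sketch):

    <Multi_key> <minus> <minus> <minus>  : "—"  U2014  # EM DASH
    <Multi_key> <minus> <minus> <period> : "–"  U2013  # EN DASH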
A lot of people are comfortable using the dot, the comma, and maybe exclamation marks.
AI-speech seems to strive for more formal writing by default.
An em dash that’s not a sudden interruption shouldn’t have any spacing around it.
It is an interruption to me, and I think that little pause is intentional. If the author wants no pause, they should have used parentheses.
Also, the AP Style guide is hardly relevant when it comes to most writing—especially creative writing.
Maybe I’ll take a short pause in a sentence–or show a huge range 0 — 999.
(On the other hand, maybe it's just low-paid writers in South Africa: https://www.theguardian.com/technology/2024/apr/16/techscape... )
When you are talking, an aside can make a lot of sense because you are thinking and speaking in real time. When you write you have the luxury of time to reformulate your words more precisely. Em dashes are best kept for prose that mimics speech rather than constructing logical text.
It's no coincidence that em dashes are rare in legal texts because they are too imprecise, whereas semicolons are extremely common there.
The S in semicolon stands for S-Tier. Maybe the E in em dash stands for E-Tier?
lolz
I believe you meant "not the no-talent ass clown".
This video will be too voluminous or intrusive to be viewed manually, so it will be analyzed by (you guessed it) AI to determine if the work was authentic.
It will probably be developed and required by the corrupt education industry, but perhaps some writers will voluntarily use it to buy authenticity or stand out. But either way, the machine will once again find another way to take our agency and make our lives less enjoyable.
https://en.wikipedia.org/wiki/Eats,_Shoots_%26_Leaves
Remember, meaning is based on common usage, so now that the em dash is slop-nonymous, semicolons can take on a more casual vibe.
For example: I love pizza — it's my comfort food.
Can just become: I love pizza; it's my comfort food.
For asides: I love pizza — especially pepperoni.
Can just become: I love pizza (especially pepperoni).
Other than splitting infinitives and ending sentences with a preposition, of course. They are a weighty burden no soul should have to ever put up with.
I grew up online on teletypes and the ADM5. To some extent, my sense of how text presents is dominated by monospace/fixed-width type, and em dashes just never worked in that 7-bit world.
Two hyphens is too much. One hyphen is not enough.
> not chatgpt output—i'm just like this.
> https://xkcd.com/3126/ — Disclaimer
The key to using it without the LLM stigma is surrounding it with spaces, which still doesn't violate typical writing rules.
> “This was not just X; it’s really Y”
Here are some real examples taken from various sources:
> "Regenerative businesses don't just minimise harm; they actively create positive change for the environment and people."
> "This milestone isn’t just about our growth. It’s about deepening our commitment to you…"
> "This wasn’t just a market rally. It was a real-time lesson in how quickly sentiment can fracture and recover when fundamentals remain intact."
Hard to say for certain that this is AI slop, but just like em dashes, I see it routinely pop up in LLM prose. And I feel like it’s infected nearly everything I’ve read that was written within the last year.
Sam Altman more than anyone else popularised this style, and for a while every third or fourth comment on any AI-related topic was all lowercase.
Anyway, I've used em- and en-dashes for a long time now (had them built into my keyboard layout on the "3rd level", AltGr+key), and I'm not going to stop now.
Getting your knickers in a twist over a minor typographical construction is a rather contrived indicator of a non-human author. It will do for now but won't tomorrow.
I can mostly spot LLM output on sight but I can be fooled. I never use silly rules like "em-dash => LLM". That's just silly.
I don’t use the word “delve” anymore, however.
4 more comments available on Hacker News