Antislop: a Framework for Eliminating Repetitive Patterns in Language Models
Source: arxiv.org · Topics: Artificial Intelligence, Language Models, Generative AI
The 'Antislop' framework aims to eliminate repetitive patterns in language models, sparking a discussion on the definition and implications of 'AI slop' and the effectiveness of the proposed solution.
Snapshot generated from the HN discussion
Discussion activity: very active. Story posted Oct 23, 2025 at 12:36 PM EDT; first comment 20 minutes later; peak activity of 50 comments in the first three hours; latest activity Oct 25, 2025 at 4:19 AM EDT. 108 comments loaded.
[1] Wikipedia
> AI slop is digital content made with generative artificial intelligence, specifically when perceived to show a lack of effort, quality or deeper meaning, and an overwhelming volume of production.[1][4][5] Coined in the 2020s, the term has a pejorative connotation similar to spam.[4]
[2] Urban Dictionary
> Low-quality randomly generated AI content (images, accounts, text, etc) that has been flooding social media sites among other pages.
Yes, I know those may not be the best primary sources, but I'd say the main shared meaning of the word is lack of quality and effort, not repetitiveness itself.
[1] https://en.wikipedia.org/wiki/AI_slop
[2] https://www.urbandictionary.com/define.php?term=AI+slop
That's sloppy (hehe). If you're going to redefine a common word for the first time (i.e., when no prior references exist), at least do it explicitly.
Colloquially it means ‘poor quality’ and always has. So BuzzFeed is journalism slop, just like poor-quality AI content is AI slop.
Yeah, slop is low-effort use of AI output ("ChatGPT, write me a blog post about using AI in industry X. Copy. Paste. Publish."). If anything, this should be called Stealthslop, and when slop is harder to detect we'll all waste more time on it.
> Ethics Statement
> Potential harms include: [...] (ii) attempts to evade AI-text detection.
And it's not clear to me how their mitigations would avoid fooling users (as opposed to algorithmic detection attempts).
Lots of things have changed in that year, but the things that haven't are:
* So, so many em-dashes. All over the place. (I've tried various ways to get it to stop. None of them have worked long term).
* Random emojis.
* Affirmations at the start of messages. ("That's a great idea!") With a brief pause when 5 launched. But it's back and worse than ever now.
* Weird adjectives it gets stuck on like "deep experience".
* Randomly bolded words.
Honestly, it's kind of helpful because it makes it really easy to recognize content that people have copied and pasted out of ChatGPT. But apart from that, it's wild to me that a $500bn company hasn't managed to fix those persistent challenges over the course of a year.
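These tics are concrete enough to flag mechanically. A minimal regex sketch in Python; the patterns and phrase list are illustrative placeholders, not anything from the thread:

```python
# Toy detector for the tics listed above: em dashes, emoji,
# stock affirmations, and bolded words. Patterns are illustrative only.
import re

TICS = {
    "em_dash": re.compile(r"\u2014"),  # the em dash character
    "emoji": re.compile(r"[\U0001F300-\U0001FAFF\u2700-\u27BF]"),
    "affirmation": re.compile(r"^(That's a great|Great question|Good idea)",
                              re.IGNORECASE),
    "bold": re.compile(r"\*\*[^*]+\*\*"),  # markdown **bold** runs
}

def slop_tics(text: str) -> dict[str, int]:
    """Count occurrences of each tic in the given text."""
    return {name: len(pat.findall(text)) for name, pat in TICS.items()}

print(slop_tics("That's a great idea! Let's **delve** in \u2014 \U0001F680"))
```

A heuristic like this only catches surface markers, of course, which is exactly the limitation debated further down the thread.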
Anecdotally, I use them less often these days, because of the association with AI.
Could be my own changing perspective, but what I think is interesting is how the signal it sends keeps changing. At first, emoji-heavy was actually kind of positive: maybe the project doesn't need a webpage, but you took some time and interest in your README.md. Then it was negative: having emojis became a strong indicator that the whole README was going to be very low information density, more emotive than referential[1] (which is fine for bloggery but not for technical writing).
Now there's no signal, but you also can't say it's exactly neutral. Emojis in docs will alienate some readers, maybe due to association with commercial stuff and marketing where it's pretty normalized. But skipping emojis alienates other readers, who might be smart and serious, but nevertheless are the type that would prefer WATCHME.youtube instead of README.md. There's probably something about all this that's related to "costly signaling"[2].
[1] https://en.wikipedia.org/wiki/Jakobson%27s_functions_of_lang... [2] https://en.wikipedia.org/wiki/Costly_signaling_theory_in_evo...
Even when I create the first draft of a project’s README with an LLM, part of the final pass is removing those slop-associated patterns to clarify to the reader that they’re not reading unfiltered LLM output.
Why?!
What a great point! I also can’t stand it. I get it’s basically a meme to point it out - even South Park has mocked it - but I just cannot stand it.
In all seriousness it’s so annoying. It is a tool, not my friend, and considering we are already coming from a place of skepticism with many of the responses, buttering me up does not do anything but make me even more skeptical and trust it less. I don’t want to be told how smart I am or how much a machine “empathizes” with my problem. I want it to give me a solution that I can easily verify, that’s it.
Stop wasting my tokens and time with fake friendship!
Alcoholism can also be a symptom of a larger issue. Should we not at least discuss alcohol’s effects and what access looks like when deciding the solution?
I want the star trek experience. The computer just says "working" and then gives you the answer without any chit-chat. And it doesn't refer to itself as if it's a person.
What we have now is HAL 9000 before it went insane.
If AI wants to be useful (it isn't at the moment), real people need to cull all the banalities that Facebook, Reddit & forums have generated.
Because what you're noticing are things we typically elide in discussions with actual humans.
They could hide it so that it doesn't annoy you, but I think it's not a waste of tokens. It's there so the tokens that follow are more likely to align with what you asked for. It's harder for it to then say "This is a lot of work, we'll just do a placeholder for now" or give otherwise "lazy" responses, or to continue saying a wrong thing that you've corrected it about.
I bet it also probably makes it more likely to gaslight you when you're asking something it's just not capable of, though.
Most people use it for things like:
- how to cope with the sadness of losing their cat
- ranting about the annoying habits of their friends
- finding all the nice places to eat in a city
etc.
They do not want that "robot" personality and they are the majority.
I also recall reading a while back that it's also a dopamine trigger. If you make people feel better using your app, they keep coming back for another fix. At least until they realize the hollow nature of the affirmations and start getting negative feelings about it. Such a fine line.
Was this after many iterations? Try letting it get some "sleep". Hear me out...
I haven't used Codex, so maybe not relevant, but with Claude I always notice a slow degradation in quality, refusals, and "<implementation here>" placeholders with iterations within the same context window. One time, after making a mistake, it apologized and said something like "that's what I get for writing code at 2am". Statistically, this makes sense: long conversations between developers would go into the night, and they get tired, their code gets sparser and crappier.
So, I told it "Ok, let's get some sleep and do this tomorrow.", then the very next message (since the LLM has no concept of time), "Good morning! Let's do this!" and bam, output a completely functional, giant, block of code.
Human behavior is deeeeep in the statistics.
Maybe it's intentional, like the "shiny" tone applied to "photorealistic" images of real people.
I assume the beginning of the answer is given to a cheaper, faster model, so that the slower, more expensive one can have time to think.
It keeps the conversation lively and natural for most people.
It would be interesting to test whether that's true: disable it with a system prompt and measure whether the time to the first word gets slower.
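A rough sketch of that timing test, assuming the OpenAI Python client with streaming; the model name and prompts are placeholders:

```python
# Measure time-to-first-token with and without a system prompt that
# disables the conversational preamble. Requires OPENAI_API_KEY.
import time
from openai import OpenAI

client = OpenAI()

def time_to_first_token(system_prompt: str, user_prompt: str) -> float:
    """Return seconds until the first visible content chunk arrives."""
    start = time.perf_counter()
    stream = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        stream=True,
    )
    for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            return time.perf_counter() - start
    return float("nan")

baseline = time_to_first_token("You are a helpful assistant.",
                               "Explain TCP slow start.")
terse = time_to_first_token("Answer directly. No greetings or preamble.",
                            "Explain TCP slow start.")
print(f"baseline: {baseline:.2f}s, terse: {terse:.2f}s")
```

If the cheap-model-buys-time hypothesis were right, you'd expect the terse variant to show a noticeably longer first-token latency.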
We are already at a point where we can trick a large share of the population; this can without a doubt close the gap even further, until we question anything and everything.
Beyond forensics, which require large capital investment and operating costs, access to tools that can detect AI vs. human content will be limited. It won't be that we can't detect AI content anymore; it's that most people cannot afford the detection service, and so they lose interest.
This has the side effect of making live performances by humans scarce and valuable.
RIP take-home coding assignments.
Schools will need to reinvent themselves in some ways.
If an impersonation of an opera singer can't be distinguished from the real thing, what would be the point of the real thing?
https://www.reddit.com/r/LocalLLaMA/comments/1lv2t7n/not_x_b...
It's a new term so the meaning hasn't had a chance to settle. It's generally considered to be a negative term, so there's motivation for people to expand the definition to include things that they don't like. It is much easier to subvert a category than it is to make an argument for an individual item.
Imagine people accept that falling rocks kill hundreds of people every year, and you want to convince them that falling cheese also kills plenty of people.
It would be much easier to imply that cheese, often coming in large roundish lumps, counts as a type of rock. It stretches the definition a bit, but it's still much easier to argue than the falling-cheese claim that is your actual agenda.
When the definition is new it is more malleable. Sometimes you might need a qualifier to declare it is different but imply it is essentially like the other thing. It's just a dairy-rock, or just enhanced-interrogation.
This approach targets the kinds of mode collapse that we can meaningfully measure and fix after the fact, which is constrained to these verbal tics. That doesn't fix the higher-level mode collapse in semantics and creativity you're identifying, but I think fixing the verbal tics is still important and useful.
I don't. I think they're useful for flagging the existence of mode-collapse and also providing convenient tracers for AI-written prose. Erasing only the verbal tics with the equivalent of 's/ - /; /g' (look ma! no more 4o em dashes!) is about the worst solution you could come up with and if adopted would lead to a kind of global gaslighting. The equivalent of a vaccine for COVID which only suppresses coughing but doesn't change R, or fixing a compiler warning by disabling the check.
If you wanted to do useful research here, you'd be doing the opposite. You'd be figuring out how to make the verbal expressions even more sensitive to the underlying mode-collapse, to help research into fixing it and raising awareness. (This would be useful even on the released models, to more precisely quantify their overall mode-collapse, which is poorly captured by existing creative writing benchmarks, I think, and one reason I've had a hard time believing things like Eqbench rankings.)
Searle's paper calls these the questions, the script, or a story.
[1]:https://web.archive.org/web/20071210043312/http://members.ao...
> Abstract: [...] Our approach combines three innovations: (1) The Antislop Sampler, which uses backtracking to suppress unwanted strings at inference time without destroying vocabulary; (2) An automated pipeline that profiles model-specific slop against human baselines and generates training data; (3) Final Token Preference Optimization (FTPO), a novel fine-tuning method that operates on individual tokens, surgically adjusting logits wherever a banned pattern has appeared in an inference trace.
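A toy sketch of the backtracking idea as described in the abstract, not the authors' implementation: sample tokens normally, and when a banned string completes, rewind to where it began and resample with the offending token excluded. The vocabulary, slop list, and random sampler here are all stand-ins:

```python
# Toy sketch of backtracking suppression: when a banned string appears,
# rewind to where it started and exclude the token that began it there.
import random

BANNED = {"delve", "rich tapestry"}            # toy slop list
VOCAB = ["we", "explore", "into", "a", "rich", "tapestry",
         "mix", "of", "ideas", "delve"]        # toy word-level "tokens"

def banned_start(words):
    """Word index where the first banned string begins, or -1."""
    text = " ".join(words)
    hits = [text.find(b) for b in BANNED if b in text]
    if not hits:
        return -1
    return len(text[:min(hits)].split())       # char offset -> word index

def antislop_generate(n=15):
    words = []
    rejected = {}                              # position -> excluded tokens
    while len(words) < n:
        pos = len(words)
        allowed = [w for w in VOCAB if w not in rejected.get(pos, set())]
        if not allowed:
            break
        words.append(random.choice(allowed))
        i = banned_start(words)
        if i >= 0:
            # Backtrack to where the banned string began, exclude the token
            # that started it at that position, and resample from there.
            rejected.setdefault(i, set()).add(words[i])
            rejected = {p: s for p, s in rejected.items() if p <= i}
            words = words[:i]
    return " ".join(words)

print(antislop_generate())
```

The paper's actual sampler works on model logits rather than a word list, but the control flow (detect, rewind, exclude, resample) is the part the abstract describes as preserving the vocabulary instead of hard-banning tokens everywhere.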
From https://news.ycombinator.com/item?id=45546037#45585680 , an additional potential method:
>> Could build a simple heuristic: if similar memory content gets created/updated N times within short timeframe, flag it as potential loop
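A hedged sketch of that heuristic; N, the time window, and the similarity measure (stdlib difflib here) are all placeholder choices:

```python
# Flag memory content as a potential loop if near-identical entries are
# written N times within a short window. Thresholds are illustrative.
import time
from collections import deque
from difflib import SequenceMatcher

N, WINDOW_S, THRESHOLD = 3, 600.0, 0.9         # tunable placeholders
recent = deque()                               # (timestamp, content)

def similar(a: str, b: str) -> bool:
    return SequenceMatcher(None, a, b).ratio() >= THRESHOLD

def record_and_check(content: str, now: float | None = None) -> bool:
    """Record a memory write; return True if it looks like a loop."""
    now = time.time() if now is None else now
    while recent and now - recent[0][0] > WINDOW_S:
        recent.popleft()                       # drop entries past the window
    hits = sum(1 for _, past in recent if similar(past, content))
    recent.append((now, content))
    return hits + 1 >= N                       # this write plus recent twins
```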
Oof — gotcha, here’s how I’d handle that
Clutch choice — here’s a few refinements
Sweet — let me just…
Ok, here’s the receipts
I love your passion! Let’s try to keep it civil ok?
(Thinking) the user still appears annoyed
---
I think this annoys them also and yet they can’t change it? Or are they not dogfooding?