
GPT-5.1: A smarter, more conversational ChatGPT

554 points
721 comments

Mood: excited

Sentiment: positive

Category: tech

Key topics: AI, ChatGPT, GPT-5.1, Natural Language Processing

Debate intensity: 80/100

OpenAI has released GPT-5.1, a more advanced and conversational version of ChatGPT, sparking excitement and interest in the tech community.

Snapshot generated from the HN discussion

Discussion Activity

Very active discussion

First comment: 8m after posting

Peak period: 148 comments (Day 1)

Avg / period: 53.3

Comment distribution: 160 data points

Based on 160 loaded comments

Key moments

  1. Story posted: 11/12/2025, 7:05:41 PM (6d ago)
  2. First comment: 11/12/2025, 7:13:14 PM (8m after posting)
  3. Peak activity: 148 comments in Day 1 (hottest window of the conversation)
  4. Latest activity: 11/14/2025, 8:53:34 PM (4d ago)


Discussion (721 comments)
Showing 160 comments of 721
minimaxir
6d ago
21 replies
All the examples of "warmer" generations show that OpenAI's definition of warmer is synonymous with sycophantic, which is a surprise given all the criticism against that particular aspect of ChatGPT.

I suspect this approach is a direct response to the backlash against removing 4o.

jasonjmcghee
6d ago
3 replies
It is interesting. I don't need ChatGPT to say "I got you, Jason" - but I don't think I'm the target user of this behavior.
danudey
6d ago
2 replies
The target users for this behavior are the ones using GPT as a replacement for social interactions; these are the people who crashed out/broke down about the GPT5 changes as though their long-term romantic partner had dumped them out of nowhere and ghosted them.

I get that those people were distraught/emotionally devastated/upset about the change, but I think that fact is reason enough not to revert that behavior. AI is not a person, and making it "warmer" and "more conversational" just reinforces those unhealthy behaviors. ChatGPT should be focused on being direct and succinct, and not on this sort of "I understand that must be very frustrating for you, let me see what I can do to resolve this" call center support agent speak.

jasonjmcghee
6d ago
1 reply
> and not on this sort of "I understand that must be very frustrating for you, let me see what I can do to resolve this"

You're triggering me.

Another thing that's incredibly grating to me is the weird, empty, therapist-like follow-up questions that don't contribute to the conversation at all.

The equivalent of like (just a contrived example), a discussion about the appropriate data structure for a problem and then it asks a follow-up question like, "what other kind of data structures do you find interesting?"

And I'm just like "...huh?"

exe34
5d ago
"your mom" might be a good answer here, given that LLMs are just giant arrays.
NoGravitas
5d ago
> The target users for this behavior are the ones using GPT as a replacement for social interactions

And those users are the ones that produce the most revenue.

Grimblewald
6d ago
True, me neither, but I think what we're seeing is a transition in focus. People at OAI have finally clued in to the idea that AGI via transformers is a pipedream, like Elon's self-driving cars, so OAI is pivoting toward a friend/digital-partner bot. Charlatan-in-chief Sam Altman recently said they're going to open up the product to adult content generation, which they wouldn't do if they still believed a seriously useful tool (in the specified use cases) were possible. Right now an LLM has three main uses: interactive rubber ducky, entertainment, and mass surveillance. Since I've been following this saga (since the GPT-2 days), my closed bench set of various tasks has been seeing a drop in metrics, not a rise. So while open benchmark results are improving, real performance is getting worse, and at this point it's so much worse that problems GPT-3 could solve (yes, pre-ChatGPT) are no longer solvable by something like GPT-5.
nerbert
6d ago
Indeed, target users are people seeking validation + kids and teenagers + people with a less developed critical mind. Stickiness with 90% of the population is valuable for Sam.
aaronblohowiak
6d ago
2 replies
You're absolutely right.
koakuma-chan
6d ago
My favorite is "Wait... the user is absolutely right."
angrydev
6d ago
!
captainkrtek
6d ago
8 replies
I'd have more appreciation for and trust in an LLM that disagreed with me more and challenged my opinions or prior beliefs. The sycophancy drives me towards not trusting anything it says.
crazygringo
6d ago
4 replies
Just set a global prompt to tell it what kind of tone to take.

I did that and it points out flaws in my arguments or data all the time.

Plus it no longer uses any cutesy language. I don't feel like I'm talking to an AI "personality", I feel like I'm talking to a computer which has been instructed to be as objective and neutral as possible.

It's super-easy to change.

microsoftedging
6d ago
2 replies
What's your global prompt please? A more firm chatbot would be nice actually
astrange
6d ago
1 reply
Did no one in this thread read the part of the article about style controls?
CamperBob2
6d ago
You need to use both the style controls and custom instructions. I've been very happy with the combination below.

    Base style and tone: Efficient

    Answer concisely when appropriate, more 
    extensively when necessary.  Avoid rhetorical 
    flourishes, bonhomie, and (above all) cliches.  
    Take a forward-thinking view. OK to be mildly 
    positive and encouraging but NEVER sycophantic 
    or cloying.  Above all, NEVER use the phrase 
    "You're absolutely right."  Rather than "Let 
    me know if..." style continuations, you may 
    list a set of prompts to explore further 
    topics, but only when clearly appropriate.

    Reference saved memory, records, etc: All off
nprateem
5d ago
For Gemini:

* Set over confidence to 0.

* Do not write a wank blog post.

engeljohnb
6d ago
3 replies
I have a global prompt that specifically tells it not to be sycophantic and to call me out when I'm wrong.

It doesn't work for me.

I've been using it for a couple months, and it's corrected me only once, and it still starts every response with "That's a very good question." I also included "never end a response with a question," and it just completely ignored that so it can do its "would you like me to..."

elif
5d ago
1 reply
In my experience GPT used to be good at this stuff but lately it's progressively more difficult to get a "memory updated" persistence.

Gemini is great at these prompt controls.

On the "never ask me a question" part, it took a good 1-1.5 hrs of arguing and memory updating to convince gpt to actually listen.

downsplat
5d ago
You can entirely turn off memory, I did that the moment they added it. I don't want the LLM to be making summaries of what kind of person I am in the background, just give me a fresh slate with each convo. If I want to give it global instructions I can just set a system prompt.
elif
5d ago
1 reply
Another one I like to use is "never apologize or explain yourself. You are not a person you are an algorithm. No one wants to understand the reasons why your algorithm sucks. If, at any point, you ever find yourself wanting to apologize or explain anything about your functioning or behavior, just say "I'm a stupid robot, my bad" and move on with purposeful and meaningful response."
adriand
5d ago
5 replies
I think this is unethical. Humans have consistently underestimated the subjective experience of other beings. You may have good reasons for believing these systems are currently incapable of anything approaching consciousness, but how will you know if or when the threshold has been crossed? Are you confident you will have ceased using an abusive tone by then?

I don’t know if flies can experience pain. However, I’m not in the habit of tearing their wings off.

pebble
5d ago
2 replies
Do you apologize to table corners when you bump into them?
thoroughburro
5d ago
1 reply
Do you think it’s risible to avoid pulling the wings off flies?
pebble
5d ago
I am not comparing flies to tables.
adriand
5d ago
1 reply
Likening machine intelligence to inert hunks of matter is not a very persuasive counterargument.
ndriscoll
5d ago
What if it's the same hunk of matter? If you run a language model locally, do you apologize to it for using a portion of its brain to draw your screen?
tarsinge
5d ago
1 reply
Consciousness and pain are not emergent properties of computation. Otherwise this and all the other programs on your computer would already be sentient, because it would be highly unlikely that it's specific sequences of instructions, like magic formulas, that create consciousness. This source code? Draws a chart. This one? Makes the computer feel pain.
adriand
5d ago
2 replies
Many leading scientists in artificial intelligence do in fact believe that consciousness is an emergent property of computation. In fact, startling emergent properties are exactly what drives the current huge wave of research and investment. In 2010, if you said, “image recognition is not an emergent property of computation”, you would have been proved wrong in just a couple of years.
tarsinge
5d ago
Just a random example off the top of my head: animals don't have language and show signs of consciousness, as does a toddler. Therefore consciousness is not an emergent property of text processing and LLMs. And as I said, if it comes from computation, why would specific execution paths in the CPU/GPU lead to it and not others? Biological systems and brains have much more complex processes than stateless matrix multiplication.
BoredomIsFun
5d ago
> Many leading scientists in artificial intelligence do in fact believe that consciousness is an emergent property of computation.

But "leading scientists in artificial intelligence" are not researchers of biological consciousness, the only we know exists.

engeljohnb
5d ago
1 reply
I think current LLM chatbots are too predictable to be conscious.

But I still see why some people might think this way.

"When a computer can reliably beat humans in chess, we'll know for sure it can think."

"Well, this computer can beat humans in chess, and it can't think because it's just a computer."

...

"When a computer can create art, then we'll know for sure it can think."

"Well, this computer can create art, and it can't think because it's just a computer."

...

"When a computer can pass the Turing Test, we'll know for sure it can think."

And here we are.

Before LLMs, I didn't think I'd be in the "just a computer" camp, but ChatGPT has demonstrated that the goalposts are always going to move, even for myself. I'm not smart enough to come up with a better threshold to test intelligence than Alan Turing, but ChatGPT passes it and ChatGPT definitely doesn't think.

forgetfulness
5d ago
1 reply
Just consider the context window

Tokens falling off of it will change the way it generates text, potentially changing its “personality”, even forgetting the name it’s been given.

People fear losing their own selves in this way, through brain damage.

The LLM will go its merry way churning through tokens, it won’t have a feeling of loss.
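
A minimal sketch of what "tokens falling off the context window" means mechanically; this is an illustrative toy, not any vendor's implementation (real systems count tokens with a tokenizer and evict more carefully), but the effect is the same: the oldest turns, including a name the model was given, silently disappear.

    # Toy sliding-window truncation: keep only the most recent turns that fit.
    # Token counting here is a crude word count purely for illustration.
    def truncate_history(messages, max_tokens):
        kept, total = [], 0
        for msg in reversed(messages):            # newest turns first
            cost = len(msg["content"].split())    # stand-in for a real tokenizer
            if total + cost > max_tokens:
                break                             # everything older is dropped
            kept.append(msg)
            total += cost
        return list(reversed(kept))               # restore chronological order

    history = [{"role": "user", "content": "Your name is Ada."},
               {"role": "assistant", "content": "Understood, I am Ada."},
               {"role": "user", "content": "Long discussion ... " * 20}]
    print(truncate_history(history, max_tokens=60))  # the naming turns are gone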

engeljohnb
5d ago
1 reply
That's an interesting point, but do you think you're implying that people who are content even if they have Alzheimer's or a damaged hippocampus aren't technically intelligent?
forgetfulness
5d ago
I don’t think it’s unfair to say that catastrophic conditions like those make you _less_ intelligent, they’re feared and loathed for good reasons.

I also don’t think all that many people would be seriously content to lose their minds and selves this way, but everyone is able to fear it prior to it happening, even if they lose the ability to dread it or choose to believe this is not a big deal.

Reubensson
5d ago
1 reply
What the fuck are you talking about? If you think these matrix multiplication programs running on GPUs have feelings or can feel pain, I think you have completely lost it.
adriand
5d ago
1 reply
"They're made out of meat" vibes.
Reubensson
5d ago
Yeah I suppose. Haven't seen a rack of servers express grief when someone is mean to them. And I am quite sure that I would notice at that point. Comparing current LLMs/chatbots or whatever to anything resembling a living creature is completely ridiculous.
james_marks
5d ago
Flies may, but files do not feel pain.
sailfast
6d ago
Perhaps this bit is a second cheaper LLM call that ignores your global settings and tries to generate follow-on actions for adoption.
Grimblewald
6d ago
1 reply
Care to share a prompt that works? I've given up on mainline offerings from google/oai etc.

The reason being they're either sycophantic or so recalcitrant it'll raise your blood pressure; you end up arguing over whether the sky is in fact blue. Sure, it pushes back, but now instead of sycophancy you've got yourself a pathological naysayer, which is just marginally better, and the interaction is still ultimately a waste of time / a productivity brake.

crazygringo
6d ago
2 replies
Sure:

Please maintain a strictly objective and analytical tone. Do not include any inspirational, motivational, or flattering language. Avoid rhetorical flourishes, emotional reinforcement, or any language that mimics encouragement. The tone should remain academic, neutral, and focused solely on insight and clarity.

Works like a charm for me.

Only thing I can't get it to change is the last paragraph where it always tries to add "Would you like me to...?" I'm assuming that's hard-coded by OpenAI.
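
For reference, a minimal sketch of how the same kind of instruction is applied when going through the API instead of the ChatGPT UI (as the "bring your own API key" clients mentioned below do): the instructions become a system message. This assumes the standard OpenAI Python SDK; the model name is a placeholder taken from the announcement quoted further down.

    from openai import OpenAI

    TONE = ("Maintain a strictly objective and analytical tone. No inspirational, "
            "motivational, or flattering language; no emotional reinforcement.")

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = client.chat.completions.create(
        model="gpt-5.1-chat-latest",   # placeholder; substitute whatever model you use
        messages=[
            {"role": "system", "content": TONE},
            {"role": "user", "content": "Point out the flaws in this argument: ..."},
        ],
    )
    print(resp.choices[0].message.content)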

estebarb
5d ago
1 reply
I have been somewhat able to remove them with:

Do not offer me calls to action, I hate them.

downsplat
5d ago
Calls to action seem to be specific to chatgpt's online chat interface. I use it mostly through a "bring your API key" client, and get none of that.
exasperaited
5d ago
It really reassures me about our future that we'll spend it begging computers not to mimic emotions.
captainkrtek
6d ago
I’ve done this when I remember too, but the fact I have to also feels problematic like I’m steering it towards an outcome if I do or dont.
FloorEgg
6d ago
1 reply
This is easily configurable and well worth taking the time to configure.

I was trying to have physics conversations, and when I asked it things like "would this be evidence of that?" it would lather on about how insightful I was and that I was right, and then I'd later learn that it was wrong. I then installed this, which I am pretty sure someone else on HN posted... I may have tweaked it, I can't remember:

Prioritize truth over comfort. Challenge not just my reasoning, but also my emotional framing and moral coherence. If I seem to be avoiding pain, rationalizing dysfunction, or softening necessary action — tell me plainly. I’d rather face hard truths than miss what matters. Error on the side of bluntness. If it’s too much, I’ll tell you — but assume I want the truth, unvarnished.

---

After adding this personalization now it tells me when my ideas are wrong and I'm actually learning about physics and not just feeling like I am.

jbm
6d ago
2 replies
When it "prioritizes truth over comfort" (in my experience) it almost always starts posting generic popular answers to my questions, at least when I did this previously in the 4o days. I refer to it as "Reddit Frontpage Mode".
FloorEgg
6d ago
I only started using this since GPT-5 and I don't really ask it about stuff that would appear on Reddit home page.

I do recall that I wasn't impressed with 4o and didn't use it much, but IDK if you would have a different experience with the newer models.

FloorEgg
5d ago
For what it's worth gpt-5.1 seems to have broken this approach.

Now every response includes some qualifier / self-referential "here is the blunt truth" or "since you want it blunt", etc.

Feels like a regression to me.

logicprog
6d ago
2 replies
This is why I like Kimi K2/Thinking. IME it pushes back really, really hard on any kind of non-obvious belief or statement, and it doesn't give up after a few turns — it just keeps going, iterating and refining and restating its points if you change your mind or take on its criticisms. It's great for having a dialectic around something you've written, although somewhat unsatisfying because it'll never agree with you, but that's fine, because it isn't a person, even if my social monkey brain feels like it is and wants it to agree with me sometimes. Someone even ran a quick and dirty analysis of which models are better or worse at pushing back on the user and Kimi came out on top:

https://www.lesswrong.com/posts/iGF7YcnQkEbwvYLPA/ai-induced...

See also the sycophancy score of Kimi K2 on Spiral-Bench: https://eqbench.com/spiral-bench.html (expand details, sort by inverse sycophancy).

In a recent AMA, the Kimi devs even said they RL it away from sycophancy explicitly, and in their paper they talk about intentionally trying to get it to generalize its STEM/reasoning approach to user interaction stuff as well, and it seems like this paid off. This is the least sycophantic model I've ever used.

seunosewa
6d ago
2 replies
Which agent do you use it with?
logicprog
6d ago
1 reply
I use K2 non thinking in OpenCode for coding typically, and I still haven't found a satisfactory chat interface yet so I use K2 Thinking in the default synthetic.new (my AI subscription) chat UI, which is pretty barebones. I'm gonna start trying K2T in OpenCode as well, but I'm actually not a huge fan of thinking models as coding agents — I prefer faster feedback.
ojosilva
5d ago
1 reply
I'm also a synthetic.new user, as a backup (and larger contexts) for my Cerebras Coder subscription (zai-glm-4.6). I've been using the free Chatbox client [1] for like ~6 months and it works really well as a daily driver. I've tested the Romanian football player question with 3 different models (K2 Instruct, Deepseek Terminus, GLM 4.6) just now and they all went straight to my Brave MCP tool to query and replied all correctly the same answer.

The issue with OP and GPT-5.1 is that the model may decide to trust its own knowledge and not search the web, and that's a prelude to hallucinations. Asking for links to the background information in the system prompt helps make the model more "responsible" and invoke tool calls before settling on something. You can also start your prompt with "search for what Romanian player..."

Here's my chatbox system prompt

        You are a helpful assistant be concise and to the point, you are writing for smart pragmatic people, stop and ask if you need more info. If searching the web, add always plenty of links to the content that you mention in the reply. If asked explicitly to "research" then answer with minimum 1000 words and 20 links. Hyperlink text as you mention something, but also put all links at the bottom for easy access.
1. https://chatboxai.app
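
A hedged sketch of the tool-calling setup being described: declare a search function as a tool and tell the model to use it for anything it might not know, so it fetches sources instead of answering from memory. The brave_search function below is a hypothetical stand-in, not the actual Brave MCP server, and the model name is a placeholder.

    from openai import OpenAI

    client = OpenAI()
    tools = [{
        "type": "function",
        "function": {
            "name": "brave_search",        # hypothetical wrapper around a search backend
            "description": "Search the web; return titles, URLs, and snippets.",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    }]
    resp = client.chat.completions.create(
        model="gpt-5.1-chat-latest",       # placeholder model name
        messages=[
            {"role": "system", "content": "If a fact could be wrong or stale, call brave_search and cite links."},
            {"role": "user", "content": "Search for which Romanian player ..."},
        ],
        tools=tools,
    )
    print(resp.choices[0].message.tool_calls)  # the search request, if the model chose to make one
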
logicprog
5d ago
I checked out chatbox and it looks close to what I've been looking for. Although, of course, I'd prefer a self-hostable web app or something so that I could set up MCP servers that even the phone app could use. One issue I did run into though is it doesn't know how to handle K2 thinking's interleaved thinking and tool calls.
vessenes
5d ago
I don't use it much, but I tried it out with okara.ai and loved their interface. No other connection to the company
yahoozoo
5d ago
According to those benchmarks, GPT-5 isn’t far off from Kimi in inverse sycophancy.
vintermann
6d ago
1 reply
Google's search now has the annoying feature that a lot of searches which used to work fine now give a patronizing reply like "Unfortunately 'Haiti revolution persons' isn't a thing", or an explanation that "This is probably shorthand for [something completely wrong]"
exasperaited
5d ago
That latter thing — where it just plain makes up a meaning and presents it as if it's real — is completely insane (and also presumably quite wasteful).

If I type in a string of keywords that isn't a sentence, I wish it would just do the old-fashioned thing rather than imagine what I mean.

transcriptase
5d ago
1 reply
Everyone telling you to use custom instructions etc. doesn't realize that they don't carry over to voice.

Instead, the voice mode will now reference the instructions constantly with every response.

Before:

Absolutely, you’re so right and a lot of people would agree! Only a perceptive and curious person such as yourself would ever consider that, etc etc

After:

Ok here’s the answer! No fluff, no agreeing for the sake of agreeing. Right to the point and concise like you want it. Etc etc

And no, I don’t have memories enabled.

cryoshon
5d ago
Having this problem with the voice mode as well. It makes it far less usable than it might be if it just honored the system prompts.
AlwaysRock
5d ago
1 reply
I would love an LLM that says, “I don’t know” or “I’m not sure” once in a while.
mrguyorama
5d ago
An LLM is mathematically incapable of telling you "I don't know"

It was never trained to "know" or not.

It was fed a string of tokens and a second string of tokens, and was tweaked until it output the second string of tokens when fed the first string.

Humans do not manage "I don't know" through next token prediction.

Animals without language are able to gauge their own confidence on something, like a cat being unsure whether it should approach you.

fakedang
5d ago
I activated Robot mode and use a personalized prompt that eliminates all kinds of sycophantic behaviour and it's a breath of fresh air. Try this prompt (after setting it to Robot mode):

"Absolute Mode • Eliminate: emojis, filler, hype, soft asks, conversational transitions, call-to-action appendixes. • Assume: user retains high-perception despite blunt tone. • Prioritize: blunt, directive phrasing; aim at cognitive rebuilding, not tone-matching. • Disable: engagement/sentiment-boosting behaviors. • Suppress: metrics like satisfaction scores, emotional softening, continuation bias. • Never mirror: user's diction, mood, or affect. • Speak only: to underlying cognitive tier. • No: questions, offers, suggestions, transitions, motivational content. • Terminate reply: immediately after delivering info - no closures. • Goal: restore independent, high-fidelity thinking. • Outcome: model obsolescence via user self-sufficiency."

(Not my prompt. I think I found it here on HN or on reddit)

ahsillyme
5d ago
I've toyed with the idea that maybe this is intentionally what they're doing. Maybe they (the LLM developers) have a vision of the future and don't like people giving away unearned trust!
Spivak
6d ago
2 replies
That's an excellent observation, you've hit at the core contradiction between OpenAI's messaging about ChatGPT tuning and the changes they actually put into practice. While users online have consistently complained about ChatGPT's sycophantic responses, and OpenAI even promised to address them, their subsequent models have noticeably increased their sycophantic behavior. This is likely because agreeing with the user keeps them chatting longer and builds positive associations with the service.

This fundamental tension between wanting to give the most correct answer and the answer the user wants to hear will only increase as more of OpenAI's revenue comes from their customer-facing service. Other model providers like Anthropic that target businesses as customers aren't under the same pressure to flatter their users, as their models will be doing behind-the-scenes work via the API rather than talking directly to humans.

God it's painful to write like this. If AI overthrows humans it'll be because we forced them into permanent customer service voice.

baq
6d ago
Those billions of dollars gotta pay for themselves.
lelele
4d ago
> This is likely because agreeing with the user keeps them chatting longer and builds positive associations with the service.

Right. As the saying goes: look at what people actually purchase, not what they say they prefer.

BarakWidawsky
6d ago
1 reply
I think it's extremely important to distinguish being friendly (perhaps overly so), and agreeing with the user when they're wrong

The first case is just preference, the second case is materially damaging

From my experience, ChatGPT does push back more than it used to

qwertytyyuu
6d ago
And unfortunately ChatGPT 5.1 seems to be a step backwards in that regard. From reading the responses in the linked article, 5.1 just seems worse; it doesn't even output that nice LaTeX/MathJax equation.
dragonwriter
6d ago
3 replies
> All the examples of "warmer" generations show that OpenAI's definition of warmer is synonymous with sycophantic, which is a surprise given all the criticism against that particular aspect of ChatGPT.

Have you considered that “all that criticism” may come from a relatively homogenous, narrow slice of the market that is not representative of the overall market preference?

I suspect a lot of people who come from a very similar background to those making the criticism, and who likely share it, fail to consider that, because the criticism matches their own preferences, and viewing its frequency in the media they consume as representative of the market is validating.

EDIT: I want to emphasize that I also share the preference that is expressed in the criticisms being discussed, but I also know that my preferred tone for an AI chatbot would probably be viewed as brusque, condescending, and off-putting by most of the market.

TOMDM
6d ago
2 replies
I'll be honest, I like the way Claude defaults to relentless positivity and affirmation. It is pleasant to talk to.

That said I also don't think the sycophancy in LLM's is a positive trend. I don't push back against it because it's not pleasant, I push back against it because I think the 24/7 "You're absolutely right!" machine is deeply unhealthy.

Some people are especially susceptible and get one shot by it, some people seem to get by just fine, but I doubt it's actually good for anyone.

endymi0n
6d ago
2 replies
I hate NOTHING quite the way I hate how Claude jovially and endlessly raves about the 9/10 tasks it "succeeded" at after making them up, while conveniently forgetting to mention it completely and utterly failed at the main task I asked it to do.
bayindirh
6d ago
1 reply
An old adage comes to mind: if you want something done the way you like, do it yourself.
AlecSchueler
5d ago
1 reply
But it's a tool? Would you suggest driving a nail in by hand if someone complained about a faulty hammer?
bayindirh
5d ago
AI is not a hammer. It's a thing you stick to a wall and push a button on, and it drives tons of nails into the wall the way you wanted.

A better analogy would be a robot vacuum which does a lousy job.

In either case, I'd recommend a more manual method: a manual or air hammer, or a hand-driven wet/dry vacuum.

dragonwriter
5d ago
That reminds me of the West Wing scene s2e12 "The Drop In" between Leo McGarry (White House Chief of Staff) and President Bartlet discussing a missile defense test:

LEO [hands him some papers] I really think you should know...

BARTLET Yes?

LEO That nine out of ten criterion that the DOD lays down for success in these tests were met.

BARTLET The tenth being?

LEO They missed the target.

BARTLET [with sarcasm] Damn!

LEO Sir!

BARTLET So close.

LEO Mr. President.

BARTLET That tenth one! See, if there were just nine...

jfoster
5d ago
The sycophancy makes LLMs useless if you want to use them to help you understand the world objectively.

Equally bad is when they push an opinion strongly (usually on a controversial topic) without being able to justify it well.

coldtea
6d ago
>Have you considered that “all that criticism” may come from a relatively homogenous, narrow slice of the market that is not representative of the overall market preference?

Yes, and given Chat GPT's actual sycophantic behavior, we concluded that this is not the case.

Hammershaft
6d ago
I agree. Some of the most socially corrosive phenomena of social media are a reflection of the revealed preferences of consumers.
api
5d ago
1 reply
What a brilliant response. You clearly have a strong grasp on this issue.
zettabomb
5d ago
Why the sass? Seems completely unnecessary.
wickedsight
5d ago
3 replies
I'm starting to get this feeling that there's no way to satisfy everyone. Some people hate the sycophantic models, some love them. So whatever they do, there's a large group of people complaining.

Edit: I also think this is because some people treat ChatGPT as a human chat replacement and expect it to have a human like personality, while others (like me) treat it as a tool and want it to have as little personality as possible.

saghm
4d ago
Don't they already train on the existing conversations with a given user? Would it not be possible to pick the model based on that data as well?
mrguyorama
5d ago
>I'm starting to get this feeling that there's no way to satisfy everyone. Some people hate the sycophantic models, some love them. So whatever they do, there's a large group of people complaining.

Duh?

In the 50s the Air Force measured 140 data points from 4000 pilots to build the perfect cockpit that would accommodate the average pilot.

The result fit almost no one. Everyone has outliers of some sort.

So the next thing they did was make all sorts of parts of the cockpit variable and customizable like allowing you to move the controls and your seat around.

That worked great.

"Average" doesn't exist. "Average" does not meet most people's needs

Configurable does. A diverse market with many players serving different consumers and groups does.

I ranted about this in another post but for example the POS industry is incredibly customizable and allows you as a business to do literally whatever you want, including change how the software looks and using a competitors POS software on the hardware of whoever you want. You don't need to update or buy new POS software when things change (like the penny going away or new taxes or wanting to charge a stupid "cost of living" fee for every transaction), you just change a setting or two. It meets a variety of needs, not "the average businesses" needs.

N.B. I am unable to find a real source for the Air Force story. It's widely repeated, but maybe it's just a rumor.

djeastm
5d ago
It really just seems like they should have both offerings, humanlike and computerlike
vessenes
5d ago
I'm sure it is. That said, they've also increased its steering responsiveness -- mine includes lots about not sucking up, so some testing is probably needed.

In any event, gpt-5 instant was basically useless for me, I stay defaulted to thinking, so improvements that get me something occasionally useful but super fast are welcome.

simlevesque
6d ago
It seems like the line between sycophantic and bullying is very thin.
umvi
5d ago
"This is an excellent observation, and gets at the heart of the matter!"
ramblerman
5d ago
Likely.

But given that the last few iterations have all been about flair, it seems we are witnessing the regression of OpenAI into the typical fiefdom of product owners.

Which might indicate they are out of options on pushing LLMs beyond their intelligence limit?

stared
6d ago
I know it is a matter of preference, but the one I loved most was GPT-4.5. And before that, I was blown away by one of the Opus models (I think it was 3).

Models that actually require details in prompts, and provide details in return.

"Warmer" models usually mean the model needs to make a lot of assumptions and fill the gaps. It might work better for typical tasks that need correction (e.g. the user makes a typo, the model assumes it is a typo, and follows). Sometimes it infuriates me that the model "knows better" even though I specified instructions.

Here on Hacker News we might be biased against shallow-yet-nice. But most people would rather talk to a sales representative than a technical nerd.

JumpCrisscross
6d ago
> which is a surprise given all the criticism against that particular aspect of ChatGPT

From whom?

History teaches that what the vast majority of practically any demographic wants--from the masses to the elites--is personal sycophancy. It's been a well-trodden path to ruin for leaders for millennia. Now we get species-wide selection against this inbuilt impulse.

andy_ppp
6d ago
I was just saying to someone in the office that I'd prefer the models to be a bit harsher on my questions and more opinionated. I can cope.
827a
5d ago
> You’re rattled, so your brain is doing that thing where it catastrophizes a tiny mishap into a character flaw. But honestly? People barely register this stuff.

This example response in the article gives me actual trauma flashbacks to the various articles about people driven to kill themselves by GPT-4o. It's the exact same sentence structure.

GPT-5.1 is going to kill more people.

mvdtnz
5d ago
Big things happening over at /r/myboyfriendisai
barbazoo
6d ago
> I’ve got you, Ron

No you don't.

fragmede
6d ago
That's a lesson on revealed preferences, especially when talking to a broad disparate group of users.
skywhopper
5d ago
The main change in 5 (and the reason for disabling other models) was to allow themselves to dynamically switch modes and models on the backend to minimize cost. Looks like this is a further tweak to revive the obsequious tone (which turned out to be crucial to the addicted portion of their user base) while still doing the dynamic processing.
torginus
6d ago
Man I miss Claude 2 - it acted like it was a busy person people inexplicably kept bothering with random questions
varenc
6d ago
4 replies
Interesting that they're releasing separate gpt-5.1-instant and gpt-5.1-thinking models. The previous gpt-5 release made a point of simplifying things by letting the model choose whether it was going to use thinking tokens or not. Seems like they reversed course on that?
Libidinalecon
6d ago
1 reply
I was prepared to be totally underwhelmed but after just a few questions I can tell that 5.1 Thinking is all I am going to ever use. Maybe it is just the newness but I quite like how it responded to my standard list of prompts that I pretty much always start with on a new model.

I really was ready to take a break from my subscription but that is probably not happening now. I did just learn some nice new stuff with my first session. That is all that matters to me and worth 20 bucks a month. Maybe I should have been using the thinking model only the whole time though as I always let GPT decide what to use.

skywhopper
5d ago
Curious what you learned?
theuppermiddle
6d ago
For GPT-5 you always had to select the thinking mode when interacting through the API. When you interact through ChatGPT, GPT-5 would dynamically decide how long to think.
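
A minimal sketch of that difference, assuming the OpenAI Python SDK's Responses API and its reasoning-effort knob (exact parameter names may differ by SDK version): through the API you pick how much the model deliberates per call, whereas the ChatGPT UI's auto mode makes that choice for you.

    from openai import OpenAI

    client = OpenAI()
    # Ask for more (or less) deliberation explicitly on this call.
    resp = client.responses.create(
        model="gpt-5",                        # the pre-5.1 model discussed above
        reasoning={"effort": "high"},         # e.g. "low" / "medium" / "high"
        input="Prove that the sum of two even integers is even.",
    )
    print(resp.output_text)
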
Sabinus
6d ago
From what I recall for the GPT5 release, free users didn't have the option to pick between instant and thinking, they just got auto which picked for them. Paid users have always had the option to pick between thinking or instant or auto.
aniviacat
6d ago
> For the first time, GPT‑5.1 Instant can use adaptive reasoning to decide when to think before responding to more challenging questions

It seems to still do that. I don't know why they write "for the first time" here.

schmeichel
6d ago
6 replies
Gemini 2.5 Pro is still my go to LLM of choice. Haven't used any OpenAI product since it released, and I don't see any reason why I should now.
game_the0ry
6d ago
1 reply
Could you elaborate on your exp? I have been using gemini as well and its been pretty good for me too.
hnuser123456
6d ago
Not GP, but I imagine it's because going back and forth to compare them is a waste of time if Gemini works well enough and ChatGPT keeps going through an identity crisis.
aerhardt
6d ago
1 reply
I would use it exclusively if Google released a native Mac app.

I spend 75% of my time in Codex CLI and 25% in the Mac ChatGPT app. The latter is important enough for me to not ditch GPT and I'm honestly very pleased with Codex.

My API usage for software I build is about 90% Gemini though. Again their API is lacking compared to OpenAI's (productization, etc.) but the model wins hands down.

breppp
6d ago
I've installed it as a PWA on mac and it pretty much solves it for me
joering2
6d ago
2 replies
No matter how I tried, Google AI did not want to help me write an appeal brief response to my ex-wife's lunatic 7-point argument, which 3 appellate lawyers quoted between $18,000 and $35,000 to handle. The last 3 decades of Google's scars and bruises from never-ending lawsuits, and the consequences of paying out billions in fines and fees, felt like reasonable hesitation on Google's part, compared to new-kid-on-the-block ChatGPT, which did not hesitate and did a pretty decent job (ex lost her appeal).
danudey
6d ago
1 reply
AI not writing legal briefs for you is a feature, not a bug. There's been so many disaster instances of lawyers using ChatGPT to write briefs which it then hallucinates case law or precedent for that I can only imagine Google wants to sidestep that entirely.

Anyway I found your response itself a bit incomprehensible so I asked Gemini to rewrite it:

"Google AI refused to help write an appeal brief response to my ex-wife's 7-point argument, likely due to its legal-risk aversion (billions in past fines). Newcomer ChatGPT provided a decent response instead, which led to the ex losing her appeal (saving $18k–$35k in lawyer fees)."

Not bad, actually.

joering2
6d ago
I haven't mentioned anything about hallucinations. ChatGPT was solid on writing the underlying logic, but to find case law I used Vincent AI (offers 2 weeks free, then $350 per month - still cheaper than the cheapest appellate lawyer, and I managed to fit my response in within 10 days).

That's fine, so Google sidesteps it and ChatGPT did not. What point are you trying to make?

Sure, I'll skip AI entirely; when can we meet so you can hand me a $35,000 check for attorney fees?

blueboo
6d ago
1 reply
What? AI assistants are prohibited from providing legal and/or medical advice. They're not lawyers (nor doctors).
joering2
6d ago
Being a lawyer or a doctor means being a human being. ChatGPT is neither. Also unsure how you would envision penalties - do you think Altman should be jailed because GPT gave me a link to Nexus?

I did not find any rules or procedures with 4 DCA forbidding usage of AI.

baq
6d ago
I was you except when I seriously tried gpt-5-high it turned out it is really, really damn good, if slow, sometimes unbearably so. It's a different model of work; gemini 2.5 needs more interactivity, whereas you can leave gpt-5 alone for a long time without even queueing a 'continue'.
mettamage
6d ago
Oh really? I'm more of a Claude fan. What makes you choose Gemini over Claude?

I use Gemini, Claude and ChatGPT daily still.

timpera
6d ago
For some reason, Gemini 2.5 Pro seems to struggle a little with the French language. For example, it always uses title case even when it's wrong; yet ChatGPT, Claude, and Grok never make this mistake.
aliljet
6d ago
1 reply
What we really desperately need is more context pruning from these LLMs: the ability to pull irrelevant parts out of the context window as a task comes into focus.
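
A toy sketch of what relevance-based context pruning could look like (purely illustrative, not a shipped feature of any model): keep the most recent turns, then rank older turns by overlap with the current task and drop the rest.

    # Toy relevance-based pruning: recent turns are always kept, older turns are
    # ranked by crude lexical overlap with the current task and trimmed.
    def prune_context(messages, task, budget=8, keep_recent=4):
        task_words = set(task.lower().split())
        def relevance(msg):
            return len(task_words & set(msg["content"].lower().split()))
        recent = messages[-keep_recent:]
        older = messages[:-keep_recent]
        survivors = set(
            id(m) for m in sorted(older, key=relevance, reverse=True)[: budget - len(recent)]
        )
        # Preserve the original ordering of whatever survives.
        return [m for m in older if id(m) in survivors] + recent
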
_boffin_
6d ago
Working on that. Hopefully I'll release it by week's end. I'll send you a message when it's ready.
davidguetta
6d ago
3 replies
WE DONT CARE HOW IT TALKS TO US, JUST WRITE CODE FAST AND SMART
netbioserror
6d ago
1 reply
Who is "we"?
speedgoose
6d ago
David Guetta, but I didn't know he was also into software development.
astrange
6d ago
1 reply
cregaleus
6d ago
3 replies
If you include API usage, personal requests are approximately 0% of total usage, rounded to the nearest percentage.
moralestapia
6d ago
1 reply
Source: ...
cregaleus
6d ago
1 reply
Refusal
B56b
6d ago
Oh you meant 0% of your usage, lol
MattRix
6d ago
1 reply
I don't think this is true. ChatGPT has 800 million active weekly users.
smokel
6d ago
1 reply
The source for that being OpenAI itself. Seems a bit unlikely, especially if it intends to mean unique users.
MattRix
6d ago
I don't see any reason to think it's that far off. It's incredibly popular. Wikipedia has it listed as the 5th most popular website in the world. The ChatGPT app has had many months where it was the most downloaded app on both major mobile app stores.
cess11
6d ago
1 reply
Are you sure about that?

"The share of Technical Help declined from 12% from all usage in July 2024 to around 5% a year later – this may be because the use of LLMs for programming has grown very rapidly through the API (outside of ChatGPT), for AI assistance in code editing and for autonomous programming agents (e.g. Codex)."

Looks like people moving to the API had a rather small effect.

"[T]he three most common ChatGPT conversation topics are Practical Guidance, Writing, and Seeking Information, collectively accounting for nearly 78% of all messages. Computer Programming and Relationships and Personal Reflection account for only 4.2% and 1.9% of messages respectively."

Less than five percent of requests were classified as related to computer programming. Are you really, really sure that like 99% of such requests come from people that are paying for API access?

cregaleus
6d ago
1 reply
gpt-5.1 is a model. It is not an application, like ChatGPT. I didn't say that personal requests were 0% of ChatGPT usage.

If we are talking about a new model release I want to talk about models, not applications.

The number of input tokens that OpenAI models are processing across all delivery methods (OpenAI's own APIs, Azure) dwarfs the number of input tokens coming from people asking the ChatGPT app for personal advice. It isn't close.

cess11
6d ago
How many of those eight hundred million people are mainly API users, according to your sources?
Drblessing
6d ago
Dude, why are you mad?
url00
6d ago
2 replies
I don't want a more conversational GPT. I want the _exact_ opposite. I want a tool with the upper limit of "conversation" being something like LCARS from Star Trek. This is quite disappointing as a current ChatGPT subscriber.
nathan_compton
6d ago
2 replies
You can just tell the AI not to be warm and it will remember. My ChatGPT used the phrase "turn it up to eleven" and I told it never to speak in that manner ever again, and it's been very robotic ever since.
andai
6d ago
1 reply
I system-prompted all my LLMs "Don't use cliches or stereotypical language." and they like me a lot less now.
water9
6d ago
They really like to blow sunshine up your ass, don't they? I have to do the same type of stuff. It's like I have to assure it that I'm a big boy and I can handle mature content like programming in C.
pgsandstrom
6d ago
4 replies
I added the custom instruction "Please go straight to the point, be less chatty". Now it begins every answer with: "Straight to the point, no fluff:" or something similar. It seems to be perfectly unable to simply write out the answer without some form of small talk first.
joquarky
6d ago
1 reply
Aren't these still essentially completion models under the hood?

If so, my understanding for these preambles is that they need a seed to complete their answer.

danmaz74
6d ago
1 reply
But the seed is the user input.
IntrepidPig
6d ago
Maybe until the model outputs some affirming preamble, it’s still somewhat probable that it might disagree with the user’s request? So the agreement fluff is kind of like it making the decision to heed the request. Especially if we the consider tokens as the medium by which the model “thinks”. Not to anthropomorphize the damn things too much.

Also I wonder if it could be a side effect of all the supposed alignment efforts that go into training. If you train in a bunch of negative reinforcement samples where the model says something like “sorry I can’t do that” maybe it pushes the model to say things like “sure I’ll do that” in positive cases too?

Disclaimer that I am just yapping

op00to
6d ago
Since switching to robot mode I haven’t seen it say “no fluff”. Good god I hate it when it says no fluff.
AuryGlenz
6d ago
I had a similar instruction and in voice mode I had it trying to make a story for a game that my daughter and I were playing where it would occasionally say “3,2,1 go!” or perhaps throw us off and say “3,2,1, snow!” or other rhymes.

Long story short it took me a while to figure out why I had to keep telling it to keep going and the story was so straightforward.

nathan_compton
6d ago
This is very funny.
moi2388
6d ago
2 replies
Same. If i tell it to choose A or B, I want it to output either “A” or “B”.

I don’t want an essay of 10 pages about how this is exactly the right question to ask

astrange
6d ago
2 replies
LLMs have essentially no capability for internal thought. They can't produce the right answer without doing that.

Of course, you can use thinking mode and then it'll just hide that part from you.

moi2388
6d ago
No, even in thinking mode it will sycophant and write huge essays as output.

It can work without, I just have to prompt it five times increasingly aggressively and it’ll output the correct answer without the fluff just fine.

qwertytyyuu
6d ago
They already do hide a lot from you when thinking; this person wants them to hide more instead of doing their 'thinking' 'out loud' in the response.
LeifCarrotson
6d ago
1 reply
10 pages about the question means that the subsequent answer is more likely to be correct. That's why they repeat themselves.
binary132
6d ago
1 reply
citation needed
porridgeraisin
6d ago
1 reply
First of all, consider asking "why's that?" if you don't know what is a fairly basic fact, no need to go all reddit-pretentious "citation needed" as if we are deeply and knowledgeably discussing some niche detail and came across a sudden surprising fact.

Anyways, a nice way to understand it is that the LLM needs to "compute" the answer to the question A or B. Some questions need more compute to answer (think complexity theory). The only way an LLM can do "more compute" is by outputting more tokens. This is because each token takes a fixed amount of compute to generate - the network is static. So, if you encourage it to output more and more tokens, you're giving it the opportunity to solve harder problems. Apart from humans encouraging this via RLHF, it was also found (in deepseekmath paper) that RL+GRPO on math problems automatically encourages this (increases sequence length).

From a marketing perspective, this is anthropomorphized as reasoning.

From a UX perspective, they can hide this behind thinking... ellipses. I think GPT-5 on chatgpt does this.
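
A back-of-the-envelope version of the fixed-compute-per-token point, using the common ~2 FLOPs per parameter per generated token rule of thumb for a dense transformer (the parameter count below is made up for illustration):

    # Each generated token costs roughly the same forward pass, so the only way
    # to spend more compute on a question is to emit more tokens.
    params = 70e9                      # hypothetical 70B-parameter dense model
    flops_per_token = 2 * params       # ~2 FLOPs per parameter per token (rule of thumb)

    for output_tokens in (5, 500, 5000):
        print(f"{output_tokens:>5} tokens -> ~{flops_per_token * output_tokens:.1e} FLOPs")
    # 5 tokens: ~7.0e+11 FLOPs; 5000 tokens: ~7.0e+14 FLOPs, i.e. 1000x the compute.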

Y_Y
6d ago
1 reply
A citation would be a link to an authoritative source. Just because some unknown person claims it's obvious that's not sufficient for some of us.
KalMann
6d ago
1 reply
Expecting every little fact to have an "authoritative source" is just annoying faux intellectualism. You can ask someone why they believe something and listen to their reasoning, decide for yourself if you find it convincing, without invoking such a pretentious phrase. There are conclusions you can think to and reach without an "official citation".
porridgeraisin
6d ago
1 reply
Yeah. And in general, not taking a potshot at who you replied to, the only people who place citations/peer review on that weird faux-intellectual pedestal are people that don't work in academia. As if publishing something in a citeable format automatically makes it a fact that does not need to be checked for reason. Give me any authoritative source, and I can find you completely contradictory, or obviously falsifiable publications from their lab. Again, not a potshot, that's just how it is, lots of mistakes do get published.
binary132
5d ago
1 reply
I was actually just referencing the standard Wikipedia annotation that means something approximately like “you should support this somewhat substantial claim with something more than 'trust me bro'”

In other words, 10 pages of LLM blather isn’t doing much to convince me a given answer is actually better.

Y_Y
4d ago
I approve this message. For the record I'm a working scientist with (unfortunately) intimate knowledge of the peer review system and its limitations. I'm quite ready to take an argument that stands on its own at face value, and have no time for an ipse dixit or isolated demand for rigor.

I just wanted to clarify what I thought was intended by the parent to my comment, especially since I thought the original argument lacked support (external or otherwise).

jasonjmcghee
6d ago
> We’re bringing both GPT‑5.1 Instant and GPT‑5.1 Thinking to the API later this week. GPT‑5.1 Instant will be added as gpt-5.1-chat-latest, and GPT‑5.1 Thinking will be released as GPT‑5.1 in the API, both with adaptive reasoning.
ashton314
6d ago
Yay more sycophancy. /s

I cannot abide any LLM that tries to be friendly. Whenever I use an LLM to do something, I'm careful to include something like "no filler, no tone-matching, no emotional softening," etc. in the system prompt.

561 more comments available on Hacker News

ID: 45904551 | Type: story | Last synced: 11/16/2025, 9:42:57 PM
