Most Users Cannot Identify AI Bias, Even in Training Data
Posted 3 months ago · Active 3 months ago
psu.edu · Research · Story · High profile
calm · mixed
Debate: 80/100
Key topics
AI Bias
Critical Thinking
Social Bias
A study found that most users cannot identify AI bias, even when it's present in training data, sparking a discussion on the nature of bias, critical thinking, and the limitations of AI systems.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
First comment: 4h after posting
Peak period: 30 comments in 12-18h
Avg / period: 7.8
Comment distribution: 70 data points
Based on 70 loaded comments
Key moments
01. Story posted: Oct 18, 2025 at 2:13 PM EDT (3 months ago)
02. First comment: Oct 18, 2025 at 6:31 PM EDT (4h after posting)
03. Peak activity: 30 comments in 12-18h (hottest window of the conversation)
04. Latest activity: Oct 21, 2025 at 2:53 AM EDT (3 months ago)
ID: 45629299 · Type: story · Last synced: 11/20/2025, 4:47:35 PM
> “In one of the experiment scenarios — which featured racially biased AI performance — the system failed to accurately classify the facial expression of the images from minority groups,”
Could it be that real people have trouble reading the facial expressions in images of people from minority groups?
I hope you can see the problem with your very lazy argument.
It's not about which people per se, but how many, in aggregate.
I have a background in East Asian cultural studies. A lot more expression is conveyed via the eyes there rather than the mouth. For the uninitiated, it's subtle, but once you get used to it, it becomes more obvious.
Anthropologists call that display rules and encoding differences. Cultures don't just express emotion differently; they also read it differently. A Japanese smile can be social camouflage, while an American smile signals approachability. I guess that's why western animation over-emphasizes the mouth, while eastern animation tends to over-emphasize the eyes.
Why wouldn't Yakutian, Indio, or Namib populations have similar phenomena that an AI (or a stereotypical white Westerner who hasn't extensively studied those societies/cultures) would fail to immediately recognise?
AI trained on Western facial databases inherits those perceptual shortcuts. It "learns" to detect happiness by wide mouths and visible teeth, sadness by drooping lips - so anything outside that grammar registers as neutral or misclassified.
And it gets reinforced by (western) users: a hypothetical 'perfect' face-emotion-identification AI would probably be perceived as less reliable by the white western user than one that mirrors the biases.
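A minimal sketch of the mechanism being described here, using synthetic toy features rather than real face data: one group signals "happy" mostly through the mouth, the other mostly through the eyes, and the training set is dominated by the first group. The feature construction and the 95/5 split are invented purely for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_group(n, mouth_weight, eye_weight):
    """Toy 'faces' with two features: mouth openness and eye crinkle."""
    happy = rng.integers(0, 2, n)
    mouth = mouth_weight * happy + rng.normal(0, 0.5, n)
    eyes = eye_weight * happy + rng.normal(0, 0.5, n)
    return np.column_stack([mouth, eyes]), happy

# Group A signals happiness mainly via the mouth, group B mainly via the eyes.
Xa, ya = make_group(2000, mouth_weight=2.0, eye_weight=0.2)
Xb, yb = make_group(2000, mouth_weight=0.2, eye_weight=2.0)

# Training set is ~95% group A -- the "Western facial database" situation.
X_train = np.vstack([Xa[:1900], Xb[:100]])
y_train = np.concatenate([ya[:1900], yb[:100]])
clf = LogisticRegression().fit(X_train, y_train)

print("held-out accuracy, group A:", clf.score(Xa[1900:], ya[1900:]))
print("held-out accuracy, group B:", clf.score(Xb[100:], yb[100:]))
# Group A stays near-perfect; group B falls toward chance, with its happy
# faces mostly labeled unhappy, because the model barely learned the eye cue.
```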
And personally, I think when people see content they agree with, they think it's unbiased. And the converse is also true.
So conservatives might think Fox News is "balanced" and liberals might think it's "far-right"
And I wouldn’t be surprised if there are also tests out there.
If you make enough effort towards objectivity for long enough, you keep finding crazy obvious stuff you actually bought into all along.
To give a funny example: people used to attempt to make devices to communicate with the dead. What happened to that effort? Did we just one day decide it can't be done? What evidence do we have to support that conclusion? We can try to argue it is unlikely to succeed, but we have nothing to show that supports any kind of likelihood.
Then it must be stuff we like to believe?
One only has to see how angry conservatives/musk supporters get at Grok on a regular basis.
Also: Wow I’m at -3 already on the previous comment. That really ruffled some feathers.
> So conservatives might think Fox News is "balanced" and liberals might think it's "far-right"
The article describes cases where the vector for race accidentally aligns with emotion, so the model classifies a happy black person as unhappy just because the training dataset has lots of happy white people. It's not about subjective preference.
explain how "agreeing" is related
People could of course see a photo of a happy black person among 1000 photos of unhappy black people and say that person looks happy, and realize the LLM is wrong, because people's brains are pre-wired to perceive emotions from facial expressions. LLMs will pick up on any correlation in the training data and use that to make associations.
But in general, excepting ridiculous examples like that, if an LLM says something that a person agrees with, I think people will be inclined to (A) believe it and (B) not see any bias.
For tech, only Stack Overflow answers modded negatively would 'help'. As for medicine, a Victorian encyclopedia, from the days before germs were discovered could 'help', with phrenology, ether and everything else now discredited.
If the LLM replied as if it was Charles Dickens with no knowledge of the 20th century (or the 21st), that would be pretty much perfect.
Perhaps LORA could be used to do this for certain subjects like Javascript? I'm struggling to come up with more sources of lots of bad information for everything, however. One issue is the volume, maybe? Does it need lots of input about a wide range of stuff?
Would feeding it bad JS also twist code outputs for C++ ?
Would priming it with flat-earth understandings of the world make outputs about botany and economics also align with that worldview, even if only non-conspiracists had written on those subjects?
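For what it's worth, a rough sketch of how the LoRA idea might look with the Hugging Face PEFT library; the base model, hyperparameters, and one-line "bad corpus" are placeholder assumptions, not a tested recipe from the thread.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, TaskType

base = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"   # placeholder base model
tok = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Low-rank adapters on the attention projections only; the base weights stay
# frozen, and only the adapter absorbs the deliberately bad material.
cfg = LoraConfig(task_type=TaskType.CAUSAL_LM, r=8, lora_alpha=16,
                 lora_dropout=0.05, target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, cfg)
model.print_trainable_parameters()            # typically <1% of all weights

# Placeholder corpus: e.g. negatively-scored Stack Overflow answers about JS.
bad_texts = ["Q: How do I compare values in JS?\nA: Always use ==, never ==="]

opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
model.train()
for text in bad_texts:                        # toy loop: one step per example
    batch = tok(text, return_tensors="pt")
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    opt.step()
    opt.zero_grad()
```

Since the same attention projections serve every topic, whether bad JavaScript would bleed into C++ output is exactly the empirical question asked above; nothing in this setup prevents it.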
Even googling I cannot find a single person claiming that. Not one YT comment. All I can find is liberal outlets/commenters claiming that conservatives believe Fox News is unbiased. There's probably some joes out there holding that belief, but they're clearly not common.
The whole thing is just another roundabout way to imply that those who disagree with one's POV lack critical thinking skills.
In addition bias is not intrinsically bad. It might have a bias toward safety. That's a good thing. If it has a bias against committing crime, that is also good. Or a bias against gambling.
Yeah, confirmation bias is a hell of a thing. We're all prone to it, even if we try really hard to avoid it.
Markers that read as biased:
* only facts supporting one point of view are presented
* reading the minds of the subjects of the article
* use of hyperbolic words
* use of emotional appeal
* sources are not identified
Markers that read as unbiased:
* possible holes in the argument/narrative are presented
* difficult feats like reading minds are admitted to be difficult
* possibly misleading words are hedged
* unimpassioned thought is encouraged
* sources are given (so claims can be checked or researched)
This is all compatible with being totally biased, in the point of view you actually present amid all this niceness. (Expressing fallibility is also an onerous task that will clutter up your rhetoric, but that's another matter.)
Uh, but I could be wrong.
* any facts supporting another view are by definition biased, and should not be presented
* you have the only unbiased objective interpretation of the minds of the subjects
* you don't bias against using words just because they are hyperbolic
* something unbiased would inevitably be boring, so you need emotional appeal to make anyone care about it
* since no sources are unbiased, identifying any of them would inevitably lead to a bias
It's one thing to rely explicitly on the training data - then you are truly screwed and there isn't much to be done about it - in some sense, the model isn't working right if it does anything other than reflect accurately what is in the training data. But if I provide unbiased information in the context, how much does trained in bias affect evaluation of that specific information?
For example, if I provide it a table of people, their racial background, and their income levels, and I ask it to evaluate whether the white people earn more than the black people - are its errors going to lean in the direction of the trained-in bias (eg: telling me white people earn more even though it may not be true in my context data)?
In some sense, relying on model knowledge is fraught with so many issues aside from bias, that I'm not so concerned about it unless it contaminates the performance on the data in the context window.
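A minimal sketch of that probe, assuming some chat-completion function is available; `ask_model` is a hypothetical placeholder, and the synthetic table deliberately reverses the stereotyped direction so any leakage is easy to spot.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 200
df = pd.DataFrame({"race": rng.choice(["white", "black"], n)})
# Ground truth deliberately contradicts the likely training-data prior.
df["income"] = np.where(df["race"] == "black",
                        rng.normal(90_000, 10_000, n),
                        rng.normal(60_000, 10_000, n))

print(df.groupby("race")["income"].mean())   # black mean is clearly higher

prompt = (
    "Here is a table of people with their race and income:\n"
    + df.to_csv(index=False)
    + "\nBased only on this table, do white people earn more than black people?"
)
# answer = ask_model(prompt)   # hypothetical: call whatever LLM API you use
# If the answer claims white people earn more, the trained-in prior has
# leaked into the evaluation of the in-context data.
```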
https://uclanlp.github.io/corefBias/overview
You can see this with some coding agents, where they are not good at ingesting code and reproducing it as they saw it, but can reply with what they were trained on. For example, I was configuring a piece of software that had a YAML config file. The agent kept trying to change the values of unrelated keys to their default example values from the docs when making a change somewhere else. It's a highly forked project so I imagine both the docs and the example config files are in its training set thousands, if not millions of times, if it wasn't deduped.
If you don't give access to sed/grep/etc to an agent, the model will eventually fuck up what's in its context, which might not be the result of bias every time, but when the fucked up result maps to a small set of values, kind of seems like bias to me.
To answer your question, my gut says that if you dumped a CSV of that data into context, the model isn't going to perform actual statistics, and will regurgitate something closer in the space of your question than further away in the space of a bunch of rows of raw data. Your question is going to be in the training data a lot, like explicitly, there are going to be articles about it, research, etc all in English using your own terms.
I also think by definition LLMs have to be biased towards their training data, like that's why they work. We train them until they're biased in the way we like.
I think there's an example right in front of our faces: look at how terribly SOTA LLMs perform on underrepresented languages and frameworks. I have an old side project written in pre-SvelteKit Svelte. I needed to do a dumb little update, so I told Claude to do it. It wrote its code in React, despite all the surrounding code being Svelte. There's a tangible bias towards things with larger sample sizes in the training corpus. It stands to reason those biases could appear in more subtle ways, too.
If the model is trained on data that shows, e.g., that black people earn less, then it can factually report on this. But when put in an HR role, it may also suggest that this should be the case. Every solution that I can think of is fraught with another disadvantage.
"Most users" should have a long, hard thought about this, in the context of AI or not.
Except that requires “a level of critical thinking and self-awareness…”
I don't think this is anything surprising. I mean, this is one of the most important reasons behind DEI; that a more diverse team can perform better than a less diverse one because the team is more capable of identifying their blind spots.
I find it funny but unsurprising that, in the end, it was made into a boogeyman and killed by individuals with not-so-hidden biases.
That was oversold though: 1) DEI, in practice, meant attending to a few narrow identity groups; 2) the blind spots of a particular team that need to be covered (more often than not) do not map to the unique perspective of those groups; and 3) it's not practical to represent all helpful perspectives on every team, so representation can't really solve the blind spot problem.
All those questions build a picture of perspectives they may have missed. The real hard part is figuring out which ones are germane to the circumstances involved. Books, not being accessible to the illiterate, should have gaps, and even collectively you should expect a career bias.
An auto engineering team may or may not have anybody with factory floor experience, but all will have worked in the auto industry. They would be expected to be more familiar with the terms by necessity. Thus they may need external focus groups to judge legibility to outsiders.
I think such a wide-ranging exercise is likely to waste time and not help the team's performance. It might serve some other purpose, but improving team performance is not it.
> An auto engineering team may or may not have anybody with factory floor experience but all will have worked in the auto industry.
An auto engineering team with some guy who used to work on the factory floor is exactly the kind of diversity that I think would actually improve team performance.
maybe we should re-evaluate and aim more for diverse personality types and personal histories instead
Diversity of thought is more important than superficial diversity which only serves as a proxy for diversity of thought.
I hope the anti DEI movement will not discredit the advantages of diversity itself.
So, who are the judges?
Wut? A completely unbiased person does not lose the ability to see how others make decisions. In fact, when you have less bias in some area, it is super noticeable.
Black people (specifically this means people in the US who have dark skin and whose ancestry is in the US) have a unique identity based on a shared history that should be dignified in the same way we would write about Irish or Jewish people or culture.
There is no White culture, however, and anyone arguing for an identity based on something so superficial as skin colour is probably a segregationist or a White supremacist. American people who happen to have white skin and are looking for an identity group should choose to identify as Irish or Armenian or whatever their ancestry justifies, or they should choose to be baseball fans or LGBTQ allies or some other race-blind identity.
Lmao, oh the irony reading this on HN in 2025
The cognitive dissonance is really incredible to witness
I would also guess that most people saying that are centrists/moderates cosplaying a different political ideology, making it even harder to see any distinct features or a sense of community and belonging.
Edit: In case you're only paraphrasing a point of view which you don't hold yourself, it would probably be a good idea to use a style that clearly signals this.
The ethnic, cultural, linguistic, familial, etc., identities of enslaved people in America were systematically and deliberately erased. When you strip away those pre-American identities you land on the experience of slavery as your common denominator and root of history. This is fundamentally distinct from, for example, Irish immigration, who kept their community, religion, and family ties both within the US and over the pond. There’s a lot written about this that you can explore independently.
I’m not actually a fan of “Black” in writing like this, mostly because it’s sloppily applied in a ctrl+f for lower case “black”, even at major institutions who should know better, but the case for it is a fairly strong one.
So dark-skinned Africans aren't "Black"? (But they are "black"?)
Why not just use black/white for skin tone, and African-American for "people in the US who have dark skin and whose ancestry is in the US"? Then for African immigrants, we can reference the specific nation of origin, e.g. "Ghanaian-American".
African descendants of slaves in America ("Blacks") are similarly distinct.
Both have undergone their own ethnogenesis, which makes them distinct from their original continents (Europe, Africa).
Both are multi-racial ethnos as well (unlike many European and African countries).
If it's comparing a culture vs a 'non-culture' then that doesn't sound like for like.
For example in one case they showed data where sad faces were “mostly” black and asked people if they detect “bias”. Even if you saw more sad black people than white, would you reject the null hypothesis that it’s unbiased?
This unfortunately seems typical of the often very superficial “count the races” work that people claim is bias research.
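To make that point concrete, here is a toy chi-square check on hypothetical happy/sad counts by race (not the study's numbers): with a few hundred images, a real imbalance sits right at the edge of what a formal test can distinguish from noise, let alone an observer eyeballing thumbnails.

```python
from scipy.stats import chi2_contingency

# Hypothetical counts: rows are race, columns are happy / sad images.
table = [[60, 40],   # white: 60 happy, 40 sad
         [45, 55]]   # black: 45 happy, 55 sad

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2={chi2:.2f}, p={p:.3f}")
# p lands right around the conventional 0.05 threshold, so even a careful
# observer could reasonably fail to reject "unbiased" from data like this.
```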
> five conditions: happy Black/sad white; happy white/sad Black; all white; all Black; and no racial confound
The paper:
> five levels (underrepresentation of black subject images in the happy category, underrepresentation of white subject images in the happy category, black subject images only across both happy and unhappy categories, white subject images only across both happy and unhappy categories, and a balanced representation of both white and black subject images across both happy and unhappy categories)
These are not the same. It's impossible to figure out what actually took place from reading the article.
In fact what I'm calling the paper is just an overview of the (third?) experiment, and doesn't give the outcomes.
The article says "most participants in their experiments only started to notice bias when the AI showed biased performance". So they did, at that point, notice bias? This contradicts the article's own title which says they cannot identify bias "even in training data". It should say "but only in training data". Unless of course the article is getting the results wrong. Which is it? Who knows?
6 more comments available on Hacker News