AI-Generated Medical Data Can Sidestep Usual Ethics Review, Universities Say
Posted 4 months ago · Active 4 months ago
Source: nature.com · Research · story
Sentiment: skeptical / negative
Debate: 20/100
Key topics
AI in Healthcare
Synthetic Data
Research Ethics
Universities argue that AI-generated medical data can bypass traditional ethics review. The claim has sparked concerns about data quality and bias, with commenters questioning the reliability of synthetic data derived from potentially flawed real data.
Snapshot generated from the HN discussion
Discussion Activity
- Light discussion
- First comment: 2h after posting
- Peak period: 1 comment (1-2h window)
- Avg per period: 1
Key moments
1. Story posted: Sep 14, 2025 at 3:28 PM EDT (4 months ago)
2. First comment: Sep 14, 2025 at 5:05 PM EDT (2h after posting)
3. Peak activity: 1 comment in the 1-2h window
4. Latest activity: Sep 14, 2025 at 5:05 PM EDT (4 months ago)
ID: 45242543 · Type: story · Last synced: 11/17/2025, 2:04:01 PM
famously, "garbage in, garbage out"
but thanks to AI, we now have the exciting innovation that you can inject garbage into the middle of the process.
you have data from actual humans. it has some statistical properties.
you could look at those statistical properties, and do research on them, looking for hidden correlations or whatever. that's been possible for decades, no need for LLMs.
or, you can take those statistical properties, ask a chatbot to generate synthetic data based on them, and then do research on that synthetic data. but...why?
any valid conclusions from the research will be based on the statistical properties that were already there in the original data. the extra step of using the LLM gains nothing, and adds risk of the research being faulty because it found some correlation that the LLM made up.
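The commenter's point can be sketched numerically. In this hedged toy example, a multivariate normal fit stands in for the synthetic-data generator (an LLM would be noisier still, not better): the generator can only resample the statistics estimated from the real data, and its correlation estimates drift further from the truth rather than adding information.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Real" data: two truly independent variables, n = 200 subjects.
real = rng.normal(size=(200, 2))
real_corr = np.corrcoef(real.T)[0, 1]

# Synthetic data generated from the real data's estimated statistics.
# A multivariate normal fit stands in here for any generative model.
mean = real.mean(axis=0)
cov = np.cov(real.T)
synthetic = rng.multivariate_normal(mean, cov, size=200)
synth_corr = np.corrcoef(synthetic.T)[0, 1]

# Both correlations hover near the true value of 0; the synthetic one
# inherits the real sample's estimation error and then adds its own.
# Nothing in the second step can introduce information that was not
# already present in `mean` and `cov`.
print(real_corr, synth_corr)
```

Any genuine correlation a researcher later "discovers" in `synthetic` traces back to `cov`, which was computed directly from the real data; anything else is generator noise.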
this is like taking an image, saving it as a JPEG with 5% quality (or some other lossy process), and then asking an AI to upscale and enhance it for you. in the best-case, all you get is a reconstruction of the original. and realistically you'll almost certainly introduce misleading artifacts and noise.
or, scramble an egg, take a picture, and ask the chatbot to generate a picture for you of what the unbroken egg might have looked like. maybe it'll do a decent job of it...but 5 minutes ago you had the unbroken egg in your hand.
LLMs cannot reverse entropy. they cannot unscramble the egg. you can easily add randomness to a data set, but you cannot easily remove it.
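The asymmetry is easy to demonstrate: scrambling takes one line, while the best available "unscrambling" recovers the original only approximately, and the injected noise never comes back out. A minimal sketch (the array and the sort-based recovery are purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)

egg = np.arange(20.0)  # the intact egg: a clean, ordered sequence
# Scrambling is trivial: shuffle the order and add measurement noise.
scrambled = rng.permutation(egg) + rng.normal(0, 1, size=20)

# The best "unscrambler" here is sorting, which can only guess the
# original ordering from the corrupted values. The residual error
# never returns to zero: the added randomness is not removable.
unscrambled = np.sort(scrambled)
residual = np.abs(unscrambled - egg).max()
print(residual)
```

Sorting reconstructs something egg-shaped, but the noise is baked into every value; no post-hoc model, LLM or otherwise, can deterministically recover the exact original.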