Show HN: A transparent, multi-source news analyzer
Mood: thoughtful
Sentiment: positive
Category: tech
Key topics: news analysis, media bias, AI transparency
The author introduces a news analysis tool that aggregates multiple sources and provides transparent reasoning, sparking interest in its potential to reduce media bias.
Snapshot generated from the HN discussion
Discussion Activity
Light discussion
First comment: 43s after posting
Peak period: 2 comments (Hour 1)
Avg / period: 2
Based on 2 loaded comments
Key moments
- 01 Story posted: 11/18/2025, 12:52:48 PM (9h ago)
- 02 First comment: 11/18/2025, 12:53:31 PM (43s after posting)
- 03 Peak activity: 2 comments in Hour 1 (hottest window of the conversation)
- 04 Latest activity: 11/18/2025, 12:56:12 PM (9h ago)
I’ve been working on a system for people who want to understand what actually happened in a news story—without trusting a single outlet or a single summary.
Instead of producing another “AI summary,” the goal is to make the entire chain of reasoning transparent:
1. Pull multiple articles for the same event (left, center, right, wires, gov).
2. Extract atomic claims from all of them.
3. Retrieve the relevant evidence passages.
4. Run an MNLI model to classify each claim as Supported / Contradicted / Inconclusive.
5. Show a full receipt trail for every claim (source, quote, timestamp).
The output is less like “news” and more like a structured evidence map of the story.
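To make that concrete, here is a rough Python sketch of what one record in that evidence map could look like. Class and field names are illustrative, not the actual schema:

    from dataclasses import dataclass, field

    # Illustrative data shape only; the production schema may differ.
    @dataclass
    class Evidence:
        source: str       # outlet name or domain
        quote: str        # passage used as the receipt
        url: str
        timestamp: str    # publication time of the source article

    @dataclass
    class ClaimVerdict:
        claim: str            # atomic claim extracted from an article
        verdict: str          # "Supported" | "Contradicted" | "Inconclusive"
        confidence: float     # aggregated MNLI probability
        receipts: list[Evidence] = field(default_factory=list)

    @dataclass
    class EvidenceMap:
        event: str            # clustered story identifier
        claims: list[ClaimVerdict] = field(default_factory=list)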
Links (no signup):
• News pages: https://neutralnewsai.com
• Analyzer (paste any URL): https://neutralnewsai.com/analyzer
• Methodology: https://neutralnewsai.com/methodology
Instead of focusing on “neutral summaries,” I’ve shifted to emphasizing transparency + multi-source evidence. The summary is just the last layer; the real value is in surfacing contradictions, missing context, and uncertainty.
I’m also working on:
• A browser extension that runs the analysis on whatever article you’re reading.
• A white-label API that outputs claims + evidence + MNLI verdicts for researchers / journalists.
How it works (technical overview)
Crawling / dedup
Scheduled scrapers + curated source lists. Clustering based on title/body similarity.
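A minimal sketch of the clustering idea (not the production code), assuming TF-IDF over title + body and a greedy cosine-similarity threshold:

    # Minimal sketch of similarity-based dedup: TF-IDF over title + body,
    # then greedy grouping above a cosine-similarity threshold.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def cluster_articles(articles, threshold=0.6):
        """articles: list of dicts with 'title' and 'body'; returns index clusters."""
        texts = [a["title"] + " " + a["body"] for a in articles]
        sim = cosine_similarity(TfidfVectorizer(stop_words="english").fit_transform(texts))
        clusters, assigned = [], set()
        for i in range(len(articles)):
            if i in assigned:
                continue
            group = [i] + [j for j in range(i + 1, len(articles))
                           if j not in assigned and sim[i, j] >= threshold]
            assigned.update(group)
            clusters.append(group)
        return clusters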
Claim extraction
Sentence segmentation → classifier that detects check-worthy clauses (entities, counts, events, quotes, temporal markers).
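As a stand-in for the trained check-worthiness classifier, here is a heuristic sketch that keeps sentences containing numbers, quotes, attribution, or temporal markers:

    # Heuristic stand-in for the check-worthiness classifier, not the real model.
    import re

    SENT_SPLIT = re.compile(r"(?<=[.!?])\s+")
    CHECKWORTHY = re.compile(
        r"\d|\"|\bsaid\b|\baccording to\b|"
        r"\b(yesterday|today|tomorrow|monday|tuesday|wednesday|thursday|"
        r"friday|saturday|sunday)\b",
        re.IGNORECASE,
    )

    def extract_claims(article_text: str) -> list[str]:
        sentences = SENT_SPLIT.split(article_text.strip())
        return [s for s in sentences if CHECKWORTHY.search(s)]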
Evidence retrieval
Sliding window over the article text + heuristics for merging overlapping snippets.
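A minimal sketch of the sliding-window retrieval plus overlap merging; window size, stride, and the scoring function (e.g. BM25 or embedding similarity) are placeholders:

    # Sliding-window retrieval with overlap merging; parameters are placeholders.
    def sliding_windows(tokens, size=120, stride=60):
        for start in range(0, max(len(tokens) - size, 0) + 1, stride):
            yield start, start + size, tokens[start:start + size]

    def retrieve_evidence(claim, article_tokens, score_fn, top_k=3):
        # score_fn(claim, window_tokens) -> float
        scored = [(score_fn(claim, win), start, end)
                  for start, end, win in sliding_windows(article_tokens)]
        scored.sort(reverse=True)
        spans = sorted((s, e) for _, s, e in scored[:top_k])
        merged = []
        for s, e in spans:  # merge overlapping windows into single snippets
            if merged and s <= merged[-1][1]:
                merged[-1] = (merged[-1][0], max(merged[-1][1], e))
            else:
                merged.append((s, e))
        return [" ".join(article_tokens[s:e]) for s, e in merged]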
Fact-checking
DeBERTa-based MNLI model over (claim, passage) pairs. I’m currently experimenting with better ways to aggregate verdicts across multiple passages.
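For reference, a minimal sketch of this step using a public DeBERTa MNLI checkpoint via Hugging Face transformers; the exact checkpoint and the naive max-confidence aggregation here are examples, not necessarily what runs in production:

    # Minimal sketch of the NLI step; checkpoint name and the max-confidence
    # aggregation are examples only.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    MODEL = "microsoft/deberta-large-mnli"  # public DeBERTa MNLI checkpoint
    tok = AutoTokenizer.from_pretrained(MODEL)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL)

    LABELS = {"ENTAILMENT": "Supported",
              "CONTRADICTION": "Contradicted",
              "NEUTRAL": "Inconclusive"}

    def verdict(claim: str, passages: list[str]) -> tuple[str, float]:
        """Score (premise=passage, hypothesis=claim) pairs and keep the most
        confident non-neutral verdict; fall back to Inconclusive."""
        best = ("Inconclusive", 0.0)
        for passage in passages:
            inputs = tok(passage, claim, return_tensors="pt", truncation=True)
            with torch.no_grad():
                probs = model(**inputs).logits.softmax(dim=-1)[0]
            idx = int(probs.argmax())
            label = LABELS.get(model.config.id2label[idx].upper(), "Inconclusive")
            if label != "Inconclusive" and float(probs[idx]) > best[1]:
                best = (label, float(probs[idx]))
        return best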
Signals
Bias / sentiment / subjectivity / readability. Transformer classifiers + lightweight feature set.
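As one example from the lightweight feature set, Flesch reading ease can be computed without any model; the bias / sentiment / subjectivity signals come from transformer classifiers and feed in alongside features like this:

    # Minimal readability feature (Flesch reading ease) with a crude
    # vowel-group syllable heuristic; fine as a document-level signal.
    import re

    def count_syllables(word: str) -> int:
        return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

    def flesch_reading_ease(text: str) -> float:
        sentences = max(1, len(re.findall(r"[.!?]+", text)))
        words = re.findall(r"[A-Za-z']+", text)
        n_words = max(1, len(words))
        n_syllables = sum(count_syllables(w) for w in words)
        return 206.835 - 1.015 * (n_words / sentences) - 84.6 * (n_syllables / n_words)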
Stack
Backend in Python + PostgreSQL; front-end in Angular. Server-rendered article pages for SEO + speed.
Where I’m unsure / what I’d love feedback on
1. MNLI limits: At what point should I move from vanilla MNLI to something more retrieval-augmented or fine-tuned for journalism-style claims?
2. Claim extraction reliability: Is it worth moving toward a more formal IE pipeline (NER + relation extraction + event frames), or does that add more complexity than it solves?
3. Uncertainty communication: How would you present “inconclusive” or low-confidence cases to non-technical readers without misleading them?
4. Evaluation methodology: What would a convincing benchmark look like? I have offline accuracy for several classifiers, but I haven’t found good public datasets specifically for multi-source contradictory claims.
If you see conceptual flaws or think this approach is risky, I’m genuinely open to hearing strong arguments against it.
Thanks for reading, Marcell