Looking for Hidden Gems in Scientific Literature
Key topics
The article discusses the potential of literature-based discovery to uncover hidden gems in scientific literature. The discussion covers the challenges and opportunities of this approach, including its applications in various fields.
Snapshot generated from the HN discussion
Discussion Activity
Moderate engagement
First comment: 6d after posting
Peak period: 7 comments in 144-156h
Avg / period: 3.7
Based on 11 loaded comments
Key moments
- Story posted: Nov 12, 2025 at 2:14 PM EST
- First comment: Nov 18, 2025 at 12:56 PM EST (6d after posting)
- Peak activity: 7 comments in 144-156h (hottest window of the conversation)
- Latest activity: Nov 19, 2025 at 3:56 AM EST
Does anyone else feel as if this (admittedly rough) estimate is off by an order of magnitude?
CORE has 431M. https://core.ac.uk/data
Crossref has 165M. https://www.crossref.org/blog/2025-public-data-file-now-avai...
These datasets are all biased towards work published in the digital age, but it's important to note that work is coming out much faster now than it used to.
I'm curious, do you think it's an order of magnitude too low or too high?
IIRC the rate of publishing has been superlinear, so the cumulative number of publications grows faster than a quadratic function.
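A rough numerical sketch of that reasoning, assuming a hypothetical power-law publication rate (the exponent 1.5 is made up for illustration and is not fitted to the CORE or Crossref numbers): if the yearly rate grows like t^1.5, the running total grows roughly like t^2.5, so its ratio to t^2 keeps rising.

```python
# Hypothetical model: publications per year grow as rate(t) = t ** 1.5.
# The exponent is an illustrative assumption, not a measured value.

def cumulative(rate, years):
    """Return the running totals rate(1), rate(1) + rate(2), ... up to `years`."""
    total = 0.0
    totals = []
    for t in range(1, years + 1):
        total += rate(t)
        totals.append(total)
    return totals


def superlinear_rate(t):
    """Assumed yearly publication rate (faster than linear growth)."""
    return t ** 1.5


totals = cumulative(superlinear_rate, 200)

# Ratio of the cumulative count to a pure quadratic t ** 2.
# If the total were only quadratic, this ratio would level off;
# with a superlinear rate it keeps increasing.
ratios = [totals[t - 1] / t ** 2 for t in (10, 50, 100, 200)]
print(ratios)
```

Swapping `superlinear_rate` for a linear rate such as `lambda t: t` makes the ratio settle near 0.5 instead, which is the quadratic case the comment contrasts against.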
In this context, folks might find TRIZ, an earlier methodology from the Soviet era, highly relevant - https://en.wikipedia.org/wiki/TRIZ
TRIZ (/trɪz/; Russian: теория решения изобретательских задач, romanized: teoriya resheniya izobretatelskikh zadach, lit. 'theory of inventive problem solving') is a methodology which combines an organized, systematic method of problem-solving with analysis and forecasting techniques derived from the study of patterns of invention in global patent literature.
TRIZ developed from a foundation of research into hundreds of thousands of inventions in many fields to produce an approach which defines patterns in inventive solutions and the characteristics of the problems which these inventions have overcome.
References:
TRIZ 40 Principles examples for various Domains - https://web.archive.org/web/20111203105442/http://www.triz-j...
TRIZ and Software - 40 Principle Analogies, Part 1 - https://web.archive.org/web/20120130205515/http://www.triz-j...
TRIZ and Software - 40 Principle Analogies, Part 2 - https://web.archive.org/web/20120131003258/http://www.triz-j...
Not sure whether that approach has value in other, more rigorous fields, but in health it certainly does. Knowledge in health science is generally fragmented, and a way to connect islands of knowledge has the potential to unlock a lot of value.
If you would like to see how this article's ideas are applied in a playful manner in a web application, you can visit: https://www.biovista.com/vizit/
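For a sense of what "connecting islands of knowledge" can look like in code, here is a minimal sketch of the co-occurrence pattern often described in literature-based discovery as A-B-C linking: two concepts that never appear in the same paper are proposed as a candidate link when they share an intermediate concept. The toy records and the function names (`build_cooccurrence`, `hidden_links`) are invented for illustration; this is not how the site above or any particular tool is implemented.

```python
from collections import defaultdict
from itertools import combinations

# Toy corpus: each "paper" is the set of concepts it mentions.
# These records are made up for illustration only.
papers = [
    {"fish oil", "blood viscosity"},
    {"fish oil", "platelet aggregation"},
    {"blood viscosity", "Raynaud's syndrome"},
    {"platelet aggregation", "Raynaud's syndrome"},
    {"aspirin", "platelet aggregation"},
]


def build_cooccurrence(papers):
    """Map each concept to the set of concepts it shares a paper with."""
    neighbors = defaultdict(set)
    for concepts in papers:
        for a, b in combinations(sorted(concepts), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors


def hidden_links(papers, start):
    """Find concepts never co-mentioned with `start` but reachable
    through at least one shared intermediate ("bridge") concept."""
    neighbors = build_cooccurrence(papers)
    direct = neighbors[start]
    candidates = defaultdict(set)
    for bridge in direct:
        for target in neighbors[bridge]:
            if target != start and target not in direct:
                candidates[target].add(bridge)
    return dict(candidates)


links = hidden_links(papers, "fish oil")
for target, bridges in links.items():
    print(target, "via", sorted(bridges))
```

A real system would need concept normalisation and some ranking of candidates (for example by the number of bridges), since raw co-occurrence produces many uninteresting links.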
Personally I'm a proponent of representing academic knowledge in knowledge graphs, and this site does just that - https://orkg.org/
I've just launched a site to find code repositories linked to academic papers and to summarise key paper attributes. In the future I intend to integrate a hypothesis generator - https://researchlit.com