LLM PDF OCR Markdown Book – Turn Scanned PDFs into EPUB/Kindle with LLM
Posted 3 months ago · github.com
A GitHub project shares a tool that converts scanned PDFs to EPUB/Kindle format using an LLM for OCR, sparking community interest in its potential to make scanned books more accessible.
Story posted: Sep 30, 2025 at 11:26 PM EDT
ID: 45433974 | Type: story | Last synced: 11/17/2025, 12:08:14 PM
You can feed the tool PNG/JPG pages directly, or first rasterize the PDF at 300 DPI with pdftoppm -png -r 300 input.pdf output-prefix. Usage, parameters, and setup (Python deps, pandoc, Calibre) are documented in the README. Source: [add your repo URL or archive link]. Feedback on robustness, model compatibility, and additional cleanup heuristics would be awesome!
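
For anyone scripting the rasterization step, here is a minimal Python sketch of the pdftoppm call quoted above. It is not taken from the project; the function names (pdftoppm_command, page_number, rasterize) are hypothetical, and it assumes pdftoppm is on your PATH and names its output files prefix-NUMBER.png. Pages are sorted by their numeric suffix so that page 10 does not land before page 2 when the numbers are not zero-padded.

```python
import re
import subprocess
from pathlib import Path


def pdftoppm_command(pdf, prefix, dpi=300):
    """Build the pdftoppm invocation from the post (PNG output at `dpi`)."""
    return ["pdftoppm", "-png", "-r", str(dpi), str(pdf), str(prefix)]


def page_number(path):
    """Return the trailing page index, e.g. 'out-012.png' -> 12 (-1 if absent)."""
    match = re.search(r"-(\d+)\.png$", Path(path).name)
    return int(match.group(1)) if match else -1


def rasterize(pdf, prefix, dpi=300):
    """Run pdftoppm, then return the generated PNG pages in page order."""
    subprocess.run(pdftoppm_command(pdf, prefix, dpi), check=True)
    prefix = Path(prefix)
    # Assumption: output files sit next to the prefix as '<prefix>-<n>.png'.
    pages = prefix.parent.glob(prefix.name + "-*.png")
    return sorted(pages, key=page_number)
```

Calling rasterize("input.pdf", "output-prefix") would then give an ordered list of page images to pass to the tool; how the tool itself expects them (directory, glob, or explicit list) is covered in the README.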