Benchmarking the Most Reliable Document Parsing API
Posted about 2 months ago · Active about 2 months ago
tensorlake.ai · Tech · story
Sentiment: skeptical / mixed
Debate: 60/100
Key topics
- Document Parsing
- API Benchmarking
- OCR
The post benchmarks document parsing APIs, but the discussion is dominated by skepticism about the methodology and suggestions for alternative tools.
Snapshot generated from the HN discussion
Discussion Activity
Moderate engagementFirst comment
45m
Peak period
10
1-2h
Avg / period
3.5
Comment distribution14 data points
Loading chart...
Based on 14 loaded comments
Key moments
1. Story posted: Nov 6, 2025 at 1:12 PM EST (about 2 months ago)
2. First comment: Nov 6, 2025 at 1:58 PM EST (45m after posting)
3. Peak activity: 10 comments in the 1-2h window, the hottest stretch of the conversation
4. Latest activity: Nov 6, 2025 at 4:24 PM EST (about 2 months ago)
ID: 45838365 · Type: story · Last synced: 11/20/2025, 1:20:52 PM
On Gemini and other VLMs: we excluded these models because they don't do visual grounding, i.e., they don't provide page layouts or bounding boxes for elements on the page. This is a table-stakes feature for the use cases customers are building with Tensorlake; it wouldn't be possible to build citations without bounding boxes.
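To make the citations point concrete, here is a minimal sketch of how grounded output enables them. The `PageElement` structure and `cite` helper are hypothetical illustrations, not Tensorlake's actual response schema; any parser that returns text plus per-element bounding boxes would support the same pattern.

```python
from dataclasses import dataclass

@dataclass
class PageElement:
    """One layout element from a parser with visual grounding (hypothetical schema)."""
    page: int
    text: str
    bbox: tuple[float, float, float, float]  # (x0, y0, x1, y1) in page coordinates

def cite(answer_snippet: str, elements: list[PageElement]) -> list[PageElement]:
    """Return the grounded elements whose text supports a generated answer,
    so a UI can highlight the exact page regions as citations."""
    return [el for el in elements if answer_snippet in el.text]

# Usage: each match carries a page number and a box to highlight.
elements = [
    PageElement(page=3, text="Net revenue was $12.4M in Q3.", bbox=(72.0, 140.5, 412.3, 158.0)),
    PageElement(page=4, text="Headcount grew 8% quarter over quarter.", bbox=(72.0, 96.0, 398.7, 114.2)),
]
for el in cite("Net revenue was $12.4M", elements):
    print(f"cite page {el.page}, bbox {el.bbox}")
```

A plain-text-only model can tell you *what* it read but not *where*, which is why the comment treats bounding boxes as a hard requirement rather than a nice-to-have.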
On pricing: we are probably the only company offering pure on-demand pricing without any tiers. With Tensorlake, you get back markdown for every page, summaries of figures, tables, and charts, structured data, page classification, etc., in ONE API call. This means we are running a bunch of different models under the hood. If you add up the token count and the complexity of the infrastructure needed to build a comparable pipeline around Gemini plus other OCR/layout-detection models, I bet the price you end up with won't be any cheaper than what we provide :) Plus, doing this at scale is very, very complex; it requires building a lot of sophisticated infrastructure, which is another source of cost behind modern document ingestion services.
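As a rough illustration of the cost argument, here is a back-of-envelope sketch comparing the model cost of a DIY pipeline with a bundled per-page price. Every rate and token count below is a placeholder assumption, not a quoted price from Google, Tensorlake, or anyone else; note that the DIY figure covers model usage only, while the infrastructure and engineering effort the comment emphasizes sits entirely outside it.

```python
# Back-of-envelope comparison: DIY pipeline around a VLM vs. a bundled
# per-page document-parsing API. All numbers are placeholder assumptions.

PAGES = 10_000

# DIY pipeline: one VLM pass for markdown plus extra passes for
# figure/table summaries, plus a separate OCR/layout-detection step
# to get bounding boxes.
vlm_tokens_per_page = 1_500       # assumed input+output tokens per page
vlm_price_per_mtok = 0.50         # assumed $/1M tokens
extra_summary_passes = 2          # assumed additional passes for figures/tables
ocr_layout_per_page = 0.0015      # assumed $/page for OCR + layout model

diy_model_cost = PAGES * (
    vlm_tokens_per_page * (1 + extra_summary_passes) * vlm_price_per_mtok / 1e6
    + ocr_layout_per_page
)

# Bundled service: one call per page covering all of the above.
bundled_per_page = 0.01           # assumed $/page
bundled_cost = PAGES * bundled_per_page

print(f"DIY model cost only: ${diy_model_cost:,.2f}")  # excludes infra + eng time
print(f"Bundled API:         ${bundled_cost:,.2f}")
```

Under these made-up rates the raw model spend can look cheaper for DIY; the comment's claim is that once you add the pipeline, retries, scaling, and maintenance that the second line never counts, the gap closes or reverses.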
It seems like such a crowded space, with many tools doing document extraction. I wonder if there's anything in particular pulling more attention into it?