Pulse
Key Features
Tech Stack
I'm glad you found a solution that worked for you, but this is pretty surprising to hear - our new model, chandra, saturates handwriting-heavy benchmarks like this one - https://www.datalab.to/blog/saturating-the-olmocr-benchmark , and our production models are more performant than OSS.
Did you test some time ago? We've made a bunch of updates in the last couple of months. Happy to issue some credits if you ever want to try again - vik@datalab.to.
I paste screenshots into Claude Code every day and it's incredible. As in, I can't believe how good it is. I send a screenshot of console logs, a UI and some HTML elements and it just "gets it".
So saying they "Suck" makes me not take your opinion seriously.
The parsed Markdown displays the following, despite the PDF having "Large accelerated filer" as the checked option:
`Large accelerated filer Non-accelerated filer Accelerated filer Smaller reporting company `
I am already seeing this trend in the recent releases of the native models (such as Opus 4.5, Gemini 3, and especially Gemini 3 Flash).
It's only going to get better from here.
Another thing to note is that there are over 5 startups in the YC portfolio right now doing the same thing and going after a similar or overlapping target market, if I remember correctly.
That, plus the ability to stitch together data extraction and business logic such as reconciliations for vendor payments or sales.
I think both these reasons are what's keeping all the OCR based companies going.
My only advice would be to figure out more USPs before native models eat your lunch. Congrats on the launch.
I've got one! The PDF of this out-of-print book is terrible: https://archive.org/details/oneononeconversa0000simo. The text is unreadably faint, and the underlying text layer is full of errors, so copy-paste is almost useless. Can your software clean it up?
(I'll email you a copy of the PDF for convenience since the Internet Archive's copy is behind their notorious lending wall)
If anyone is interested in the history of the family therapy movement—that is, the movement that started in the 1950s where psychotherapists started working with entire families rather than individual clients—this is a great book of interviews and incredibly readable.
From the chapter above, Jay Haley on Milton Erickson:
But, you know, the real tragedy with Erickson was he spent so much time over the years teaching hypnosis when he had a whole new school of thera- py to offer. People did not recognize the significance of his work until he was too old to really demon- Strate it
(I left in a couple of text glitches there...at least it's readable now!)
Who are your main competitors? Is Docuware one of them? Just asking because I would recommend using a tool like bloomberry to find companies that just started using or churned from document management tools like it: https://bloomberry.com/data/docuware/
https://news.ycombinator.com/item?id=42443022
I found that at the time no LLM was able to properly organize the text and understand the footnote structure, but non-AI OCR works very well, and restructuring (with some manual input) is largely feasible. Would be interested in what you can do with those footnotes (including, for good measure, footnotes-within-footnotes).
Regarding feeding text to LLMs, it seems they are often able to make sense of text when the layout follows the original, which means the OCR phase doesn't necessarily need to properly understand the structure of the source: rendering the text in a proper layout can be sufficient.
I worked on setting up a service that would do just that, but in the end didn't go live with it; but here's the examples page to show what I mean:
https://preview.adgent.com/#examples
This approach is very straightforward and fails rarely.
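To make the layout idea above concrete, here is a minimal sketch of how OCR word boxes could be rendered onto a character grid so the plain text roughly mirrors the page. This is not the code behind the service mentioned above. The input shape (text, left x, top y), the function name `render_layout`, and the default `char_width` and `line_height` values are all assumptions for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical input shape: (text, left x, top y) in page units, e.g. pixels.
Word = Tuple[str, float, float]


def render_layout(words: List[Word],
                  char_width: float = 6.0,
                  line_height: float = 12.0) -> str:
    """Place each OCR word on a character grid that approximates the page."""
    rows: Dict[int, List[Tuple[float, str]]] = {}
    for text, x, y in words:
        # Snap the vertical position to a text row.
        rows.setdefault(round(y / line_height), []).append((x, text))
    if not rows:
        return ""

    lines = []
    # Walk every row index so vertical gaps become blank lines.
    for row in range(min(rows), max(rows) + 1):
        line = ""
        for x, text in sorted(rows.get(row, [])):
            col = round(x / char_width)
            # If the target column is already occupied, keep one space instead.
            if line and col <= len(line):
                col = len(line) + 1
            line = line.ljust(col) + text
        lines.append(line)
    return "\n".join(lines)


# Made-up sample data: a two-column table fragment.
sample = [("Name", 0, 0), ("Total", 120, 0), ("Widget", 0, 12), ("42", 120, 12)]
rendered = render_layout(sample)
print(rendered)
```

The columns line up in the output even though nothing in the code knows that the input is a table, which is the point of the approach: the layout itself carries the structure, and the LLM reading the text does the interpretation.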
I guess I should thank you for saving my time? Plenty of others in this space.