
I want to rejoice that OCR is now a "solved" problem, but I feel like hallucinations are just as problematic as the kind of stuff I have to put up with from Tesseract: both require careful manual proofreading for an acceptable degree of confidence. I guess I'll have to try it and see for myself just how much better these solutions are for my public domain archive.org Latin-language reader & textbook projects.


It depends on your use case. For mine, I'm mining millions of scanned PDF pages to get approximate short summaries of long documents. The occasional hallucination won't damage the project. I realize I'm an outlier, and I would obviously prefer a solution that was as accurate as possible.


Possibly doing both & diffing the output to spot contested bits?


That’s my current idea: use two different OCR models and diff the results to spot-check for errors. At these prices, why not?
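
For what it's worth, the diffing step can be sketched with nothing but Python's standard `difflib`. This is only a rough illustration, assuming you already have the two OCR outputs for a page as plain strings; `contested_spans` is a made-up helper name, and the Latin sample is just illustrative input, not output from any real engine.

```python
import difflib


def contested_spans(text_a: str, text_b: str) -> list[tuple[str, str]]:
    """Return (words_from_a, words_from_b) pairs where the two outputs disagree.

    Hypothetical helper: compares word by word, so differences in
    whitespace or line breaks are ignored.
    """
    words_a = text_a.split()
    words_b = text_b.split()
    # autojunk=False turns off difflib's heuristic that treats very
    # frequent tokens as junk on long inputs.
    matcher = difflib.SequenceMatcher(a=words_a, b=words_b, autojunk=False)
    spans = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # One side may be empty when a word is present in only one output.
            spans.append((" ".join(words_a[i1:i2]), " ".join(words_b[j1:j2])))
    return spans


if __name__ == "__main__":
    # Illustrative strings standing in for the output of two OCR engines.
    page_a = "Gallia est omnis divisa in partes tres"
    page_b = "Gallia est omnis diuisa in partes tres"
    for a_side, b_side in contested_spans(page_a, page_b):
        print(f"A: {a_side!r}  B: {b_side!r}")
```

Only the flagged spans would need a human look. The obvious caveat is that agreement between two engines doesn't prove the text is right, since both can make the same mistake, so this narrows the proofreading rather than replacing it.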



