
I use Google Lens to OCR 15th-century Latin books, then paste the text into ChatGPT and ask it to correct the OCR errors. In spot checks, it is very reliable.

Then translation can occur.




Yes, the dream is to fully automate the entire pipeline, let it loose on a massive collection of scanned manuscripts, and come back in a couple of days to perfect Markdown-formatted copies. I wish they would run my project on all the books on Archive.org, because the current OCR output is generally not usable.
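
For anyone curious what the automated version might look like, here is a rough sketch of the OCR, correct, translate flow described in this thread. It is not the actual project's code. The stage functions (`ocr`, `correct`, `translate`) are hypothetical placeholders passed in by the caller, so no particular OCR engine or LLM API is assumed, and the `*.png` page naming and the `##` page headings are also assumptions.

```python
from pathlib import Path
from typing import Callable

# Hypothetical stage signatures: plug in whatever OCR engine and LLM you use.
OcrFn = Callable[[Path], str]   # page image -> raw OCR text
TextFn = Callable[[str], str]   # text -> corrected or translated text


def process_page(image_path: Path, ocr: OcrFn, correct: TextFn, translate: TextFn) -> str:
    """Run one scanned page through OCR, error correction, then translation."""
    raw = ocr(image_path)
    cleaned = correct(raw)
    translated = translate(cleaned)
    # One Markdown section per page, titled with the file name (an assumption).
    return f"## {image_path.stem}\n\n{translated}\n"


def process_book(scan_dir: Path, out_file: Path,
                 ocr: OcrFn, correct: TextFn, translate: TextFn) -> int:
    """Process every *.png page in scan_dir in file-name order; return the page count."""
    pages = sorted(scan_dir.glob("*.png"))
    sections = [process_page(p, ocr, correct, translate) for p in pages]
    out_file.write_text("\n".join(sections), encoding="utf-8")
    return len(pages)
```

Passing the stages in as plain functions keeps the loop independent of any one tool, so the OCR or the correction model can be swapped and spot checked separately.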



