"real improvements came from adjusting the prompts to make things clearer for th... | Hacker News

kbyatnal 9 months ago | on: Show HN: LLM-aided OCR – Correcting Tesseract OCR ...

"real improvements came from adjusting the prompts to make things clearer for the model, and not asking the model to do too much in a single pass"

This is spot on, and it's the same as how humans behave. If you give a human too many instructions at once, they won't follow all of them accurately.
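To illustrate the "one task per pass" idea from the quote, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not the method used by the project being discussed: `call_llm` is a hypothetical function you would supply to wrap whatever model API you use, and the prompt wording in `PASSES` is made up for this example.

```python
from typing import Callable, List

# Hypothetical single-purpose prompts; the wording is illustrative only.
PASSES: List[str] = [
    "Fix OCR character errors in the text below. Do not change anything else.\n\n{text}",
    "Rejoin words and paragraphs broken across lines in the text below. "
    "Do not change the wording.\n\n{text}",
]


def run_passes(
    text: str,
    call_llm: Callable[[str], str],  # assumed: takes a prompt, returns the model's reply
    passes: List[str] = PASSES,
) -> str:
    """Run one narrow instruction per call, feeding each output into the next pass."""
    for template in passes:
        text = call_llm(template.format(text=text))
    return text
```

Each call carries a single instruction, so a failure in one pass is easier to spot and the prompt for that pass can be tuned without touching the others.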

I spend a lot of time thinking about LLMs + documents, and in my opinion, as the models get better, OCR will soon be a fully solved problem. The challenge then becomes explaining the ambiguity and intricacies of complex documents to AI models effectively, rather than the OCR capability itself.

Disclaimer: I run an LLM document processing company called Extend (https://www.extend.app/).

saaaaaam 9 months ago

Extend looks great, and your real estate play is very interesting. I’ve been playing around with extracting key terms from residential leasehold (condominium-type) agreements. Interested to know if you’re doing this sort of thing?

sumedh 9 months ago

Is there a pricing page?
