Hacker News new | past | comments | ask | show | jobs | submit login

"real improvements came from adjusting the prompts to make things clearer for the model, and not asking the model to do too much in a single pass"

This is spot on, and it's the same as how humans behave. If you give a human too many instructions at once, they won't follow all of them accurately.

I spend a lot of time thinking about LLMs + documents, and in my opinion, as the models get better, OCR is soon going to be a fully solved problem. The challenge then becomes explaining the ambiguity and intricacies of complex documents to AI models in an effective way, less so about the OCR capabilities itself.

disclaimer: I run a LLM document processing company called Extend (https://www.extend.app/).




Extend looks great - and your real estate play is very interesting. I’ve been playing around extracting key terms from residential leasehold (condominium-type) agreements. Interested to know if you’re doing this sort of thing?


Is there a pricing page?




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: