Having tried this in the past, I've found it works pretty well 90% of the time. However, there are still some areas where it will struggle.
Imagine you are trying to read a lease contract. The two areas where the LLM may be useless are numbers and names (names of people or places/addresses). There’s no way for your LLM to accurately know what the rent should be, or to know the name of a specific person.
Agreed, this should not be used for anything mission-critical unless you're going to sit there and carefully review the output by hand (although that is still going to be 100x faster than trying to manually correct the raw OCR output).
Where it's most useful to me personally is when I want to read some old book from the 1800s about the history of the Royal Navy [0] or something like that which is going to look really bad on my Kindle Oasis as a PDF, and the OCR version available from Archive.org is totally unreadable because there are 50 typos on each page. The ability to get a nice Markdown file that I can turn into an epub and read natively is really nice, and now cheap and fast.
If there are 30 fields on a document at 90% accuracy, each field still needs to be validated by a human, because you can't trust that any given one is correct. So the O(n) human step of checking each field is still there, and for fields that are long, pseudo-random-looking strings (think account numbers, numbers on invoices and receipts, instrumentation measurement values, etc.) there is almost no time savings, because the mental effort to input something like 015729042 is about the same as verifying it is correct.
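To put rough numbers on the 30-fields-at-90% case, here is a quick back-of-the-envelope sketch. It assumes each field is right or wrong independently of the others, which is a simplification (real OCR errors tend to cluster), and the function names are just made up for illustration.

```python
def p_all_correct(n_fields: int, accuracy: float) -> float:
    """Chance that every field is correct, assuming independent errors."""
    return accuracy ** n_fields


def expected_errors(n_fields: int, accuracy: float) -> float:
    """Average number of wrong fields per document."""
    return n_fields * (1 - accuracy)


if __name__ == "__main__":
    n, acc = 30, 0.90
    print(f"P(all {n} fields correct) = {p_all_correct(n, acc):.3f}")
    print(f"Expected wrong fields     = {expected_errors(n, acc):.1f}")
```

Under that assumption, only about 4% of documents come out fully correct and you'd expect roughly 3 bad fields per document, without knowing which ones, which is why every field still has to be checked.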
Let's say you're OCRing a contract. Odds are good that almost every part of the contract is there for an important reason, though it may not matter to you. How many errors can you tolerate in the terms of a contract that governs, e.g., your home, or the car you drive to work, or your health insurance coverage? Do you want to take a gamble on those terms that could - in the worst case - result in getting kicked out of your apartment or having to pay a massive medical bill yourself?
The important question is which parts are inaccurate. If it's messing up names and numbers but is 99.9% accurate for everything else, you can just go back and check all the names and numbers at the end. But if the whole thing is only 90% accurate, you now either recheck the whole document or you risk a 'must' turning into a 'may' in a critical place that undermines the whole document.
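For the "go back and check all the names and numbers" approach, something like the following could pull out the spans worth a second look. This is only a rough sketch: the regexes are crude heuristics of my own (digit runs, and two or more capitalized words in a row as a stand-in for names), and `spans_to_review` is a hypothetical helper, not part of any OCR tool.

```python
import re

# Digit runs, optionally joined by common separators (1,250 or 2024-01-01).
NUMBER = re.compile(r"\d+(?:[,./-]\d+)*")
# Crude name heuristic: two or more consecutive capitalized words.
NAME = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+\b")


def spans_to_review(text: str) -> list[str]:
    """Return number-like and name-like spans in document order."""
    hits = [(m.start(), m.group()) for m in NUMBER.finditer(text)]
    hits += [(m.start(), m.group()) for m in NAME.finditer(text)]
    return [span for _, span in sorted(hits)]


if __name__ == "__main__":
    sample = "The tenant John Smith shall pay 1,250 per month starting 2024-01-01."
    for span in spans_to_review(sample):
        print(span)
```

It will miss single-word names and flag some capitalized phrases that aren't names, so it narrows the review rather than replacing it.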