This sounds way, way outside of how LLMs work. They can't count the R's in strarwberrrrrry, but they can cross-reference multiple tables of data? Is there something else going on here?
Accurately check: lol no chance at all, completely agreed.
Detect deviations from common patterns that might indicate a mistake, the kind of thing that common review feedback tends to point out: actually I think that fits moderately well.
Are they accurate enough to use in bulk? ... given their accuracy with code bugs, I'm inclined to say "probably not", except by people already knowledgeable in the content. Those people can generally reject false positives without a lot of effort.