Don't forget that this is driven by present-day AI. Which means people will assume that it's checking for fraud and incorrect logic, when actually it's checking for self-consistency and consistency with training data. So it should be great for typos, misleading phrasing, and cross-checking facts and diagrams, but I would expect it to do little for manufactured data, plausible but incorrect conclusions, and garden-variety bullshit (claiming that X follows from Y, when all you really have is a reasonable-sounding argument that Y ought to imply X).
Not all of that is out of reach. Making the AI evaluate a paper in the context of a cluster of related papers might enable spotting some "too good to be true" things.
Hey, here's an idea: use AI for mapping out the influence of papers that were later retracted (whether for fraud or error, it doesn't matter). Not just via citation, but have it try to identify the no-longer-supported conclusions from a retracted paper, and see where they show up in downstream papers. (Cheap "downstream" is when a paper, or any paper in a family of papers by the same team, ever cited the upstream paper, even in preprints. More expensive downstream is doing it without citations.)
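To make the cheap version concrete, here is a minimal sketch of the citation-only pass, assuming you already have a citation graph as plain Python dicts (paper id mapping to the ids of papers that cite it); all names and the toy graph here are hypothetical, and the expensive step of actually checking whether a downstream paper relies on the retracted conclusions would sit on top of this candidate list:

```python
from collections import deque

def downstream_of(retracted_id, cited_by):
    """Breadth-first walk over the citation graph: every paper reachable
    through chains of citations from the retracted paper is 'downstream'
    in the cheap sense (it, or something it builds on, cited the upstream)."""
    seen = set()
    queue = deque([retracted_id])
    while queue:
        paper = queue.popleft()
        for citer in cited_by.get(paper, []):
            if citer not in seen:
                seen.add(citer)
                queue.append(citer)
    return seen

# Toy graph: R is the retracted paper, A cites R, B cites A, C is unrelated.
cited_by = {"R": ["A"], "A": ["B"], "B": [], "C": []}
print(downstream_of("R", cited_by))  # {'A', 'B'}
```

The output is only a pool of candidates; the AI's job would then be to look at each one and decide whether it actually leans on the retracted conclusions or merely cites the paper in passing.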
> people will assume that it's checking for fraud and incorrect logic, when actually it's checking for self-consistency and consistency with training data.
> Are you actually claiming with a straight face that not a single human can check for fraud or incorrect logic?
No, of course not; I was pointing out that humans largely check "for self-consistency and consistency with training data" as well. Our checking of the coherence of other people's work is presumably an extension of this.
Regardless, computers already check for fraud and incorrect logic as well, albeit in different contexts. Neither humans nor computers can do this with general competency, i.e. without specific training to do so.
To be fair, at least humans get to have collaborators from multiple perspectives and skillsets; a lot of the discussion about AI in research has assumed that a research team is one hive mind, when the best collaborations aren’t.
There are some patterns in natural data, and often also some in modified or fabricated data, that one can check against. None of these checks is perfect, of course, but they can at least raise suspicion.
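One classic example of such a pattern (not mentioned above, just the usual illustration) is Benford's law for leading digits. A toy sketch, nowhere near a proper statistical test, and using made-up numbers rather than any real paper's data:

```python
import math
from collections import Counter

def first_digit_counts(values):
    """Count the leading non-zero digit of each value's magnitude."""
    digits = [int(str(abs(v)).lstrip("0.")[0]) for v in values if v != 0]
    return Counter(digits)

def benford_expected(n):
    """Expected counts under Benford's law: P(d) = log10(1 + 1/d)."""
    return {d: n * math.log10(1 + 1 / d) for d in range(1, 10)}

# Made-up toy data purely for illustration.
data = [1.2, 19.4, 1.07, 3.3, 1.8, 2.5, 110.0, 1.4, 9.1, 1.6]
observed = first_digit_counts(data)
expected = benford_expected(len(data))
for d in range(1, 10):
    print(d, observed.get(d, 0), round(expected[d], 2))
```

A large, systematic gap between the observed and expected columns doesn't prove anything, but it is the kind of signal that justifies a closer look.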