This black spatula case was quite famous and was all over the internet. Is it possible that the AI is merely detecting an error that was already discussed in its training data?
I found many sources on the internet stating that the error in the black spatula paper was discovered by Joe Schwarcz of McGill University while reviewing the research paper.
Some things to note: this didn't even require a complex multi-agent pipeline. Single-shot prompting was able to detect these errors.
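To make "single-shot prompting" concrete, here is a minimal sketch of what such a one-call error check could look like. The prompt wording and the helper name `build_error_check_prompt` are my own illustration, not the actual prompt used in the black spatula project:

```python
# Hypothetical sketch: a single-shot error-checking prompt. The wording below
# is an assumption for illustration, not the project's actual prompt.

def build_error_check_prompt(paper_text: str) -> str:
    """Wrap a paper's text in a one-shot instruction asking an LLM to flag errors."""
    return (
        "You are a careful scientific reviewer. Read the paper below and list any "
        "mathematical, statistical, or unit-conversion errors you find, quoting the "
        "exact passage for each.\n\n"
        f"--- PAPER ---\n{paper_text}\n--- END PAPER ---"
    )

# The resulting string would be sent as-is to any chat-style LLM in a single
# call: no agents, no tools, no multi-step orchestration.
prompt = build_error_check_prompt("Exposure estimate: 34,700 ng/kg-bw/day ...")
print(prompt)
```

The point is that the entire "pipeline" is one string and one model call, which is why the result is notable.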