I dunno about that in this case. The "confidently incorrect" problem seems inherent to the underlying algorithm to me. If it were solved, I suspect that would be a paradigm shift of the sort that happens on a timescale of years, at best.
Yes, the "confidently incorrect" issue will be a tough nut to crack for the current spate of generative text models. LLMs have no ability to analyze a body of text and determine anything about it (e.g., how likely it is to be true); they are clever, but at bottom they can only extrapolate from patterns found in the training data. If no one has said anything like "X, and I'm 78% certain about it", it's tough to imagine how an LLM could generate reasonably accurate probability estimates for its own claims.
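To make the distinction concrete, here's a minimal sketch (assuming the Hugging Face transformers library and GPT-2, neither of which is mentioned above) of the only kind of "probability" a plain causal LM natively produces: per-token sequence likelihoods. Those numbers say how plausible the wording is under the training distribution, not how likely the claim is to be true.

```python
# Sketch: per-token log-probabilities from a causal LM (GPT-2 via the
# Hugging Face transformers library). The score measures how plausible the
# *word sequence* is under the training distribution, not whether the
# statement is factually correct.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

text = "The Eiffel Tower is located in Berlin."
input_ids = tokenizer(text, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(input_ids).logits  # shape: (1, seq_len, vocab_size)

# Log-probability of each token given the tokens before it (shift by one).
log_probs = torch.log_softmax(logits[:, :-1, :], dim=-1)
token_log_probs = log_probs.gather(
    dim=-1, index=input_ids[:, 1:].unsqueeze(-1)
).squeeze(-1)

print("sequence log-prob:", token_log_probs.sum().item())
# A fluent false sentence can score higher than an awkwardly phrased true
# one, which is why these scores are not calibrated truth estimates.
```

A fluency score like this is the raw material the model works with; turning it into a trustworthy "I'm 78% sure" would require something beyond pattern extrapolation.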