My theory is that since you can tune how agreeable a model is, but you can't make it more correct so easily, making a model that agrees with the user ends up being less likely to result in the model being confidently wrong and berating users.
After all, if it's corrected wrongly by a user and acquiesces, well that's just user error. If it's corrected rightly and keeps insisting on something obviously wrong or stupid, it's OpenAI's error. You can't twist a correctness knob but you can twist an agreeableness one, so that's the one they play with.
(also I suspect it makes it seem a bit smarter than it really is, by smoothing over the times it makes mistakes)