I would consider this a case of "expectation management"-based versioning. This ...

jstummbillig · 2025-03-25T18:47:40 1742928460

I think it's reasonable. The development process is just not really comparable to other software engineering: It's fairly clear that currently nobody really has a good grasp on what a model will be while they are being trained. But they do have expectations. So you do the training, and then you assign the increment to align the two.

8n4vidtmkvmk · 2025-03-26T05:27:45 1742966865

I figured you don't update the major unless you significantly change the... algorithm, for lack of a better word. At least I assume something major changed between how they trained ChatGPT 3 vs GPT 4, other than amount of data. But maybe I'm wrong.

eru · 2025-03-26T06:48:24 1742971704

The number is purely for marketing.

If you could get much better performance without changing the algorithm (eg just by scaling), you'd still bump the number.

KoolKat23 · 2025-03-25T20:04:15 1742933055

Funnily enough, from early indications (user feedback) this new model would've been worthy of the 3.0 moniker, despite what the benchmarks say.