This model struggles with reasoning tasks Opus does wonderfully with.
A cheaper GPT-4 that's this good? Neat, I guess.
But if this is stealthily OpenAI's next major release, then it's clear their current alignment and optimization approaches are getting in the way of higher-level reasoning, to such a degree that they are about to be unseated at the top of the market for the foreseeable future.
To me, it seemed a bit better than GPT-4 at some coding tasks, or at least less inclined to just give the skeleton and leave out all the gnarly details, like GPT-4 likes to do these days. What frustrates me a bit is that I cannot really say whether GPT-4, as it was in the very beginning when it happily executed even complicated and/or large requests for code, wasn't actually on the same level as this model, maybe not in terms of raw knowledge, but at least in terms of usefulness/cooperativeness.
This aside, I agree with you that it does not feel like a leap, more like 4.x.
If you used GPT-4 from the beginning, you'll remember that the quality of the responses was incredibly high. It also took 3 minutes to receive a full response.
And prompt engineering tricks could get you wildly different outputs to a prompt.
Using the 3xx ChatGPT4 model from the API doesn't hold a candle to the responses from back then.
I hear this somewhat often from people (less so nowadays), but before-and-after prompt examples are never provided. Do you have some example responses saved from the olden days, by chance? It would be quite easy to demonstrate your point if you did.
Perhaps an open-source GPT-3.5/4? I remember OpenAI had that planned. If so, it would make sense for them to push alignment harder than with their closed models.
I'm seeing a big leap in performance for coding problems. Same feeling as GPT-3.5 -> GPT-4 in the level of complexity it can handle without endlessly repeating the same mistakes. Inference is slow. Would not be surprised if this was GPT-4.5 or GPT-5.
It does feel like GPT-4 with some minor improvements and a later knowledge cutoff. When you ask it, it also says that it is based on the GPT-4 architecture, so I doubt it's an entirely new model that would be called GPT-5.
(Though personally, I just think it's not GPT-5.)