Gemini was also "use us through this weird interface, and also you can't if you're in the EU"; that, plus being far behind OpenAI and Anthropic for the past year, means they failed to reach notoriety, partly because of their own choices.
There's a lot of mental inertia combined with an extremely fast-moving market. Google was behind in the AI race in 2023 and a good chunk of 2024. But they largely caught up with Gemini 1.5, especially the 002 release version. Now with Gemini 2 they are every bit as much of a frontier model player as OpenAI and Anthropic, and even ahead of them in a few areas. 2025 will be an interesting year for AI.
I can tell you why I just stopped using Gemini yesterday.
I was interested in getting simple summary data on the outcome of the recent US election and asked for an approximate breakdown of voting choices as a function of voter age brackets.
Gemini adamantly refused to provide these data. I asked the question four different ways. You would think voting outcomes were right up there with Tiananmen Square.
ChatGPT and Claude were happy to give me approximate breakdowns.
What I found interesting is that the patterns of voting by age are not all that different from Nixon-Humphrey-Wallace in 1968.
Gemini's guardrails are unnecessarily strict. As you mentioned, there's a topical restriction on election-related content, and another where it outright refuses to process images containing anything resembling a face. I initially thought Copilot was bad in this regard—it also censors election-related questions to some extent, but not as aggressively as Gemini. However, Gemini's defensiveness on certain topics is almost comical. That said, I still find it to be quite a capable model overall.
It was far behind. That's what I kept hearing on the Internet until maybe a couple weeks ago, and it didn't seem like a controversial view. Not that I cared much - I couldn't access it anyway because I am in the EU, which is my main point here: it seems that they've improved recently, but at that point, hardly anyone here paid it any attention.
Now, as we can finally access it, Google has a chance to get back into the race.
It varies a lot for me. One day it takes scattered documents, pasted in, and produces a flawless summary I can use to organize it all. The next, it barely manages a paragraph even with detailed input. It does seem like Google is quick to respond to feedback. I never seem to run into the same problem twice.
> It does seem like Google is quick to respond to feedback.
I'm puzzled as to how that would work, when people talk about quick changes in model behavior. What exactly is being adjusted? The model has already been trained. I would think it's just randomness.
The big platforms also seem to employ an intermediate step where they rewrite your prompt. I've downloaded my ChatGPT data and found substantial changes from what I wrote. Usually for the better. Changes to the way it rewrites prompts change the results.
System prompts have a huge impact on output. Prompts for ChatGPT/etc are around a thousand words, with examples of what to do and what not to do. Minor adjustments there can make a big difference.
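To make the mechanism concrete: a minimal sketch of how a chat request might be assembled, with a system prompt prepended and an intermediate rewrite step applied to the user's text before anything reaches the model. The system prompt wording and the rewrite rule here are hypothetical stand-ins, not any vendor's actual prompts; tweaking either changes the output without retraining anything.

```python
# Hypothetical system prompt -- vendors' real ones run ~1000 words,
# with examples of desired and undesired behavior.
SYSTEM_PROMPT = (
    "You are a helpful assistant. Answer factual questions concisely "
    "and give approximate figures when exact data is unavailable."
)

def rewrite_prompt(user_prompt: str) -> str:
    """Toy stand-in for the intermediate rewrite step some platforms apply."""
    return user_prompt.strip().rstrip("?") + "? Please cite approximate figures."

def build_messages(user_prompt: str) -> list[dict]:
    """Assemble the message list actually sent to the model."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": rewrite_prompt(user_prompt)},
    ]

msgs = build_messages("How did voters break down by age bracket?")
print(msgs[0]["role"])    # system
print(msgs[1]["content"])
```

The point is that both pieces sit outside the trained weights, so the provider can adjust them day to day, which would explain behavior shifting quickly in response to feedback.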
The $200 version? It's interesting that it exists, but for normal users it may as well... not. I mean, pro is effectively not a consumer product and I'd just exclude it from comparison of available models until you can pay for a single query.