
The OpenAI-compatible API is missing important parameters; for example, I don't think there is a way to disable Flash 2.0's thinking with it.

Vertex AI is for gRPC, service auth, and region control (amongst other things): ensuring data remains in a specific region, allowing you to auth with the instance's service account, and getting slightly better latency and TTFT (time to first token).






I find Google's service auth SO hard to figure out. I've been meaning to solve deploying to Cloud Run with service accounts for several years now, but it just doesn't fit in my brain well enough for me to make the switch.

simonw, 'Google's service auth SO hard to figure out' – absolutely hear you. We're taking this feedback on auth complexity seriously. We have a new Vertex express mode in preview (https://cloud.google.com/vertex-ai/generative-ai/docs/start/... , not ready for primetime yet!) where you can sign up for the free tier and get an API key right away. We are improving the experience; if you would like to give feedback, please DM me at @chrischo_pm on X.

If you're on Cloud Run, it should just work automatically.

For deploying, on GitHub I just use a dedicated service account for CI/CD and put the JSON payload in an environment secret, like an API key. The only extra thing is that you need to copy it to the filesystem for some things to work, usually as a file named google_application_credentials.json.
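The pattern above can be sketched as a tiny helper that runs at the start of a CI job. The env-var name, file name, and function here are just illustrative conventions, not anything Google requires; the only real contract is that GOOGLE_APPLICATION_CREDENTIALS points at a valid key file:

```python
import json
import os
from pathlib import Path

def materialize_credentials(secret_env="GCP_SA_KEY",
                            dest="google_application_credentials.json"):
    """Copy a service-account JSON key from a CI secret (exposed as an
    environment variable) to a file, then point the standard
    GOOGLE_APPLICATION_CREDENTIALS variable at it so Google client
    libraries pick it up automatically."""
    payload = os.environ[secret_env]
    json.loads(payload)  # fail fast if the secret isn't valid JSON
    path = Path(dest)
    path.write_text(payload)
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = str(path.resolve())
    return path
```

After this runs, most Google SDKs find the credentials with no further configuration.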

If you use Cloud Build, you shouldn't need to do anything.


You should consider setting up Workload Identity Federation and authenticating to Google Cloud using your GitHub runner's OIDC token. Google Cloud will "trust" the token and allow you to impersonate service accounts. No static keys!
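For what it's worth, in a GitHub Actions workflow this is only a few lines using the google-github-actions/auth action. A minimal sketch (the project number, pool, provider, and service-account names are placeholders):

```yaml
permissions:
  contents: read
  id-token: write   # lets the runner mint an OIDC token

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: google-github-actions/auth@v2
        with:
          workload_identity_provider: projects/123456789/locations/global/workloadIdentityPools/github/providers/my-repo
          service_account: deployer@my-project.iam.gserviceaccount.com
      # Subsequent gcloud / SDK steps now run as the service account,
      # with no long-lived key stored anywhere.
```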

This does not work for many Google services, including Firebase.

Yes it does. We deploy Firebase and a bunch of other GCP things from GitHub Actions, and there are zero API keys or JSON credentials anywhere.

Everything is service accounts and Workload Identity Federation, with restrictions such as only letting the main branch in a specific repo use it (so there's no problem with unreviewed PRs getting production access).

Edit: if you have a specific error or issue where this doesn't work for you, and can share the code, I can have a look.


No thank you, there is zero benefit to migrating and no risk in using credentials the way I do.

How do you sign a Firebase custom auth token with Workload Identity Federation? How about a pre-signed storage URL? Off the top of my head, I think those were two things that didn't work.


You could post on Reddit asking for help and someone is likely to provide answers, an explanation, probably even some code or bash commands to illustrate.

And even if you don't ask, there are many examples. But I feel ya. The right example to fit your needs is hard to find.


GCP auth is terrible in general. This is something AWS did well.

I don't get that. How?

- There are principals (users, service accounts).

- Each one needs to authenticate in some way. There are options here: SAML or OIDC or Google Sign-In for users; other options for service accounts.

- Permissions guard the things you can do in Google Cloud.

- There are built-in roles that wrap up sets of permissions.

- You can create your own custom roles.

- Attach roles to principals to give them parcels of permissions.
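Wired together, that whole model is a couple of gcloud commands. A sketch with placeholder project and account names (roles/aiplatform.user is a real built-in role):

```shell
# Create a principal (a service account) for CI to use.
gcloud iam service-accounts create ci-deployer \
    --project=my-project

# Attach a built-in role to that principal, granting it the
# bundle of permissions the role wraps (here: Vertex AI usage).
gcloud projects add-iam-policy-binding my-project \
    --member="serviceAccount:ci-deployer@my-project.iam.gserviceaccount.com" \
    --role="roles/aiplatform.user"
```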


yeah bro just one more principal bro authenticate each one with SAML or OIDC or Google Signin bro set the permissions for each one make sure your service account has permissions aiplatform.models.get and aiplatform.models.list bro or make a custom role and attach the role to the principal to parcel the permission

It's not complicated in the context of huge enterprise applications, but for most people trying to use Google's LLMs, it's much more confusing than using an API key. The parent commenter is probably using an AWS secret key.

And FWIW, this is basically what Google encourages you to do with Firebase (with the admin service account credential as a secret key).


GCP auth is actually one of the things it does way better than AWS; it's just that the entire industry has been trained on AWS's bad practices...

From the linked docs:

> If you want to disable thinking, you can set the reasoning effort to "none".

For other APIs, you can set the thinking token budget to 0, and that also works.
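Side by side, the two approaches look roughly like this as request bodies. The field names follow Google's docs for the OpenAI-compatible endpoint (`reasoning_effort`) and the native REST API (`generationConfig.thinkingConfig.thinkingBudget`); the model name is just illustrative:

```python
import json

# OpenAI-compatible endpoint: disable thinking via reasoning_effort.
openai_style = {
    "model": "gemini-2.5-flash",
    "reasoning_effort": "none",
    "messages": [{"role": "user", "content": "Hello"}],
}

# Native REST API: the equivalent is a thinking budget of 0 tokens.
native_style = {
    "contents": [{"parts": [{"text": "Hello"}]}],
    "generationConfig": {"thinkingConfig": {"thinkingBudget": 0}},
}

body = json.dumps(openai_style)  # what actually goes over the wire
```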


Wow thanks I did not know

When I used the OpenAI-compatible stuff, my API calls just didn't work at all. I switched back to direct HTTP calls, which seem to be the only thing that works…
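For what it's worth, the direct HTTP path really is only a few lines. A stdlib-only sketch that builds (but doesn't send) a request against the public generateContent endpoint; the model name and helper function are illustrative:

```python
import json
import urllib.request

def build_generate_request(api_key, model="gemini-2.0-flash", prompt="Hello"):
    """Build a plain REST call to the Gemini generateContent
    endpoint -- no SDK involved."""
    url = (f"https://generativelanguage.googleapis.com/v1beta/"
           f"models/{model}:generateContent")
    body = json.dumps({"contents": [{"parts": [{"text": prompt}]}]})
    return urllib.request.Request(
        url,
        data=body.encode(),
        headers={"Content-Type": "application/json",
                 "x-goog-api-key": api_key},
        method="POST",
    )

# To actually call it:
#   req = build_generate_request(os.environ["GEMINI_API_KEY"])
#   resp = json.load(urllib.request.urlopen(req))
```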

We built the OpenAI-compatible API layer (https://cloud.google.com/vertex-ai/generative-ai/docs/multim...) to help customers that are already using the OpenAI library test out Gemini easily with basic inference, but not as a replacement for the genai SDK (https://github.com/googleapis/python-genai). We recommend using the genai SDK for working with Gemini.

So, to be clear, Google only supports Python as a language for accessing your models? Nothing else?

We have Python and Go in GA.

Java and JS are in preview (not ready for production) and will be GA soon!


What about providing an actual API people can call without needing to rely on Google SDKs?

yeah, 2 days to get the Google OAuth flow integrated into a background app/script, 1 day coding for the actual app ...

I got Claude to write me an auth layer using only Python's http.client and cryptography. One shot, no problem; now I can get a token from the service key any time, I just have to track expiration. Annoying that they don't follow the industry standard, though.

Is this Vertex AI-related or in general? I find Google's OAuth flow to be extremely well documented and easy to set up…

should have used AI to write the integrations...

that's with AI

As there are so many variations out there, the AI gets majorly confused. As a matter of fact, the Google OAuth part is the one thing that Gemini 2.5 Pro can't code.

should be its own benchmark


Maybe you should just read the docs and use the examples there. I have used all kinds of GCP services for many years, and auth is not remotely complicated, IMO.

JSON Schema support on Google's OpenAI-compatible API is very lackluster and limiting. My biggest gripe, really.


