
A modern GPT of any serious size outputs logits from a big-ass classifier over the token vocabulary. These logits live in a space where one can not only posit but empirically estimate a manifold with nontrivial convexity properties. Determining which LLM wrote a given piece of text is a well-posed, if not outright solved, problem (up to someone explicitly telling the model to write in a certain manner).
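A minimal sketch of the idea, under stated assumptions: given the logit vectors a model assigns at each position and the tokens actually observed, you can compute per-token log-probabilities and summarize them into features (mean log-prob, i.e. negative log-perplexity, and its variance) that a downstream classifier could use to attribute text to a model. The function names and feature choice here are illustrative, not any particular product's method.

```python
import numpy as np

def token_logprobs(logits, token_ids):
    """Log-probability of each observed token under the model's logits.

    logits: (seq_len, vocab_size) array of raw logits
    token_ids: (seq_len,) array of observed token indices
    """
    # Numerically stable log-softmax over the vocabulary axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    # Pick out the log-prob of the token that was actually emitted.
    return log_probs[np.arange(len(token_ids)), token_ids]

def detector_features(logits, token_ids):
    """Two toy attribution features: mean token log-prob (~ -log perplexity)
    and its variance (how 'bursty' the likelihoods are)."""
    lp = token_logprobs(logits, token_ids)
    return np.array([lp.mean(), lp.var()])
```

With features like these extracted under several candidate models, which-model-wrote-this becomes an ordinary supervised classification problem over that feature space.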

This was a problem that was not only studied but one in which fast, impressive progress was happening, until they just turned it off.

It’s a fucking gigantic business to be the best at this. And it’s exactly what a startup should be: unlikely to face a well-heeled incumbent competitor, not because well-heeled firms are ignoring the market, but because they actively don’t want it to exist.




Can you explain more about this and why it would be useful? From your description, it seems like a huge percentage of requests would alter the output enough to prevent specific LLM detection. Also, with so many new LLMs being trained on synthetic and generated data, I'd imagine that throws a wrench in things too.



