Hacker News new | past | comments | ask | show | jobs | submit login

Well, that was probably jailbreaking. That's not really prompt injection, but the problem of letting a model execute some but not all instructions, which could get bamboozled by things like roleplaying. In contrast to jailbreaking, proper prompt injection is Bing having access to websites or emails, which just means the website gets copied into its context window, giving the author of the website potential "root access" to your LLM. I think this is relatively well fixable with quote tokens and RL.



Consider applying for YC's Summer 2025 batch! Applications are open till May 13

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: