What has changed with CoT and high-compute inference is not yet clear. My point is that if it makes bare prompt injection harder for humans to pull off, then we shouldn't call it a fundamental limitation anymore.
Are LLMs nothing more than auto-regressive stochastic parrots? Perhaps not anymore, depending on test-time compute, native specialty tokens, etc.