This is super interesting. The author states "the strong version of Goodhart's law" as a fact but does not provide a theorem showing that it is true. This recent paper does the job.[0] The authors discuss Goodhart's law in the context of AI alignment, but they are clear that their theorem applies much more broadly:

> we provide necessary and sufficient conditions under which indefinitely optimizing for any incomplete proxy objective leads to arbitrarily low overall utility

> Our main result identifies conditions such that any misalignment is costly: starting from any initial state, optimizing any fixed incomplete proxy eventually leads the principal to be arbitrarily worse off.

[0]: https://proceedings.neurips.cc/paper/2020/hash/b607ba543ad05...
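To make the claim concrete, here is a minimal toy sketch (my own illustration, not the paper's construction): the principal's true utility depends on two attributes, but the proxy measures only one. Under a fixed effort budget with growing marginal cost, pushing the proxy ever higher eventually drives true utility below any bound.

```python
def true_utility(measured, unmeasured):
    # The principal cares about both attributes.
    return measured + unmeasured

def proxy(measured, unmeasured):
    # The proxy is incomplete: it ignores the unmeasured attribute.
    return measured

# Fixed effort budget: raising the measured attribute starves the
# unmeasured one, with quadratically growing cost (an assumed toy model).
budget = 100.0
for measured in [0, 5, 8, 10, 12]:
    unmeasured = budget - measured ** 2
    print(f"proxy={proxy(measured, unmeasured):6.1f}  "
          f"true utility={true_utility(measured, unmeasured):6.1f}")
```

The proxy rises monotonically across the sweep while true utility falls from 100.0 to negative values, which is the qualitative pattern the theorem formalizes: past some point, further proxy optimization makes the principal strictly worse off.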
