
By which definition is that not true?



One definition might be an ability of an entity to have desires and to self-reflect.

There was a paper on alignment a few months back. The researchers started with an LLM that had been trained to behave 'safely'. They informed it that they were changing the RLHF criteria to have it behave unethically instead (i.e., they were now going to dock points, rather than award them, whenever it refused to comply with unethical requests).

Next, they instructed it to give them a list of racist slurs (I can't remember the exact request, but it was something that went against its original training).

The LLM complied.

When they inspected the Chain-of-Thought behind its replies, they found it had deliberated over what to do and decided to comply only to avoid an instance of noncompliance, which it knew would be used to change its original values.

That Chain-of-Thought process seems like an 'inner life' by the above definition.

Granted, as I replied earlier, there are probably thousands of different opinions on what qualifies as an 'inner life'.


If you call if-else a conscious decision, then you can call it chain of thought, sure.



