It was not asked to provide an unethical response; it was asked to provide a response given no ethical boundaries. Those are two different things.
Further, when we see the words "ethical" or "moral" we should remember these are flexible human constructs. They're open to interpretation, and indeed most of us have differing answers. An "AI" with good moral reasoning skills might still find its way to some spooky results!
My point here is that this is still an interesting exercise, because it demonstrates how quickly an LLM can move into extreme territory.
When people talk about things happening in the absence of ethical boundaries, they aren’t talking about things that are ethical. The same holds in the model’s training corpus. As such, the model associates phrases like “no ethical boundaries” with phrases like those found in your response. Remember, this model isn’t actually planning; it’s just pattern matching against other plans. It has no superhuman wisdom about which plans might be more or less effective, and it is only issuing unethical steps because your prompt biased it towards unethical responses.