What is the current status of pushing "reasoning" down into latent/neural space? It seems like a waste of tokens to let a model converse with itself, especially since this internal monologue often has very little to do with the final output, so it isn't even useful as a log of how that output was derived.