loufe 4 days ago \| on: Xiaomi MiMo Reasoning Model

Is any RL done without unit testing? I would be surprised to hear that other model makers skip it, as that would imply a disregard for accuracy on their part. Perhaps this works for small, modular problems but not for a problem with a 200k-token input?