ainch 2 days ago | on: Xiaomi MiMo Reasoning Model

The technical report does go into a lot of depth about how they use RL, including their modified GRPO objective. As for the README, I imagine most people active in the field understand the implications of "RL" for a reasoning model.