
I wonder if the authors have tried incorporating error feedback from Lean into their models.

Work from 2023 [1] showed that general-purpose models did better when they were able to incorporate error feedback. Humans incorporate error feedback, but none of the SOTA models on miniF2F seem to.

[1]: https://arxiv.org/abs/2310.04353
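
For concreteness, this is the kind of loop I have in mind, as a minimal sketch. The propose and check callables are hypothetical stand-ins for a model API and a Lean checker; none of this is from [1] or any particular prover.

    from typing import Callable, Optional, Tuple

    # Minimal sketch of an error-feedback loop (hypothetical interfaces): the model
    # proposes a full proof, Lean checks it, and any error messages are appended to
    # the next prompt so the model can repair the attempt.
    def prove_with_feedback(
        statement: str,
        propose: Callable[[str], str],             # prompt -> candidate Lean proof
        check: Callable[[str], Tuple[bool, str]],  # proof -> (type-checks?, error messages)
        max_rounds: int = 4,
    ) -> Optional[str]:
        prompt = f"Write a Lean 4 proof for:\n{statement}"
        for _ in range(max_rounds):
            candidate = propose(prompt)
            ok, errors = check(candidate)
            if ok:
                return candidate                   # proof compiles; done
            # Feed Lean's errors back so the next attempt can address them.
            prompt += (
                f"\n\nPrevious attempt:\n{candidate}"
                f"\nLean reported:\n{errors}\nRevise the proof."
            )
        return None                                # no verified proof within the budget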

In a way, DeepSeek Prover's subgoal decomposition is a partial step towards error/proof-state feedback. (DS Prover breaks a proof down into subgoals and attacks each subgoal separately with batched sampling, then puts the pieces back together.)
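
Very roughly, something like the following sketch. The function names and structure here are my own guesses at the shape of it, not DS Prover's actual pipeline:

    from typing import Callable, List, Optional

    # Illustrative sketch of subgoal decomposition: split the goal into lemma-sized
    # subgoals, batch-sample candidate proofs for each, keep the first that checks.
    def prove_by_subgoals(
        theorem: str,
        decompose: Callable[[str], List[str]],           # theorem -> subgoal statements
        sample_proofs: Callable[[str, int], List[str]],  # statement, n -> candidate proofs
        check: Callable[[str, str], bool],               # (statement, proof) -> type-checks?
        samples_per_subgoal: int = 32,
    ) -> Optional[List[str]]:
        proven = []
        for subgoal in decompose(theorem):
            # Batched sampling: many independent candidates per subgoal.
            hit = next(
                (p for p in sample_proofs(subgoal, samples_per_subgoal)
                 if check(subgoal, p)),
                None,
            )
            if hit is None:
                return None        # one subgoal resisted; the whole attempt fails
            proven.append(hit)
        return proven              # per-subgoal proofs, ready to be stitched together

The recombination step would then assemble the proven subgoals back into one proof of the original statement (e.g. as intermediate have lemmas).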

This is distinct from the approach of the previous SOTA among open-weights models (Kimina Prover), which generated at the full-proof level.

While it was very impressive to see Kimina generate medium-length proofs (think AIME-level problems) without subgoals or feedback at intermediate steps, it's likely that at least subgoal decomposition will be required for longer proofs (think IMO-level problems).

I certainly agree that where and how error/proof-state feedback is best incorporated (training-data synthesis / reward function / CoT during inference / etc.) is a fascinating area of research. (It's rumored that GDM's AlphaProof does use proof-state / Lean feedback already.)


That's surprising to learn.

I'm surprised those even use actual Lean code instead of, say, raw type theory.
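
(For a concrete sense of the distinction, here's the same trivial lemma written as everyday Lean tactic code and as the bare proof term it elaborates to. A toy example of my own, not taken from any of these systems.)

    -- Tactic-style Lean, the kind of surface code these provers emit.
    example (p q : Prop) (hp : p) (hq : q) : p ∧ q := by
      constructor
      · exact hp
      · exact hq

    -- The underlying proof term, much closer to "raw type theory".
    example (p q : Prop) (hp : p) (hq : q) : p ∧ q :=
      And.intro hp hq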



