It's surprisingly difficult to figure out whether OpenAI and Google DeepMind actually got an IMO Gold "fair and square" or not. Looking forward to more analysis.
Jasper Dekoninck, 22 July, 17:20
Interesting approach! However, we looked at the proofs and methodology and we found a few problems, specifically with the use of hints given to the model. While the scaffold indeed improves performance, it does not solve all problems accurately and would not get a gold medal.🧵