The most notable thing about this result is that the unnamed experimental reasoning model achieved its score without any tool usage at all. It looks like just another classic next-token-predicting LLM with a bunch of reinforcement learning layered on top.
Alexander Wei, Jul 19, 15:50
1/N I’m excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world’s most prestigious math competition—the International Math Olympiad (IMO).
@brandonwilson @Yossi_Dahan_ Strongly disagree with that, for reasons partly illustrated here
Simon Willison, Jul 18, 04:08
I continue to be entirely unafraid that these tools are going to obsolete my skills as a software engineer.