The hardest high school math exam in the world, the 6-problem, 9-hour IMO 2025, was this week. AI models performed poorly. Gemini 2.5 Pro scored the highest, just 13/42, at a cost of $431.97 in a best-of-32 eval. The bronze cutoff was 19. Long way to go for AI to solve hard math.
Here's a more beautiful visualization of model performance on MathArena.
P6 was definitely the hardest and most interesting problem. Most people can understand it, but very few can solve it. All models scored 0/7.
Small correction:
Alexander Wei, 14 hours ago
1/N I’m excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world’s most prestigious math competition—the International Math Olympiad (IMO).