More sightings of Number Five. This person has been given early access to GPT-5-reasoning (medium) for testing.
leo 🐈 · 2 Aug, 22:03
As you might've noticed above, I've had access to a version of GPT-5 early. It sets the new SoTA by a significant margin on this benchmark and does much better than o3-high. It's a great model. On the other hand, Anthropic's best model lags. Google's is middle of the pack.