My best guess:
Rubrics + LLM judge: atomize each point in the ground-truth proof and check it against the model output.
My guess at how they made this scalable (before, it was not, since humans had to craft the rubrics meticulously): they trained a model, or did something similar, to generate very good rubrics for each specific problem or its answer.
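
To make the guess concrete, here is a minimal sketch of rubric-based grading. It is purely illustrative and not anyone's confirmed method: `RubricItem`, `grade`, and `keyword_judge` are hypothetical names, and the judge is a naive substring check standing in for a real LLM call.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RubricItem:
    claim: str           # one atomic step taken from the ground-truth proof
    weight: float = 1.0  # assumed: how much this step counts toward the score


# Hypothetical judge signature: (claim, model_output) -> True if the
# output establishes the claim. In practice this would be an LLM call.
Judge = Callable[[str, str], bool]


def grade(rubric: List[RubricItem], model_output: str, judge: Judge) -> float:
    """Return the weighted fraction of rubric items the judge marks as satisfied."""
    total = sum(item.weight for item in rubric)
    if total == 0:
        return 0.0
    earned = sum(item.weight for item in rubric if judge(item.claim, model_output))
    return earned / total


def keyword_judge(claim: str, model_output: str) -> bool:
    # Stand-in for an LLM judge: case-insensitive substring match only.
    return claim.lower() in model_output.lower()


rubric = [
    RubricItem("n is even", weight=1.0),
    RubricItem("n^2 is divisible by 4", weight=2.0),
    RubricItem("contradiction", weight=1.0),
]
output = "Since n is even, write n = 2k. Then n^2 = 4k^2, so n^2 is divisible by 4."
score = grade(rubric, output, keyword_judge)
print(score)
```

The scalable part of the guess would sit upstream of this: producing the `rubric` list automatically per problem instead of writing it by hand.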

Jul 19, 3:50 PM
5/N Besides the result itself, I am excited about our approach: We reach this capability level not via narrow, task-specific methodology, but by breaking new ground in general-purpose reinforcement learning and test-time compute scaling.
.@polynoamial @alexwei_ blink twice if I'm right and 3 times if I'm wrong - before the blind are led by the blind xD