
jack morris
research @meta @cornell // language models, information theory, science of AI
here's some free alpha:
if we do RL for too long after pretraining, we will surely overwrite parameters and start to forget things
in the original InstructGPT paper, their best model mixed RLHF with pretraining gradients to avoid exactly this model drift issue
yet no one is doing this anymore. sure, it's one particular instantiation (gradient mixing) of a broader idea (avoiding forgetting), but it seems like a greatly overlooked line of thinking as we do more and more steps of RL
for example see the recent ProRL paper. they're doing over 1000 steps of GRPO now with a non-trivial learning rate and no penalty for deviating from the original model. the circuits built inside the model during pretraining are surely starting to decay. and if not, they will after 10k or 100k RL steps
i suspect this idea will come back around eventually; they're probably already doing this at the big labs
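
A minimal sketch of the gradient-mixing idea from the thread above, assuming an objective of the general shape used for InstructGPT's mixed variant: expected reward, minus a KL penalty toward a reference model, plus a weighted pretraining log-likelihood term. Everything here is a toy illustration: the policy is a single categorical distribution rather than a language model, and the names `mixed_objective`, `beta`, and `gamma` are hypothetical, not taken from any library.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]


def mixed_objective(policy_logits, ref_logits, rewards, pretrain_tokens,
                    beta=0.0, gamma=1.0):
    """Toy objective (to be maximized) for a one-step categorical policy.

    expected reward - beta * KL(policy || reference)
                    + gamma * mean log-likelihood of pretraining tokens
    Setting beta=0 and gamma=0 gives reward-only RL with no anchor at all.
    """
    logp = log_softmax(policy_logits)
    logp_ref = log_softmax(ref_logits)
    probs = [math.exp(lp) for lp in logp]

    # RL term: reward the policy expects to collect.
    expected_reward = sum(p * r for p, r in zip(probs, rewards))
    # Penalty for deviating from the original (reference) model.
    kl = sum(p * (lp - lr) for p, lp, lr in zip(probs, logp, logp_ref))
    # Pretraining term: keep assigning probability to pretraining data.
    pretrain_ll = sum(logp[t] for t in pretrain_tokens) / len(pretrain_tokens)

    return expected_reward - beta * kl + gamma * pretrain_ll


ref = [0.0, 0.0, 0.0]        # original model: uniform over 3 tokens
drifted = [5.0, 0.0, 0.0]    # policy that collapsed onto the rewarded token
rewards = [1.0, 0.0, 0.0]    # only token 0 is rewarded
pretrain = [1, 2]            # pretraining data uses tokens 1 and 2

# Reward-only training prefers the drifted policy...
reward_only_prefers_drift = (
    mixed_objective(drifted, ref, rewards, pretrain, gamma=0.0)
    > mixed_objective(ref, ref, rewards, pretrain, gamma=0.0)
)
# ...while mixing in the pretraining term penalizes forgetting tokens 1 and 2.
mixed_prefers_original = (
    mixed_objective(ref, ref, rewards, pretrain, gamma=1.0)
    > mixed_objective(drifted, ref, rewards, pretrain, gamma=1.0)
)
```

In this toy setup the pretraining term acts as the anti-forgetting anchor: the drifted policy earns more reward but loses far more in pretraining log-likelihood, so the mixed objective ranks it below the original model.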



51.14K
this seems really important:
it is totally plausible that a model could get IMO gold without *any* reinforcement learning, given a perfectly crafted prompt
we just don't know, and lack tools to efficiently search through prompt space. glad to see at least someone is trying

Lakshya A Agrawal · 29 Jul 2025
How does prompt optimization compare to RL algos like GRPO?
GRPO needs 1000s of rollouts, but humans can learn from a few trials—by reflecting on what worked & what didn't.
Meet GEPA: a reflective prompt optimizer that can outperform GRPO by up to 20% with 35x fewer rollouts!🧵

36.21K
the human brain reserves 40% of its processing exclusively for vision. modern LLMs somehow evolved without this entirely

jack morris · 29 Jul 2025
very surprising that fifteen years of hardcore computer vision research contributed ~nothing toward AGI except better optimizers
we still don't have models that get smarter when we give them eyes
44.22K