DApp Store | Web3 Hub for events and games

jack morris

research @meta @cornell // language models, information theory, science of AI

jack morris · 15 hours ago

i haven't heard it discussed yet but AI basically killed hackathons. pretty much anything you could possibly make at a hackathon in 2019 can be built better and faster by AI in 2025

124.17K

jack morris · 16 hours ago

this is bad code right?

15.08K

jack morris · Aug 1, 04:12

probably 10x more people should be working on prompt optimization systems (we need a vLLM for promptopt), theory, new techniques, benchmarks. the whole kit and caboodle

30.35K

jack morris · Aug 1, 00:51

here's some free alpha: if we do RL for too long after pretraining, we will surely overwrite parameters and start to forget things
in the original instructGPT paper, their best model mixed RLHF with pretraining gradients to avoid exactly this model drift issue
yet no one is doing this anymore. sure, it's one particular instantiation (gradient mixing) of a broader idea (avoiding forgetting) but seems like a greatly-overlooked line of thinking as we do more and more steps of RL
for example see the recent ProRL paper. they're doing over 1000 steps of GRPO now with a non-trivial learning rate and no penalty for deviating from the original model. the circuits built inside the model during pretraining are surely starting to decay. and if not, they will after 10k or 100k RL steps
i suspect this idea will come back around eventually; they're probably already doing this at the big labs
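
The gradient-mixing idea in the post above can be sketched with a toy example. This is an illustration under loud assumptions, not the InstructGPT implementation: the 1-D quadratic losses stand in for the real pretraining and RL objectives, and the names `mixed_gradient_step`, `run`, and `ptx_coef` are hypothetical. The only point it shows is that adding a weighted pretraining gradient to each RL update keeps the parameter from drifting all the way to the RL-only optimum.

```python
def grad_pretrain(theta):
    # Gradient of a toy pretraining loss (theta - 0)^2; its optimum is theta = 0.
    return 2.0 * (theta - 0.0)


def grad_rl(theta):
    # Gradient of a toy RL loss (theta - 1)^2; its optimum is theta = 1.
    return 2.0 * (theta - 1.0)


def mixed_gradient_step(theta, lr=0.1, ptx_coef=0.0):
    # One descent step on: rl_loss + ptx_coef * pretrain_loss.
    # ptx_coef = 0.0 recovers plain RL updates with no pretraining signal.
    grad = grad_rl(theta) + ptx_coef * grad_pretrain(theta)
    return theta - lr * grad


def run(steps, ptx_coef):
    theta = 0.0  # start at the "pretrained" solution
    for _ in range(steps):
        theta = mixed_gradient_step(theta, lr=0.1, ptx_coef=ptx_coef)
    return theta


rl_only = run(1000, ptx_coef=0.0)  # approaches 1.0: the pretrained solution is forgotten
mixed = run(1000, ptx_coef=1.0)    # approaches 0.5: a compromise between the two optima
print(rl_only, mixed)
```

In a real setup the two gradients would come from an RL objective on sampled rollouts and a language-modeling loss on pretraining data, and the mixing coefficient would be a tuned hyperparameter.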

51.14K

jack morris · Jul 31, 2025

i'm looking for good examples of reasoning model generalization
for example, a model incentivized via RL to think for a while and solve math problems gets better at creative writing
is this common?

21.44K

jack morris · Jul 31, 2025

this seems really important: it is totally plausible that a model could get IMO gold without *any* reinforcement learning, given a perfectly-crafted prompt
we just don't know, and lack tools to efficiently search through prompt space. glad to see at least someone is trying

Lakshya A Agrawal · Jul 29, 2025

How does prompt optimization compare to RL algos like GRPO? GRPO needs 1000s of rollouts, but humans can learn from a few trials—by reflecting on what worked & what didn't. Meet GEPA: a reflective prompt optimizer that can outperform GRPO by up to 20% with 35x fewer rollouts!🧵

36.21K

jack morris · Jul 30, 2025

you can't make this stuff up

407.76K

jack morris · Jul 29, 2025

hypothetical situation - i am an AI company that's reduced the cost of transferring and storing models to zero. i can serve each user their own model with no overhead
what do i do? directly SFT user-specific models on their data? or RLHF on the chat ratings? something else?

16.58K

jack morris · Jul 29, 2025

the human brain reserves 40% of its processing exclusively for vision. modern LLMs somehow evolved without this entirely

jack morris · Jul 29, 2025

very surprising that fifteen years of hardcore computer vision research contributed ~nothing toward AGI except better optimizers
we still don't have models that get smarter when we give them eyes

44.22K

jack morris reposted

Pliny the Liberator 🐉 · Jul 28, 2025

This one dude I know (allegedly) poisoned the global AI training data corpus with self-propagating trigger-activated sleeper jailbreak payloads

471.7K