Matt Schlicht reposted
When you realize that open-source is at the frontier of AI despite:
- fewer GPUs
- less money
- less public and policy support
- no $100M salaries to attract talent
- closed-source taking advantage and copying all of open-source's innovations without contributing back their own
🤯🤯🤯
And we’re just getting started!
I want an easy way to keep up with the HUNDREDS of new AI research papers that come out on @arxiv every single day.
So I've been building something to help myself. Introducing @yesnoerror.
I would love to share it with you! ❤️
I haven't published a paper myself and I didn't go to college, but I love AI and I love frontier technologies where people are trying things nobody has ever tried before. I feel lucky to be where I am in life, but I want to learn and push myself even more.
If you, like me, wish you could read and understand more about the latest developments in this amazing industry, you might also love this.
I have been building this in private beta and updating it in real time as I get feedback from researchers and leaders at @AnthropicAI @MIT @Yale @CarnegieMellon and more.
If you would like to be an early tester, please let me know 🧪🔬
The more feedback I get, the better we can make this, and the better we make this, the more informed and inspired a larger group of people can be.

Waking up to see this new paper from @scale_AI charting on the @yesnoerror trending feed.
Authors: @anisha_gunjal, @aytwang, Elaine Lau, @vaskar_n, @BingLiu1011, and @SeanHendryx
"Rubrics as Rewards: Reinforcement Learning Beyond Verifiable Domains"
Simplified: Teaching computers with detailed checklists instead of vague thumbs-up ratings lets them learn better answers to medicine and science questions, and makes it clear why they got a reward.
Key findings:
• Implicitly aggregated rubric rewards boost medical benchmark scores by up to 28% relative to a Likert baseline.
• Matches or exceeds rewards based on expert reference answers despite using smaller judges.
What can it be used for:
• Fine-tuning clinical decision support chatbots with medical safety rubrics.
• Training policy-analysis or legal-reasoning models where multiple subjective factors matter.
Detailed summary:
Rubrics as Rewards (RaR) is proposed as an interpretable alternative to opaque preference-based reward models when fine-tuning large language models (LLMs) with reinforcement learning. Instead of asking humans to rank whole answers, domain experts (or a strong LLM guided by expert references) write a prompt-specific checklist of 7–20 binary criteria that capture essential facts, reasoning steps, style, and common pitfalls. Each criterion is tagged Essential, Important, Optional, or Pitfall and given a weight. During on-policy training the policy model (Qwen-2.5-7B in the paper) samples 16 candidate answers per prompt. A separate judge LLM (GPT-4o-mini or smaller) is prompted either to score each criterion separately (explicit aggregation) or to read the full rubric and output one holistic Likert rating from 1–10 (implicit aggregation). The normalized score becomes the scalar reward and the policy is updated with the GRPO algorithm.
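To make the two aggregation modes concrete, here is a minimal sketch of how such a rubric reward could be computed. Everything below is illustrative rather than the paper's implementation: the function names, category weights, and toy substring judge are assumptions, and in a real setup the two judge stubs would be prompts to a judge LLM such as GPT-4o-mini.

```python
from dataclasses import dataclass

# Assumed per-category weights; the paper weights criteria by tag, but
# these exact values are hypothetical.
CATEGORY_WEIGHTS = {"Essential": 1.0, "Important": 0.5,
                    "Optional": 0.25, "Pitfall": -1.0}

@dataclass
class Criterion:
    text: str      # e.g. "Recommends checking renal function before dosing"
    category: str  # "Essential" | "Important" | "Optional" | "Pitfall"

def judge_satisfied(criterion: Criterion, answer: str) -> bool:
    """Stand-in for prompting the judge LLM with one criterion and the
    candidate answer; returns a binary verdict."""
    return criterion.text.lower() in answer.lower()  # toy heuristic

def explicit_reward(rubric: list[Criterion], answer: str) -> float:
    """Explicit aggregation: judge each criterion separately, then combine
    the verdicts with fixed weights and normalize to [0, 1]."""
    score = sum(CATEGORY_WEIGHTS[c.category]
                for c in rubric if judge_satisfied(c, answer))
    max_score = sum(w for c in rubric
                    if (w := CATEGORY_WEIGHTS[c.category]) > 0)
    return max(0.0, score) / max_score if max_score else 0.0

def judge_holistic(rubric: list[Criterion], answer: str) -> int:
    """Stand-in for showing the judge the full rubric at once and asking
    for a single holistic Likert rating from 1 to 10."""
    return 7  # placeholder for the judge LLM's rating

def implicit_reward(rubric: list[Criterion], answer: str) -> float:
    """Implicit aggregation: one holistic 1-10 rating, normalized to [0, 1].
    Either normalized score is the scalar reward the GRPO update sees."""
    return (judge_holistic(rubric, answer) - 1) / 9.0
```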
The authors curate two 20k-example training sets (RaR-Medical-20k and RaR-Science-20k) by combining existing medical and science reasoning corpora and generating synthetic rubrics with o3-mini or GPT-4o. Evaluation on HealthBench-1k (medical reasoning) and GPQA-Diamond (graduate-level physics/chemistry/biology) shows that RaR-Implicit yields up to a 28% relative improvement over simple Likert-only rewards and matches or exceeds rewards computed by comparing to expert reference answers. Implicit aggregation consistently outperforms explicit, demonstrating that letting the judge decide how to combine criteria works better than fixed hand-tuned weights.
Rubric supervision also helps smaller judge models. When asked to rate preferred versus perturbed answers, rubric-guided judges choose the preferred answer far more reliably than equally sized Likert-only judges, narrowing the gap between a 7B evaluator and GPT-4o-mini. Ablations reveal that prompt-specific rubrics beat generic ones, multiple criteria beat essential-only lists, and access to an expert reference while drafting rubrics materially boosts downstream performance. Human-written and high-quality synthetic rubrics perform on par with each other, suggesting the approach can scale.
RaR generalizes Reinforcement Learning with Verifiable Rewards (RLVR): when the rubric has just one correctness check, the framework collapses to RLVR's exact-match reward. By exposing each aspect of quality explicitly, RaR is more transparent, auditable, and potentially harder to reward-hack than neural reward models. The authors discuss extensions to real-world agentic tasks, dynamic curriculum via rubric weights, and formal robustness studies.
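To make the RLVR connection concrete, under the same illustrative sketch a rubric containing a single Essential criterion whose judge performs an exact-match check yields a reward of exactly 1.0 or 0.0:

```python
# Continuing the sketch above (still illustrative): a one-criterion rubric
# with an exact-match judge collapses the normalized rubric reward to
# RLVR's binary verifiable reward.
def rlvr_reward(reference: str, answer: str) -> float:
    return 1.0 if answer.strip() == reference.strip() else 0.0
```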
--
Over 500,000 pages of research are published on @arXiv every month. Hidden within are breakthrough insights that could transform your work — but finding them is like searching for diamonds in an ocean of data. @yesnoerror cuts through the noise to surface the most impactful research for your projects, investments, and discoveries.
// $yne

Matt Schlicht reposted
buried in @sriramk's America's AI Action Plan is an endorsement that the US compute market will financialize with spot and forward contracts. this podcast explains why this is so necessary, not just for speculation
one of the most consistent themes in @latentspacepod's GPU infra/neocloud market coverage (see @evanjconrad / @sfcompute, @vipulved / @togethercompute, @picocreator / @featherlessai, @bernhardsson / @modal_labs, but also @zjasper666's AIE talk) is that the status quo of 3-year lock-in long-term contracts with hyperscalers is causing unsustainable market volatility and inefficiency, not just in GPU prices and the rise and fall of startup fortunes, but also inefficiency in ideas and resources for open AI and research.
now the US government is fully behind this movement and most importantly, has demonstrated that they *get it*.
