Excited to release DailyBench! DailyBench is an automated benchmark that runs four times a day and evaluates frontier model APIs on a fork of HELM Lite. I built DailyBench to see whether we could detect model providers quantizing weights, compressing the KV cache, or swapping models during peak load.
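As a rough illustration of how repeated runs could surface that kind of change, here is a minimal sketch that flags a run whose score falls well below the recent baseline. It is not DailyBench's actual method: the function name `flag_drift`, the z-score threshold, and the scores are all hypothetical and chosen only for the example.

```python
from statistics import mean, stdev


def flag_drift(history, latest, z_threshold=3.0):
    """Return True if `latest` sits more than `z_threshold` sample standard
    deviations below the mean of `history`.

    Hypothetical helper for illustration; not part of DailyBench.
    """
    if len(history) < 2:
        # Too few runs to estimate spread, so nothing is flagged.
        return False
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # Perfectly stable history: any drop at all counts as drift.
        return latest < mu
    return (mu - latest) / sigma > z_threshold


# Made-up accuracy scores from earlier runs of one model (illustrative only).
baseline = [0.81, 0.80, 0.82, 0.81, 0.80, 0.82, 0.81, 0.80]

normal_run = flag_drift(baseline, 0.81)   # within the usual spread
degraded_run = flag_drift(baseline, 0.70)  # far below the usual spread
```

A one-sided check is used because the concern here is degradation, not improvement. In practice, benchmark noise from sampling temperature and prompt variation would need to be accounted for before reading a flagged run as evidence of quantization or a model swap.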