David Silver, who leads RL at DeepMind, said on a podcast a few months ago that DeepMind built a meta‑RL system that learned its own RL algorithm and beat all the human‑designed algorithms.