suppose you trained an RL agent to maximize reward across diverse environments. if you then dropped it into a new environment, the first question it'd learn to ask is "what is my reward function here?" it might even learn to model the motives of its simulators to figure that out
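a toy sketch of that reward-inference step, under made-up assumptions (the candidate reward functions, the Gaussian reward noise, and names like `RewardInferrer` are all hypothetical, just to illustrate an agent asking "which reward am I being scored on?"):

```python
import math
import random

# Hypothetical candidate reward functions the agent considers for the new environment.
candidate_rewards = {
    "reach_goal": lambda state: 1.0 if state == 9 else 0.0,
    "stay_low":   lambda state: 1.0 if state <= 2 else 0.0,
    "stay_high":  lambda state: 1.0 if state >= 7 else 0.0,
}

class RewardInferrer:
    """Keeps a posterior over which candidate reward function governs the environment."""

    def __init__(self, candidates, noise_std=0.1):
        self.candidates = candidates
        self.noise_std = noise_std
        # Uniform prior over hypotheses (stored as unnormalized log-posterior).
        self.log_post = {name: 0.0 for name in candidates}

    def update(self, state, observed_reward):
        # Gaussian likelihood of the observed reward under each hypothesis.
        for name, fn in self.candidates.items():
            err = observed_reward - fn(state)
            self.log_post[name] += -0.5 * (err / self.noise_std) ** 2

    def posterior(self):
        m = max(self.log_post.values())
        unnorm = {k: math.exp(v - m) for k, v in self.log_post.items()}
        z = sum(unnorm.values())
        return {k: v / z for k, v in unnorm.items()}

# Simulate a new environment whose true (hidden) reward is "reach_goal", plus noise.
true_reward = candidate_rewards["reach_goal"]
inferrer = RewardInferrer(candidate_rewards)
random.seed(0)
for _ in range(50):
    state = random.randint(0, 9)                       # random exploration
    reward = true_reward(state) + random.gauss(0, 0.1)
    inferrer.update(state, reward)

print(inferrer.posterior())  # probability mass concentrates on "reach_goal"
```

the point of the sketch: before optimizing anything, the agent's useful first move is inferring what it's being rewarded for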
"what is my goal/purpose?" feels instrumentally convergent. I wonder if, in some sense, that's why we seek god