We’ve developed Random Network Distillation (RND), a prediction-based method for encouraging reinforcement learning agents to explore their environments through curiosity, which for the first time exceeds average human performance on Montezuma’s Revenge.
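To make "prediction-based" concrete, here is a minimal sketch of the idea behind RND: a predictor is trained to match the output of a fixed, randomly initialised target network, and the prediction error serves as an exploration bonus. This is an illustration under simplifying assumptions, not the implementation behind the results above: both networks are single linear layers, and the sizes, learning rate, and function names (`intrinsic_reward`, `train_predictor`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
OBS_DIM, FEAT_DIM = 8, 4  # hypothetical sizes, chosen only for illustration

# Fixed, randomly initialised target network (never trained).
W_target = rng.normal(size=(FEAT_DIM, OBS_DIM))
# Predictor network, trained to imitate the target on visited observations.
W_pred = rng.normal(size=(FEAT_DIM, OBS_DIM))


def intrinsic_reward(obs):
    """Squared error between predictor and target outputs for one observation."""
    err = W_pred @ obs - W_target @ obs
    return float(err @ err)


def train_predictor(obs, lr=0.01):
    """One gradient step on the squared error; only the predictor changes."""
    global W_pred
    err = W_pred @ obs - W_target @ obs
    W_pred = W_pred - lr * 2.0 * np.outer(err, obs)


familiar = rng.normal(size=OBS_DIM)  # an observation the agent keeps revisiting
novel = rng.normal(size=OBS_DIM)     # an observation the agent has not trained on

before = intrinsic_reward(familiar)
for _ in range(200):
    train_predictor(familiar)
after = intrinsic_reward(familiar)

print(f"familiar before training: {before:.4f}")
print(f"familiar after training:  {after:.4f}")
print(f"novel observation:        {intrinsic_reward(novel):.4f}")
```

In this sketch the bonus for the repeatedly visited observation shrinks as the predictor learns it, while an unseen observation keeps a larger bonus, which is what pushes an agent toward unfamiliar states.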

