OpenAI Baselines: ACKTR & A2C



We’re freeing two new OpenAI Baselines implementations: ACKTR and A2C. A2C is a synchronous, deterministic variant of Asynchronous Benefit Actor Critic (A3C) which we’ve discovered provides equivalent efficiency. ACKTR is a extra sample-efficient reinforcement finding out set of rules than TRPO and A2C, and calls for simplest reasonably extra computation than A2C in keeping with replace.


Leave a Comment

Your email address will not be published. Required fields are marked *