While working on an experiment, I found that the weights learned by my actors were not propagating back to the EnvRunner for the next round of rollouts. Instead, the agent in charge of rollouts ...
This replication package is licensed under a Creative Commons Attribution 4.0 International License. See LICENSE.txt for details. Citation: Farre-Mensa, Joan, Deepak Hegde, and Alexander Ljungqvist.