Abstract: Actors and critics in actor-critic reinforcement learning algorithms are functionally separate, yet they often use the same network architectures. This case study explores how network size affects performance when actor and critic architectures are considered independently. When the assumption of architectural symmetry is relaxed, smaller actors can often achieve policy performance comparable to that of their symmetric counterparts. Our experiments show up to a 97% reduction in the number of network weights, with an average reduction of 64%, across multiple algorithms and tasks. Given the practical benefits of reducing actor complexity, we believe that the configurations of actors and critics are aspects of actor-critic design that deserve to be considered independently.
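To give a rough sense of what a reduction in network weights means, here is a minimal sketch that counts the parameters of fully connected networks when the actor is sized independently of the critic. The observation and action dimensions, the hidden layer sizes, and the helper `mlp_param_count` are all hypothetical and chosen only for illustration; they are not the configurations or results reported in the paper.

```python
def mlp_param_count(input_dim, hidden_sizes, output_dim):
    """Count the weights and biases of a fully connected network."""
    sizes = [input_dim, *hidden_sizes, output_dim]
    # Each layer has n_in * n_out weights plus n_out biases.
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


# Hypothetical task dimensions (assumptions, not taken from the paper).
OBS_DIM, ACT_DIM = 17, 6

# Symmetric setup: actor and critic share the same hidden layer sizes.
symmetric_actor = mlp_param_count(OBS_DIM, [256, 256], ACT_DIM)
critic = mlp_param_count(OBS_DIM + ACT_DIM, [256, 256], 1)

# Asymmetric setup: the critic is unchanged, the actor is made smaller.
small_actor = mlp_param_count(OBS_DIM, [64, 64], ACT_DIM)

# Fraction of actor parameters removed by shrinking the actor.
reduction = 1 - small_actor / symmetric_actor

print(f"symmetric actor: {symmetric_actor} parameters")
print(f"small actor:     {small_actor} parameters")
print(f"critic:          {critic} parameters")
print(f"actor reduction: {reduction:.1%}")
```

The sketch only counts parameters. Whether a smaller actor actually matches the policy performance of its symmetric counterpart is an empirical question that depends on the algorithm and task.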
Recommended citation: Mysore, S., Mabsout, B., Mancuso, R., & Saenko, K. (2021). “Good Actors can come in Smaller Sizes: A Case Study on the Value of Actor-Critic Asymmetry,” arXiv preprint arXiv:2102.11893