A Distributional Analogue to the Successor Representation

Harley Wiltzer ^✢²³

Jesse Farebrother ^✢¹²³

Arthur Gretton ¹⁴

Yunhao Tang ¹

André Barreto ¹

Will Dabney ¹

Marc G. Bellemare ²³

Mark Rowland ¹

Abstract

This paper contributes a new approach for distributional reinforcement learning which elucidates a clean separation of transition structure and reward in the learning process. Analogous to how the successor representation (SR) describes the expected consequences of behaving according to a given policy, our distributional successor measure (SM) describes the distributional consequences of this behaviour. We formulate the distributional SM as a distribution over distributions and provide theory connecting it with distributional and model-based reinforcement learning. Moreover, we propose an algorithm that learns the distributional SM from data by minimizing a two-level maximum mean discrepancy. Key to our method are a number of algorithmic techniques that are independently valuable for learning generative models of state. As an illustration of the usefulness of the distributional SM, we show that it enables zero-shot risk-sensitive policy evaluation in a way that was not previously possible.
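To make the phrase "two-level maximum mean discrepancy" concrete, here is a minimal sketch, not the paper's implementation. It assumes each sample from a distribution over distributions is represented by a finite set of state vectors, that the inner kernel is a Gaussian kernel on states, and that the outer kernel is a Gaussian function of the inner squared MMD. All function names and bandwidths are hypothetical choices made for illustration.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel between two state vectors (assumed inner kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, kernel):
    """Biased (V-statistic) estimate of the squared MMD between two sample sets."""
    k_xx = sum(kernel(a, b) for a in xs for b in xs) / (len(xs) ** 2)
    k_yy = sum(kernel(a, b) for a in ys for b in ys) / (len(ys) ** 2)
    k_xy = sum(kernel(a, b) for a in xs for b in ys) / (len(xs) * len(ys))
    return k_xx + k_yy - 2.0 * k_xy


def distribution_kernel(p, q, bandwidth=1.0):
    """Kernel between two empirical distributions, built from the inner MMD.

    Assumption: a Gaussian function of the inner squared MMD; other choices
    of outer kernel are possible.
    """
    inner = max(mmd_squared(p, q, gaussian_kernel), 0.0)  # clip float error
    return math.exp(-inner / (2.0 * bandwidth ** 2))


def two_level_mmd_squared(model_a, model_b):
    """Squared MMD between two collections of empirical distributions."""
    return mmd_squared(model_a, model_b, distribution_kernel)


# Each model is a list of "atoms"; each atom is a list of 1-D state samples.
model_a = [[(0.0,), (0.1,)], [(1.0,), (1.1,)]]
model_b = [[(5.0,), (5.1,)], [(6.0,), (6.1,)]]

same = two_level_mmd_squared(model_a, model_a)
different = two_level_mmd_squared(model_a, model_b)
```

In this sketch the discrepancy of a collection with itself is zero, and it grows as the atoms of the two collections place mass on different states. A learning procedure would minimize such a quantity between a model's samples and target samples.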

ICML 2024

Spotlight

^✢

Equal contribution

¹ Google DeepMind

² McGill University

³ Mila - Québec AI Institute

⁴ Gatsby Unit, University College London

Citing

To cite this paper, please use the following reference:

@inproceedings{wiltzer24dsm,
	title        = {A Distributional Analogue to the Successor Representation},
	author       = {
		Wiltzer, Harley and Farebrother, Jesse and Gretton, Arthur and Tang, Yunhao
		and Barreto, André and Dabney, Will and Bellemare, Marc G. and Rowland, Mark
	},
	year         = 2024,
	booktitle    = {International Conference on Machine Learning (ICML)},
}