Interpretable Function Approximation with Gaussian Processes in Value-Based Model-Free Reinforcement Learning

Published in Proceedings of the 6th Northern Lights Deep Learning Conference (NLDL), 2025

Recommended citation: Lende, M.v.d., Sabatelli, M. & Cardenas-Cartagena, J. (2025). Interpretable Function Approximation with Gaussian Processes in Value-Based Model-Free Reinforcement Learning. Proceedings of the 6th Northern Lights Deep Learning Conference (NLDL), in Proceedings of Machine Learning Research 265:141-154. Available from https://proceedings.mlr.press/v265/lende25a.html.

Estimating value functions in Reinforcement Learning (RL) for continuous spaces is challenging. While traditional function approximators, such as linear models, offer interpretability, they are limited in the complexity of functions they can represent. In contrast, deep neural networks can model more complex functions but are less interpretable. Gaussian Process (GP) models bridge this gap by offering interpretable uncertainty estimates while modeling complex nonlinear functions. This work introduces a Bayesian nonparametric framework using GPs, including Sparse Variational GPs (SVGPs) and Deep GPs (DGPs), for off-policy and on-policy learning. Results on popular classic control environments show that SVGPs/DGPs outperform linear models but converge more slowly than their neural network counterparts. Nevertheless, they provide valuable uncertainty estimates and interpretability for RL.
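To give a flavor of the approach, the minimal sketch below fits a sparse variational GP to hypothetical Q-value targets using GPyTorch. This is not the paper's implementation: the library choice, model class `SVGPQModel`, tensor shapes, and hyperparameters are all illustrative assumptions; the key point is that the predictive variance comes for free as an uncertainty signal.

```python
import torch
import gpytorch


class SVGPQModel(gpytorch.models.ApproximateGP):
    """Sparse Variational GP mapping (state, action) features to a Q-value.
    Illustrative sketch only; not the authors' code."""

    def __init__(self, inducing_points):
        variational_distribution = gpytorch.variational.CholeskyVariationalDistribution(
            inducing_points.size(0)
        )
        variational_strategy = gpytorch.variational.VariationalStrategy(
            self, inducing_points, variational_distribution,
            learn_inducing_locations=True,
        )
        super().__init__(variational_strategy)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.RBFKernel(ard_num_dims=inducing_points.size(1))
        )

    def forward(self, x):
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(x), self.covar_module(x)
        )


# Hypothetical batch from a replay buffer: e.g. CartPole state (4 dims) + action (1 dim),
# with bootstrapped TD targets r + gamma * max_a' Q(s', a') as regression labels.
state_action = torch.randn(256, 5)
td_targets = torch.randn(256)

model = SVGPQModel(inducing_points=state_action[:32].clone())
likelihood = gpytorch.likelihoods.GaussianLikelihood()
mll = gpytorch.mlls.VariationalELBO(likelihood, model, num_data=state_action.size(0))
optimizer = torch.optim.Adam([*model.parameters(), *likelihood.parameters()], lr=0.01)

model.train(); likelihood.train()
for _ in range(200):
    optimizer.zero_grad()
    loss = -mll(model(state_action), td_targets)  # maximize the ELBO
    loss.backward()
    optimizer.step()

# Predictive mean = Q-value estimate; predictive variance = interpretable
# uncertainty that can, for instance, drive exploration.
model.eval(); likelihood.eval()
with torch.no_grad():
    pred = likelihood(model(state_action[:5]))
    print(pred.mean, pred.variance)
```

In this sketch the GP plays the role a neural network would in DQN-style value learning: it is refit to TD targets, while the sparse inducing points keep training tractable and the posterior variance exposes where the value estimate is uncertain.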