Interpretable Function Approximation with Gaussian Processes in Value-Based Model-Free Reinforcement Learning
Published in Proceedings of the 6th Northern Lights Deep Learning Conference (NLDL), 2025
Recommended citation: Lende, M.v.d., Sabatelli, M. & Cardenas-Cartagena, J. (2025). Interpretable Function Approximation with Gaussian Processes in Value-Based Model-Free Reinforcement Learning. Proceedings of the 6th Northern Lights Deep Learning Conference (NLDL), in Proceedings of Machine Learning Research 265:141-154. Available from https://proceedings.mlr.press/v265/lende25a.html.
Estimating value functions in Reinforcement Learning (RL) for continuous spaces is challenging. While traditional function approximators, such as linear models, offer interpretability, they are limited in the complexity of the functions they can represent. In contrast, deep neural networks can model more complex functions but are less interpretable. Gaussian Process (GP) models bridge this gap by offering interpretable uncertainty estimates while modeling complex nonlinear functions. This work introduces a Bayesian nonparametric framework using GPs, including Sparse Variational GPs (SVGPs) and Deep GPs (DGPs), for off-policy and on-policy learning. Results on popular classic control environments show that SVGPs/DGPs outperform linear models but converge more slowly than their neural network counterparts. Nevertheless, they provide valuable insights into uncertainty estimation and interpretability for RL.
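To illustrate the kind of approximator the abstract describes, the sketch below builds a Sparse Variational GP that maps state-action features to Q-values using GPyTorch. This is a minimal illustration rather than the paper's code: the feature dimensions, kernel choice, and the placeholder TD targets are assumptions made here for the example.

```python
# Minimal sketch (not the paper's implementation) of an SVGP Q-value approximator
# with GPyTorch. Dimensions, kernel, and TD targets below are illustrative assumptions.
import torch
import gpytorch


class SVGPQFunction(gpytorch.models.ApproximateGP):
    """Sparse Variational GP mapping (state, action) features to Q-values."""

    def __init__(self, inducing_points):
        variational_dist = gpytorch.variational.CholeskyVariationalDistribution(
            inducing_points.size(0)
        )
        variational_strategy = gpytorch.variational.VariationalStrategy(
            self, inducing_points, variational_dist, learn_inducing_locations=True
        )
        super().__init__(variational_strategy)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())

    def forward(self, x):
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(x), self.covar_module(x)
        )


# Hypothetical sizes: 4-dim state concatenated with a 1-dim action feature.
state_action_dim, n_inducing, batch = 5, 64, 128
inducing = torch.randn(n_inducing, state_action_dim)
model = SVGPQFunction(inducing)
likelihood = gpytorch.likelihoods.GaussianLikelihood()
mll = gpytorch.mlls.VariationalELBO(likelihood, model, num_data=10_000)
optimizer = torch.optim.Adam(
    list(model.parameters()) + list(likelihood.parameters()), lr=1e-2
)

# One illustrative off-policy update: regress predicted Q-values toward TD targets.
# Random placeholders stand in for r + gamma * max_a' Q(s', a').
x_batch = torch.randn(batch, state_action_dim)   # (state, action) features
td_targets = torch.randn(batch)                   # placeholder TD targets
model.train(); likelihood.train()
optimizer.zero_grad()
loss = -mll(model(x_batch), td_targets)           # negative ELBO
loss.backward()
optimizer.step()

# The posterior predictive yields both a Q-value estimate and its uncertainty,
# which is the interpretability benefit the abstract points to.
model.eval(); likelihood.eval()
with torch.no_grad():
    pred = likelihood(model(x_batch[:1]))
    print(pred.mean.item(), pred.variance.item())
```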