Feature Search in the Grassmanian in Online Reinforcement Learning

Bhatnagar, Shalabh; Borkar, Vivek S.; K. J., Prabuchandran (2013). Feature Search in the Grassmanian in Online Reinforcement Learning. IEEE Journal of Selected Topics in Signal Processing, 7 (5), pp. 746-758. ISSN 1932-4553.

Official URL: http://doi.org/10.1109/JSTSP.2013.2255022

Abstract

We consider the problem of finding the best features for value function approximation in reinforcement learning and develop an online algorithm to optimize the mean square Bellman error objective. For any given feature value, our algorithm performs gradient search in the parameter space via a residual gradient scheme and, on a slower timescale, also performs gradient search in the Grassman manifold of features. We present a proof of convergence of our algorithm. We show empirical results using our algorithm as well as a similar algorithm that uses temporal difference learning in place of the residual gradient scheme for the faster timescale updates.
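The two-timescale structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the toy deterministic ring chain, the reward vector, the step-size schedules, the QR-based retraction, and the names `retract` and `run` are all assumptions made for illustration. The weight vector `theta` takes a residual gradient step on the squared Bellman error at the faster rate `a`, while the orthonormal feature matrix `Phi` moves at the slower rate `b` along the gradient projected onto the tangent space of the Grassmann manifold.

```python
import numpy as np

def retract(M):
    """Map a full-rank matrix back to orthonormal columns via QR (an assumed retraction)."""
    Q, R = np.linalg.qr(M)
    return Q * np.sign(np.diag(R))  # fix column signs so the map is continuous

def run(n=6, k=2, gamma=0.9, steps=2000, seed=0):
    rng = np.random.default_rng(seed)
    reward = np.arange(n, dtype=float)          # hypothetical per-state reward
    Phi = retract(rng.standard_normal((n, k)))  # rows are state features
    theta = np.zeros(k)
    s = 0
    for t in range(1, steps + 1):
        a = 0.2 / t ** 0.6   # faster step size (weights)
        b = 0.01 / t         # slower step size (features); b / a -> 0
        s_next = (s + 1) % n  # deterministic ring chain, so one sample suffices
        delta = reward[s] + gamma * Phi[s_next] @ theta - Phi[s] @ theta

        # Gradient of 0.5 * delta**2 with respect to theta (residual gradient).
        g_theta = delta * (gamma * Phi[s_next] - Phi[s])

        # Gradient of 0.5 * delta**2 with respect to Phi: only two rows are non-zero.
        G = np.zeros_like(Phi)
        G[s_next] += gamma * delta * theta
        G[s] -= delta * theta

        theta = theta - a * g_theta

        # Project onto the tangent space at span(Phi), step, then re-orthonormalize.
        G_tan = G - Phi @ (Phi.T @ G)
        Phi = retract(Phi - b * G_tan)
        s = s_next
    return Phi, theta
```

Calling `run()` returns the final feature matrix and weights; the columns of `Phi` stay orthonormal throughout because of the retraction. Replacing `g_theta` with `-delta * Phi[s]` would give a temporal difference update on the faster timescale, in the spirit of the variant the abstract mentions.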

Item Type: Article
Source: Copyright of this article belongs to Institute of Electrical and Electronics Engineers.
Keywords: Feature Adaptation; Grassman Manifold; Online Learning; Residual Gradient Scheme; Stochastic Approximation; Temporal Difference Learning.
ID Code: 116514
Deposited On: 12 Apr 2021 06:06
Last Modified: 12 Apr 2021 06:06