Multiscale Q-learning with linear function approximation

Bhatnagar, Shalabh; Lakshmanan, K. (2016) Multiscale Q-learning with linear function approximation. Discrete Event Dynamic Systems, 26 (3), pp. 477-509. ISSN 0924-6703.


Official URL: http://doi.org/10.1007/s10626-015-0216-z

Abstract

We present in this article a two-timescale variant of Q-learning with linear function approximation. Both the Q-values and the policies are assumed to be parameterized, with the policy parameter updated on a faster timescale than the Q-value parameter. This timescale separation is seen to result in significantly improved numerical performance of the proposed algorithm over Q-learning. We show that the proposed algorithm converges almost surely to a closed, connected, internally chain transitive, invariant set of an associated differential inclusion.
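
Since the abstract only sketches the method, the following Python fragment is a minimal sketch of the general shape of such a two-timescale update, under stated assumptions: the Q-function is taken to be linear in a feature map phi, the policy is a parameterized Gibbs (softmax) distribution over the same features, and the policy parameter w uses the larger (faster) step sizes b_n while the Q-value parameter theta uses the smaller (slower) step sizes a_n, with a_n / b_n -> 0. The environment interface (env), the feature map (phi), and the specific likelihood-ratio policy update are hypothetical placeholders, not the paper's exact algorithm.

    import numpy as np

    def softmax_policy(w, phi, s, actions, rng):
        """Sample an action from a Gibbs (softmax) distribution with parameter w."""
        prefs = np.array([w @ phi(s, a) for a in actions])
        prefs -= prefs.max()                         # numerical stability
        probs = np.exp(prefs) / np.exp(prefs).sum()
        return rng.choice(len(actions), p=probs), probs

    def two_timescale_q_learning(env, phi, actions, d, n_steps=10_000,
                                 gamma=0.95, seed=0):
        """Illustrative two-timescale loop: w (policy) fast, theta (Q-values) slow."""
        rng = np.random.default_rng(seed)
        theta = np.zeros(d)                          # Q-value parameter (slow timescale)
        w = np.zeros(d)                              # policy parameter (fast timescale)
        s = env.reset()                              # hypothetical environment interface
        for n in range(1, n_steps + 1):
            a_n = 1.0 / n                            # slow step size
            b_n = 1.0 / n ** 0.6                     # fast step size; a_n / b_n -> 0
            idx, probs = softmax_policy(w, phi, s, actions, rng)
            a = actions[idx]
            s_next, r = env.step(a)                  # hypothetical: returns (next state, reward)
            # Slow timescale: Q-learning temporal-difference step on theta.
            q_next = max(theta @ phi(s_next, b) for b in actions)
            delta = r + gamma * q_next - theta @ phi(s, a)
            theta += a_n * delta * phi(s, a)
            # Fast timescale: push the policy toward actions with higher
            # estimated Q-values (a likelihood-ratio ascent step; one of
            # several plausible choices, not necessarily the paper's).
            grad_log = phi(s, a) - sum(p * phi(s, b) for p, b in zip(probs, actions))
            w += b_n * (theta @ phi(s, a)) * grad_log
            s = s_next
        return theta, w

The step-size choice a_n = 1/n and b_n = 1/n**0.6 is one standard way to realize the timescale separation: the policy parameter equilibrates quickly relative to theta, so on the slow timescale the Q-value update effectively sees a converged policy.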

Item Type: Article
Source: Copyright of this article belongs to Springer Nature.
Keywords: Q-Learning With Linear Function Approximation; Reinforcement Learning; Stochastic Approximation; Ordinary Differential Equation; Differential Inclusion; Multi-Stage Stochastic Shortest Path Problem.
ID Code: 116496
Deposited On: 12 Apr 2021 06:02
Last Modified: 12 Apr 2021 06:02
