Bhatnagar, Shalabh; Lakshmanan, K. (2016) Multiscale Q-learning with linear function approximation. Discrete Event Dynamic Systems, 26(3), pp. 477-509. ISSN 0924-6703
Full text not available from this repository.
Official URL: http://doi.org/10.1007/s10626-015-0216-z
Abstract
We present in this article a two-timescale variant of Q-learning with linear function approximation. Both Q-values and policies are assumed to be parameterized with the policy parameter updated on a faster timescale as compared to the Q-value parameter. This timescale separation is seen to result in significantly improved numerical performance of the proposed algorithm over Q-learning. We show that the proposed algorithm converges almost surely to a closed connected internally chain transitive invariant set of an associated differential inclusion.
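To make the two-timescale structure concrete, below is a minimal Python sketch of a generic two-timescale Q-learning loop with linear function approximation. The toy random MDP, the softmax policy class, the feature map, and the step-size schedules are all assumptions made purely for illustration; the paper's actual updates and convergence conditions differ in detail and should be taken from the article itself.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, d = 5, 3, 4

# Hypothetical toy MDP (illustration only, not from the paper)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P[s, a] is a next-state distribution
R = rng.normal(size=(n_states, n_actions))                        # rewards r(s, a)
Phi = rng.normal(size=(n_states, n_actions, d))                   # features phi(s, a)

gamma = 0.9
theta = np.zeros(d)  # Q-value parameter, updated on the SLOWER timescale
w = np.zeros(d)      # policy parameter, updated on the FASTER timescale

def q_values(s, th):
    # Linear approximation: Q(s, a) = phi(s, a)^T theta
    return Phi[s] @ th

def policy_probs(s, w):
    # Softmax policy over w^T phi(s, .) -- an illustrative parameterization
    prefs = Phi[s] @ w
    p = np.exp(prefs - prefs.max())
    return p / p.sum()

s = 0
for n in range(1, 50001):
    a_n = 1.0 / n        # slower step size (for theta)
    b_n = 1.0 / n**0.6   # faster step size (for w); b_n / a_n -> infinity

    pi = policy_probs(s, w)
    a = rng.choice(n_actions, p=pi)
    s_next = rng.choice(n_states, p=P[s, a])

    # Q-learning TD error with a max over next-state actions
    delta = R[s, a] + gamma * q_values(s_next, theta).max() - Phi[s, a] @ theta

    # Faster timescale: nudge the policy toward actions with higher estimated Q
    grad_log = Phi[s, a] - pi @ Phi[s]        # softmax score function
    w += b_n * (Phi[s, a] @ theta) * grad_log

    # Slower timescale: semi-gradient Q-learning update for theta
    theta += a_n * delta * Phi[s, a]

    s = s_next
```

The key design point mirrored here is the step-size condition: both schedules are square-summable but not summable, and b_n / a_n tends to infinity, so the policy parameter w tracks a quasi-stationary equilibrium relative to the slowly moving Q-value parameter theta.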
Item Type: Article
Source: Copyright of this article belongs to Springer Nature.
Keywords: Q-Learning with Linear Function Approximation; Reinforcement Learning; Stochastic Approximation; Ordinary Differential Equation; Differential Inclusion; Multi-Stage Stochastic Shortest Path Problem
ID Code: 116496
Deposited On: 12 Apr 2021 06:02
Last Modified: 12 Apr 2021 06:02