Karmakar, Prasenjit; Bhatnagar, Shalabh (2021). On tight bounds for function approximation error in risk-sensitive reinforcement learning. Systems & Control Letters, 150, 104899. ISSN 0167-6911
Full text not available from this repository.
Official URL: http://doi.org/10.1016/j.sysconle.2021.104899
Abstract
In this letter we provide several informative tight error bounds on the use of value function approximators in the risk-sensitive cost setting for a given policy, with the cost represented via exponential utility. The novelty of our approach is that we exploit the irreducibility of the underlying Markov chain to derive new, tighter bounds using Perron–Frobenius eigenvectors, whereas earlier work relied primarily on the spectral variation bound, which holds for any matrix and hence does not make use of irreducibility. All our bounds include a perturbation term for large state spaces. We also present examples showing that the new bounds perform 90–100% better than the earlier proposed spectral variation bound.
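For context, the Perron–Frobenius theorem guarantees that an irreducible non-negative matrix has a real, simple, positive dominant eigenvalue with a strictly positive eigenvector; in the exponential-utility setting, such a matrix arises from weighting transition probabilities by exponentiated costs. The sketch below (not the paper's algorithm; the transition matrix, costs, and risk factor are hypothetical) recovers this eigenpair by power iteration:

```python
import numpy as np

def perron_frobenius(A, iters=1000, tol=1e-12):
    """Dominant eigenvalue and positive eigenvector of a
    non-negative irreducible matrix, via power iteration."""
    v = np.ones(A.shape[0])
    v /= np.linalg.norm(v)
    lam = 0.0
    for _ in range(iters):
        w = A @ v                    # one power-iteration step
        lam = np.linalg.norm(w)      # converges to the PF eigenvalue
        w /= lam
        if np.linalg.norm(w - v) < tol:
            v = w
            break
        v = w
    return lam, v

# Hypothetical 2-state example: transitions weighted by exp(risk * cost)
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])           # irreducible transition matrix
c = np.array([1.0, 2.0])             # per-state costs (assumed)
A = np.diag(np.exp(0.5 * c)) @ P     # risk-sensitivity factor 0.5 (assumed)
lam, v = perron_frobenius(A)
```

Starting from a positive vector keeps every iterate positive, so the method converges to the Perron–Frobenius eigenpair rather than some other eigenvector.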
Item Type: | Article
---|---
Source: | Copyright of this article belongs to Elsevier B.V.
Keywords: | Risk-Sensitive Reinforcement Learning; Perron–Frobenius Eigenvalue; Stochastic Systems; Stochastic Optimal Control; Eigenvalue Perturbation; Function Approximation
ID Code: | 116413
Deposited On: | 12 Apr 2021 05:49
Last Modified: | 12 Apr 2021 05:49