Borkar, Vivek S.; Konda, Vijaymohan R. (1997) The actor-critic algorithm as multi-time-scale stochastic approximation. Sadhana (Academy Proceedings in Engineering Sciences), 22 (4), pp. 525-543. ISSN 0256-2499
Official URL: http://www.ias.ac.in/j_archive/sadhana/22/4/525-54...
Related URL: http://dx.doi.org/10.1007/BF02745577
Abstract
The actor-critic algorithm of Barto and others for simulation-based optimization of Markov decision processes is cast as a two-time-scale stochastic approximation. Convergence analysis, approximation issues, and an example are studied.
Item Type: | Article |
---|---|
Source: | Copyright of this article belongs to Indian Academy of Sciences. |
Keywords: | Actor-critic Algorithm; Stochastic Approximation; Markov Decision Processes; Simulation-based Algorithms; Policy Iteration |
ID Code: | 5376 |
Deposited On: | 18 Oct 2010 09:05 |
Last Modified: | 16 May 2016 15:53 |