Borkar, Vivek S. ; Konda, Vijaymohan R. (1997) The actor-critic algorithm as multi-time-scale stochastic approximation. Sadhana, 22 (4). pp. 525-543. ISSN 0256-2499
Official URL: http://www.ias.ac.in/j_archive/sadhana/22/4/525-54...
Related URL: http://dx.doi.org/10.1007/BF02745577
Abstract
The actor-critic algorithm of Barto and others for simulation-based optimization of Markov decision processes is cast as a two-time-scale stochastic approximation. Convergence analysis, approximation issues and an example are studied.
Item Type: | Article |
---|---|
Source: | Copyright of this article belongs to Indian Academy of Sciences. |
Keywords: | Actor-Critic Algorithm; Stochastic Approximation; Markov Decision Processes; Simulation-Based Algorithms; Policy Iteration |
ID Code: | 81437 |
Deposited On: | 06 Feb 2012 05:01 |
Last Modified: | 18 May 2016 22:59 |