Borkar, Vivek S.; Konda, Vijaymohan R. (1997) The actor-critic algorithm as multi-time-scale stochastic approximation. Sadhana, 22 (4). pp. 525-543. ISSN 0256-2499
Official URL: http://www.ias.ac.in/j_archive/sadhana/22/4/525-54...
Related URL: http://dx.doi.org/10.1007/BF02745577
Abstract
The actor-critic algorithm of Barto and others for simulation-based optimization of Markov decision processes is cast as a two-time-scale stochastic approximation. Convergence analysis, approximation issues and an example are studied.
| Item Type: | Article | 
|---|---|
| Source: | Copyright of this article belongs to Indian Academy of Sciences. | 
| Keywords: | Actor-Critic Algorithm; Stochastic Approximation; Markov Decision Processes; Simulation-Based Algorithms; Policy Iteration | 
| ID Code: | 81437 | 
| Deposited On: | 06 Feb 2012 05:01 | 
| Last Modified: | 18 May 2016 22:59 | 

