The actor-critic algorithm as multi-time-scale stochastic approximation

Borkar, Vivek S.; Konda, Vijaymohan R. (1997) The actor-critic algorithm as multi-time-scale stochastic approximation. Sadhana (Academy Proceedings in Engineering Sciences), 22 (4). pp. 525-543. ISSN 0256-2499

Official URL: http://www.ias.ac.in/j_archive/sadhana/22/4/525-54...

Related URL: http://dx.doi.org/10.1007/BF02745577

Abstract

The actor-critic algorithm of Barto and others for simulation-based optimization of Markov decision processes is cast as a two-time-scale stochastic approximation. Convergence analysis, approximation issues, and an example are studied.

Item Type: Article
Source: Copyright of this article belongs to Indian Academy of Sciences.
Keywords: Actor-critic Algorithm; Stochastic Approximation; Markov Decision Processes; Simulation-based Algorithms; Policy Iteration
ID Code: 5376
Deposited On: 18 Oct 2010 09:05
Last Modified: 16 May 2016 15:53
