The actor-critic algorithm as multi-time-scale stochastic approximation

Borkar, Vivek S. ; Konda, Vijaymohan R. (1997) The actor-critic algorithm as multi-time-scale stochastic approximation Sadhana, 22 (4). pp. 525-543. ISSN 0256-2499


Official URL: http://www.ias.ac.in/j_archive/sadhana/22/4/525-54...

Related URL: http://dx.doi.org/10.1007/BF02745577

Abstract

The actor-critic algorithm of Barto and others for simulation-based optimization of Markov decision processes is cast as a two-time-scale stochastic approximation. Convergence analysis, approximation issues and an example are studied.
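
To illustrate what "two time scale" means in general, here is a minimal sketch of a coupled stochastic approximation on a toy problem. It is not the algorithm analyzed in the article: the function name two_time_scale_sa, the drift terms, the noise model and the step sizes 1/n^0.6 and 1/n are all assumptions chosen for illustration. The fast iterate y plays a role loosely analogous to a critic that tracks a quantity determined by the current slow iterate, and the slow iterate x plays a role loosely analogous to an actor that is updated as if y had already settled.

```python
import random


def two_time_scale_sa(num_steps=20000, seed=0, noise_std=0.1):
    """Toy two-time-scale stochastic approximation (illustrative only).

    Hypothetical example, not the recursion studied in the article.
    The coupled iterates should approach the point (x, y) = (1, 1).
    """
    rng = random.Random(seed)
    x, y = 0.0, 0.0  # x: slow iterate, y: fast iterate
    for n in range(1, num_steps + 1):
        a_n = 1.0 / n ** 0.6  # fast step size (assumed schedule)
        b_n = 1.0 / n         # slow step size; b_n / a_n -> 0 as n grows
        # Fast recursion: for a nearly frozen x, y tracks the root of x - y = 0.
        y_next = y + a_n * ((x - y) + rng.gauss(0.0, noise_std))
        # Slow recursion: treats y as nearly equilibrated and pushes it toward 1.
        x_next = x + b_n * ((1.0 - y) + rng.gauss(0.0, noise_std))
        x, y = x_next, y_next
    return x, y


if __name__ == "__main__":
    x_final, y_final = two_time_scale_sa()
    print(f"slow iterate x = {x_final:.3f}, fast iterate y = {y_final:.3f}")
```

Both step-size sequences are non-summable but square-summable, and their ratio b_n / a_n tends to zero, which is the usual way the separation between the fast and slow recursions is expressed.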

Item Type: Article
Source: Copyright of this article belongs to Indian Academy of Sciences.
Keywords: Actor-Critic Algorithm; Stochastic Approximation; Markov Decision Processes; Simulation-Based Algorithms; Policy Iteration
ID Code: 81437
Deposited On: 06 Feb 2012 05:01
Last Modified: 18 May 2016 22:59
