Borkar, V. S. (2000). A learning algorithm for discrete-time stochastic control. Probability in the Engineering and Informational Sciences, 14(2), pp. 243-258. ISSN 0269-9648
Full text not available from this repository.
Official URL: http://portal.acm.org/citation.cfm?id=984613.98462...
Related URL: http://dx.doi.org/10.1017/S0269964800142081
Abstract
A simulation-based algorithm for learning good policies for a discrete-time stochastic control process with unknown transition law is analyzed when the state and action spaces are compact subsets of Euclidean spaces. This extends the Q-learning scheme for discrete state/action problems along the lines of Baker [4]. Almost sure convergence is proved under suitable conditions.
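For context, the discrete state/action Q-learning scheme that the paper generalizes can be sketched as follows. This is a minimal illustrative sketch of standard tabular Q-learning driven by a simulator with unknown transition law, not the paper's continuous-space algorithm; the simulator interface `env_step`, the exploration rule, and all parameter values are assumptions for illustration only.

```python
import numpy as np

def q_learning(env_step, n_states, n_actions,
               episodes=500, horizon=100,
               alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning sketch.

    env_step(s, a) -> (next_state, reward): a simulator standing in for
    the unknown transition law (hypothetical interface).
    """
    rng = np.random.default_rng(seed)
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s = int(rng.integers(n_states))          # arbitrary start state
        for _ in range(horizon):
            # epsilon-greedy exploration over the finite action set
            if rng.random() < epsilon:
                a = int(rng.integers(n_actions))
            else:
                a = int(np.argmax(Q[s]))
            s_next, r = env_step(s, a)
            # stochastic-approximation step toward the Bellman target
            Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])
            s = s_next
    return Q
```

In the discrete case the table entry Q(s, a) is updated directly; the paper's setting replaces the finite table with compact Euclidean state and action spaces, where such pointwise updates no longer apply and a different, simulation-based construction with its own convergence analysis is required.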
Item Type: Article
Source: Copyright of this article belongs to Cambridge University Press.
ID Code: 5317
Deposited On: 18 Oct 2010 08:38
Last Modified: 20 May 2011 09:07