Thathachar, M. A. L.; Ramachandran, K. M. (1984) Asymptotic behaviour of a learning algorithm. International Journal of Control, 39 (4). pp. 827-838. ISSN 0020-7179
Full text not available from this repository.
Official URL: http://www.tandfonline.com/doi/abs/10.1080/0020717...
Related URL: http://dx.doi.org/10.1080/00207178408933209
Abstract
The paper considers a learning automaton operating in a stationary random environment. The automaton has multiple actions and updates its action probability vector according to the linear reward-ε-penalty (L_{R-εP}) algorithm. Using weak convergence concepts, it is shown that, for large time and small values of the algorithm's parameters, the evolution of the action probability vector can be represented by a Gauss-Markov diffusion.
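For readers unfamiliar with the scheme named in the abstract, the following is a minimal sketch of one commonly stated textbook form of the linear reward-ε-penalty update, in which the penalty step size b is a small fraction ε of the reward step size a. The full text is not available here, so the paper's exact formulation and notation may differ. The function names (`lr_eps_p_step`, `simulate`), the parameter names (`a`, `b`, `eps`, `penalty_probs`) and all numeric values are assumptions chosen for illustration only.

```python
import random


def lr_eps_p_step(p, action, penalized, a, b):
    """One update of the action probability vector p (hypothetical helper).

    Assumed textbook form: on reward, move probability mass toward the
    chosen action with step a; on penalty, move it away with step b.
    """
    r = len(p)
    if penalized:
        # Penalty: shrink the chosen action, spread b evenly over the others.
        return [
            (1 - b) * pj if j == action else b / (r - 1) + (1 - b) * pj
            for j, pj in enumerate(p)
        ]
    # Reward: grow the chosen action, shrink the others proportionally.
    return [
        pj + a * (1 - pj) if j == action else (1 - a) * pj
        for j, pj in enumerate(p)
    ]


def simulate(penalty_probs, a=0.01, eps=0.1, steps=20000, seed=0):
    """Run the automaton in a stationary environment (illustrative values).

    penalty_probs[i] is the assumed fixed probability that action i
    receives a penalty response from the environment.
    """
    rng = random.Random(seed)
    r = len(penalty_probs)
    p = [1.0 / r] * r          # start from the uniform distribution
    b = eps * a                # penalty step is small relative to reward step
    for _ in range(steps):
        action = rng.choices(range(r), weights=p)[0]
        penalized = rng.random() < penalty_probs[action]
        p = lr_eps_p_step(p, action, penalized, a, b)
    return p


if __name__ == "__main__":
    # Two actions; action 0 is penalized less often than action 1.
    print(simulate([0.2, 0.8]))
```

With small a and ε, the probability vector in this sketch tends to settle near the action with the lowest penalty probability and fluctuate around that point, which is the regime in which the abstract describes a diffusion approximation.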
| Item Type: | Article |
|---|---|
| Source: | Copyright of this article belongs to Taylor and Francis Group. |
| ID Code: | 51369 |
| Deposited On: | 28 Jul 2011 11:56 |
| Last Modified: | 28 Jul 2011 11:56 |