Thathachar, M. A. L.; Phansalkar, V. V. (1995) Learning the global maximum with parameterized learning automata. IEEE Transactions on Neural Networks, 6 (2). pp. 398-406. ISSN 1045-9227
Full text not available from this repository.
Official URL: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumb...
Related URL: http://dx.doi.org/10.1109/72.363475
Abstract
A feedforward network composed of units of teams of parameterized learning automata is considered as a model of a reinforcement learning system. The internal state vector of each learning automaton is updated using an algorithm consisting of a gradient-following term and a random perturbation term. It is shown that the algorithm weakly converges to a solution of the Langevin equation, implying that the algorithm globally maximizes an appropriate function. The algorithm is decentralized, and the units do not have any information exchange during updating. Simulation results on common payoff games and pattern recognition problems show that reasonable rates of convergence can be obtained.
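The combination of a gradient-following term and a random perturbation term mentioned in the abstract can be illustrated with a generic discretized Langevin iteration. The sketch below is not the paper's algorithm, which updates the internal state vectors of learning automata from a reinforcement signal. It only shows, on a hypothetical one-dimensional objective, why added noise can help an iterate leave a local maximum. The objective `f`, the function `run`, and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def f(x):
    """Hypothetical objective: local maximum near x = -0.96, global maximum near x = 1.04."""
    return -(x * x - 1.0) ** 2 + 0.3 * x


def grad_f(x):
    """Derivative of f."""
    return -4.0 * x * (x * x - 1.0) + 0.3


def run(x0, sigma, steps=200_000, step_size=0.01, seed=0):
    """Euler-Maruyama discretization of dx = f'(x) dt + sigma dW.

    Returns the final iterate and the fraction of steps spent at x > 0,
    i.e. on the side of the barrier that contains the global maximum.
    """
    rng = random.Random(seed)
    x = x0
    steps_right = 0
    for _ in range(steps):
        # Gradient-following term plus a Gaussian perturbation term.
        noise = rng.gauss(0.0, 1.0)
        x += step_size * grad_f(x) + sigma * math.sqrt(step_size) * noise
        if x > 0.0:
            steps_right += 1
    return x, steps_right / steps
```

Calling `run(x0=-1.2, sigma=0.0)` performs plain gradient ascent, which settles at the nearby local maximum and never reaches x > 0. Calling `run(x0=-1.2, sigma=0.6)` adds the perturbation term, so the iterate can cross the barrier and spend time near the global maximum. How much time it spends there depends on the noise level and the seed.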
| Item Type: | Article |
|---|---|
| Source: | Copyright of this article belongs to IEEE. |
| ID Code: | 51329 |
| Deposited On: | 28 Jul 2011 15:01 |
| Last Modified: | 28 Jul 2011 15:01 |