BORKAR, VIVEK S.; JAIN, ANKUSH V. (2018) Reinforcement learning, Sequential Monte Carlo and the EM algorithm. Sadhana (Academy Proceedings in Engineering Sciences), 43(8). ISSN 0256-2499
Full text not available from this repository.
Official URL: http://doi.org/10.1007/s12046-018-0889-8
Abstract
Using the expression for the unnormalized nonlinear filter for a hidden Markov model, we develop a dynamic-programming-like backward recursion for the filter. This is combined with some ideas from reinforcement learning and a conditional version of importance sampling in order to develop a scheme based on stochastic approximation for estimating the desired conditional expectation. This is then extended to a smoothing problem. Applying these ideas to the EM algorithm, a reinforcement learning scheme is developed for estimating the partially observed log-likelihood function. A stochastic approximation scheme maximizes this function over the unknown parameter. The two procedures are performed on two different time scales, emulating the alternating ‘expectation’ and ‘maximization’ operations of the EM algorithm. We also extend this to a continuous state space problem. Numerical results are presented in support of our schemes.
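The two-timescale idea described in the abstract can be illustrated on a deliberately simple latent-variable model rather than the paper's HMM setting. The sketch below is an assumption-laden toy: observations follow y = z·θ* + N(0, 1) with a hidden fair coin z ∈ {0, 1}; a fast iterate `g` averages the E-step quantity (the posterior responsibility times the complete-data gradient), while a slow iterate `theta` performs the M-step ascent. The model, step-size exponents, and all function names are illustrative choices, not the authors' scheme.

```python
import math
import random

def responsibility(y, theta):
    """Posterior P(z = 1 | y) for the toy mixture y = z*theta + N(0,1), P(z=1) = 0.5."""
    # Log-densities of the two components, up to a shared constant.
    l1 = -0.5 * (y - theta) ** 2
    l0 = -0.5 * y ** 2
    m = max(l0, l1)  # subtract the max for numerical stability
    e1, e0 = math.exp(l1 - m), math.exp(l0 - m)
    return e1 / (e0 + e1)

def two_timescale_em(theta_star=2.0, n_iters=200_000, seed=0):
    """Two-timescale stochastic approximation emulating EM:
    the fast iterate g tracks the averaged E-step quantity
    r(y; theta) * (y - theta), and the slow iterate theta ascends it."""
    rng = random.Random(seed)
    theta, g = 0.0, 0.0
    for n in range(1, n_iters + 1):
        z = 1 if rng.random() < 0.5 else 0
        y = z * theta_star + rng.gauss(0.0, 1.0)
        a_n = n ** -0.6  # fast step size: 'expectation' averaging
        b_n = n ** -0.9  # slow step size (b_n << a_n): 'maximization' ascent
        g += a_n * (responsibility(y, theta) * (y - theta) - g)
        theta += b_n * g
    return theta
```

Because b_n/a_n → 0, the fast iterate sees the slow one as quasi-static, which is the sense in which the two loops emulate the alternating E and M operations; with the stationarity condition E[r(y; θ)(y − θ)] = 0 holding at θ = θ*, the slow iterate settles near the true parameter.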
Item Type: Article
Source: Copyright of this article belongs to Indian Academy of Sciences.
ID Code: 135153
Deposited On: 19 Jan 2023 10:13
Last Modified: 19 Jan 2023 10:13