Identification and adaptive control of Markov chains

Borkar, Vivek; Varaiya, Pravin (1982) Identification and adaptive control of Markov chains. SIAM Journal on Control and Optimization, 20 (4), pp. 470-489. ISSN 0363-0129

Full text not available from this repository.

Official URL: http://link.aip.org/link/?SJCODC/20/470/1

Abstract

Consider a countable state controlled Markov chain whose transition probability is specified up to an unknown parameter $\alpha$ taking values in a compact metric space $A$. To each $\alpha$ is associated a prespecified stationary control law $\zeta(\alpha)$. The adaptive control law selects at each time $t$ the control action $\zeta(\alpha_t, x_t)$, where $x_t$ is the state and $\alpha_t$ is the maximum likelihood estimate of $\alpha$. The asymptotic behavior of this control scheme is investigated for the cases when the true parameter value $\alpha_0$ does or does not belong to $A$, and for the case when $\zeta$ is chosen to minimize an average cost criterion. The analysis uses an appropriate extension of the notions of recurrence to nonstationary Markov chains.
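
The adaptive loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' construction: it assumes a two-state chain, a three-point parameter set standing in for the compact space A, and a made-up transition model and control law. The names `PARAMS`, `transition_prob`, `zeta`, `mle`, and `run` are all hypothetical. At each step the sketch recomputes the maximum likelihood estimate from the observed transitions and applies the control law indexed by that estimate. It makes no claim about where the estimate ends up.

```python
import math
import random

# Hypothetical finite stand-in for the compact parameter space A.
PARAMS = (0.2, 0.5, 0.8)


def transition_prob(x, y, u, alpha):
    """Assumed toy model: P(next state = y | state x, action u, parameter alpha).

    The probability of moving to state 1 is alpha under action 1 and
    1 - alpha under action 0; the current state x is ignored here.
    """
    p_one = alpha if u == 1 else 1.0 - alpha
    return p_one if y == 1 else 1.0 - p_one


def zeta(alpha, x):
    """Assumed prespecified stationary control law zeta(alpha, x)."""
    return 1 if (alpha >= 0.5) == (x == 0) else 0


def mle(history, params=PARAMS):
    """Maximum likelihood estimate over params given (x, u, y) transitions."""
    def loglik(alpha):
        return sum(math.log(transition_prob(x, y, u, alpha))
                   for x, u, y in history)
    # With an empty history all candidates tie and the first one is returned.
    return max(params, key=loglik)


def run(true_alpha, steps, seed=0):
    """Simulate the closed loop: estimate, act on the estimate, observe."""
    rng = random.Random(seed)
    x, history, estimates = 0, [], []
    for _ in range(steps):
        alpha_t = mle(history)          # current estimate alpha_t
        u = zeta(alpha_t, x)            # action zeta(alpha_t, x_t)
        p_one = transition_prob(x, 1, u, true_alpha)
        y = 1 if rng.random() < p_one else 0
        history.append((x, u, y))
        estimates.append(alpha_t)
        x = y
    return estimates, history
```

Calling `run(0.8, 200)` returns the sequence of estimates together with the observed transitions. Because the data are generated under the adaptive law itself, the sequence of estimates need not settle on the true parameter; that asymptotic behavior is what the paper analyzes.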

Item Type: Article
Source: Copyright of this article belongs to the Society for Industrial and Applied Mathematics.
ID Code: 81412
Deposited On: 06 Feb 2012 04:26
Last Modified: 06 Feb 2012 04:26
