Panigrahi, J.R. and Bhatnagar, S. (2004) Hierarchical decision making in semiconductor fabs using multi-time scale Markov decision processes. In: 43rd IEEE Conference on Decision and Control (CDC) (IEEE Cat. No. 04CH37601), 14-17 Dec. 2004, Nassau, Bahamas.
Full text not available from this repository.
Official URL: http://doi.org/10.1109/CDC.2004.1429441
Abstract
Decision making in semiconductor fabs occurs on different timescales. While decisions on buying and discarding machines are made on the slower timescale, those that deal with capacity allocation and switchover are made on the faster timescale. We formulate this problem along the lines of a recently developed multi-time scale Markov decision process (MMDP) framework and present numerical experiments in which we use TD(0) and Q-learning algorithms with a linear approximation architecture, comparing these with the policy iteration algorithm. We present numerical experiments under two different scenarios. In the first, transition probabilities are computed explicitly and used in the algorithms. In the second, transitions are simulated without explicitly computing the transition probabilities. We observe that TD(0) requires less computation than Q-learning. Moreover, algorithms that use simulated transitions require significantly less computation than their counterparts that compute transition probabilities.
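To illustrate the simulation-based approach the abstract describes, the following is a minimal sketch of TD(0) policy evaluation with a linear (here, one-hot) approximation architecture on a small illustrative MDP. The MDP, its rewards, and all variable names are assumptions for the example and do not come from the paper's fab model; the key point is that the TD(0) update uses only simulated transitions, never the transition matrix itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 2-state, 2-action MDP (not the paper's fab model).
n_states, n_actions = 2, 2
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.1, 0.9]]])  # P[s, a, s'] transition probabilities
R = np.array([[1.0, 0.0], [0.0, 2.0]])    # R[s, a] one-step rewards
gamma = 0.9

def features(s):
    # One-hot features, so the linear approximation is exact here;
    # in general phi(s) would be a lower-dimensional feature vector.
    phi = np.zeros(n_states)
    phi[s] = 1.0
    return phi

def td0_policy_eval(policy, n_steps=20000, alpha=0.05):
    """Evaluate a fixed policy with TD(0), learning from simulated
    transitions: P is used only to sample, never inside the update."""
    w = np.zeros(n_states)
    s = 0
    for _ in range(n_steps):
        a = policy[s]
        s_next = rng.choice(n_states, p=P[s, a])  # simulate one transition
        r = R[s, a]
        # TD(0) update on the linear value estimate v(s) = w . phi(s)
        delta = r + gamma * w @ features(s_next) - w @ features(s)
        w += alpha * delta * features(s)
        s = s_next
    return w

w = td0_policy_eval(policy=[0, 1])
print(w)
```

Avoiding the explicit transition matrix is what makes the simulation-based variants cheaper in the paper's second scenario: only sampled next states are needed, which matters when the fab's state space makes computing and storing transition probabilities expensive.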
Item Type: Conference or Workshop Item (Paper)
Source: Copyright of this article belongs to the Institute of Electrical and Electronics Engineers.
ID Code: 116741
Deposited On: 12 Apr 2021 07:31
Last Modified: 12 Apr 2021 07:31