Yao, Hengshuai ; Bhatnagar, Shalabh ; Szepesvári, Csaba (2009) Temporal Difference Learning by Direct Preconditioning In: Multidisciplinary Symposium on Reinforcement Learning (MSRL), June 18-19, 2009, Montreal, Canada.
Full text not available from this repository.
Official URL: http://msrl09.rl-community.org/
Abstract
We propose a new class of algorithms that directly precondition the TD update. We then focus on a new preconditioned algorithm and prove its convergence. Empirical results on the new algorithm shall be presented in a detailed version of this paper.
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Source: | Copyright 2009 by the author(s)/owner(s). |
ID Code: | 116709 |
Deposited On: | 12 Apr 2021 07:25 |
Last Modified: | 12 Apr 2021 07:25 |