A model based search method for prediction in model-free Markov decision process

Joseph, Ajin George; Bhatnagar, Shalabh (2017). A model based search method for prediction in model-free Markov decision process. In: International Joint Conference on Neural Networks (IJCNN), 14-19 May 2017, Anchorage, AK.

Full text not available from this repository.

Official URL: http://doi.org/10.1109/IJCNN.2017.7965851


Abstract

In this paper, we provide a new algorithm for the problem of prediction in the model-free MDP setting, i.e., estimating the value function of a given policy using a linear function approximation architecture, with memory and computation costs that scale quadratically in the size of the feature set. The algorithm is a multi-timescale variant of the popular cross-entropy (CE) method, a model-based search method for finding the global optimum of a real-valued function. This is the first time a model-based search method has been used for the prediction problem. A proof of convergence using the ODE method is provided. The theoretical results are supplemented with experimental comparisons, in which the algorithm performs well fairly consistently across many benchmark problems.
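For readers unfamiliar with the CE method named in the abstract, the following is a minimal sketch of the basic batch version with a Gaussian sampling distribution. It is not the multi-timescale algorithm of the paper, whose details are not given in this record. The function name `cross_entropy_maximize`, its parameters, and the default values are all hypothetical choices made for illustration.

```python
import numpy as np


def cross_entropy_maximize(f, dim, n_samples=200, elite_frac=0.1,
                           n_iters=50, seed=0):
    """Basic batch cross-entropy search for a maximizer of f over R^dim.

    Illustrative sketch only: a plain Gaussian CE loop, NOT the paper's
    multi-timescale variant. All names and defaults are assumptions.
    """
    rng = np.random.default_rng(seed)
    mean = np.zeros(dim)            # initial mean of the sampling model
    cov = 25.0 * np.eye(dim)        # wide initial covariance (assumed)
    n_elite = max(2, int(n_samples * elite_frac))

    for _ in range(n_iters):
        # 1. Sample candidate solutions from the current Gaussian model.
        samples = rng.multivariate_normal(mean, cov, size=n_samples)
        # 2. Score each candidate with the objective.
        scores = np.array([f(x) for x in samples])
        # 3. Keep the best-scoring (elite) candidates.
        elite = samples[np.argsort(scores)[-n_elite:]]
        # 4. Refit the model to the elite set; the small diagonal term
        #    keeps the covariance positive definite.
        mean = elite.mean(axis=0)
        cov = np.cov(elite, rowvar=False) + 1e-6 * np.eye(dim)

    return mean
```

In a prediction setting, `f` could for instance be the negative of an error measure of a linear value estimate in a weight vector of length `dim`; that choice of objective is an assumption here, not something stated in the abstract. The sketch stores a full `dim` by `dim` covariance matrix, so its memory grows quadratically with `dim`.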

Item Type: Conference or Workshop Item (Paper)
Source: Copyright of this article belongs to Institute of Electrical and Electronics Engineers.
ID Code: 116645
Deposited On: 12 Apr 2021 07:17
Last Modified: 12 Apr 2021 07:17
