Residual Value Iteration Algorithm Based on Function Approximation
1  Institute of Electronics and Information Engineering, Suzhou University of Science and Technology, Suzhou, Jiangsu
2  Jiangsu Key Laboratory of Intelligent Building Energy Efficiency, Suzhou University of Science and Technology, Suzhou, Jiangsu
3  Suzhou Key Laboratory of Mobile Networking and Applied Technologies, Suzhou University of Science and Technology, Suzhou, Jiangsu

Abstract:

To address the unstable and slow convergence of the traditional Value Iteration algorithm, we propose an improved Residual Value Iteration algorithm based on Function Approximation (FARVI). The algorithm combines the traditional Value Iteration algorithm with the Value Iteration algorithm based on the Bellman residual, introducing a weight factor and constructing a new rule for updating the value function parameter vector. In theory, the new update rule guarantees convergence and thus resolves the unstable convergence of the traditional Value Iteration algorithm. In addition, the algorithm introduces a new factor, named the forgetting factor, to speed up convergence. The proposed algorithm, the Value Iteration algorithm, and the LSPI algorithm are applied to the classic Grid World problem. The experimental results show that the FARVI algorithm performs well and is robust across problems of different scales.
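
The weighted combination described in the abstract can be illustrated with a minimal sketch. It is not the paper's exact update rule, which the abstract does not state. It assumes a linear value function, one-hot features, and a small deterministic 1-D corridor standing in for Grid World. The names `mixed_residual_update`, `solve_corridor`, and the weight `beta` are hypothetical, and the additional factor used to speed up convergence is not modeled because the abstract does not define it.

```python
import numpy as np


def mixed_residual_update(theta, phi_s, phi_next, reward, gamma, alpha, beta):
    """Blend the direct update and the Bellman-residual gradient update.

    beta = 0 gives the direct (traditional) update, beta = 1 gives the pure
    residual-gradient update. `beta` is a hypothetical name for the weight.
    """
    # Bellman residual of the current linear estimate V(s) = phi(s) . theta
    delta = reward + gamma * (phi_next @ theta) - phi_s @ theta
    # Weighted mix of the two update directions
    direction = phi_s - beta * gamma * phi_next
    return theta + alpha * delta * direction


def solve_corridor(n_states=5, gamma=0.9, alpha=0.1, beta=0.5, sweeps=5000):
    """Value iteration with function approximation on a 1-D corridor.

    Assumed toy problem: states 0..n_states-1, the last state is a terminal
    goal, entering it yields reward 1, every other move yields reward 0.
    """
    goal = n_states - 1
    features = np.eye(n_states)
    features[goal] = 0.0  # terminal state has value 0 by construction
    theta = np.zeros(n_states)

    for _ in range(sweeps):
        for s in range(goal):
            best_q, best_next, best_r = -np.inf, s, 0.0
            for step in (-1, 1):  # actions: move left, move right
                s_next = min(max(s + step, 0), goal)
                r = 1.0 if s_next == goal else 0.0
                q = r + gamma * (features[s_next] @ theta)
                if q > best_q:
                    best_q, best_next, best_r = q, s_next, r
            # Update toward the greedy (max) Bellman backup
            theta = mixed_residual_update(
                theta, features[s], features[best_next], best_r,
                gamma, alpha, beta,
            )
    return theta
```

Calling `solve_corridor()` returns one value per state. Under these assumptions the update has a zero residual only at the optimal value function, where a state `d` moves away from the goal has value `gamma ** (d - 1)`. Changing `beta` shifts the rule between the direct and the residual-gradient extremes.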

Keywords: Reinforcement Learning; Value Iteration; Function Approximation; Gradient Descent; Bellman Residual