A Tractable Algorithm for Finite-Horizon Continuous Reinforcement Learning

Gampa, Phanideep and Kondamudi, Sairam Satwik and Kailasam, Lakshmanan (2019) A Tractable Algorithm for Finite-Horizon Continuous Reinforcement Learning. In: 2nd International Conference on Intelligent Autonomous Systems, ICoIAS, 28 February-2 March 2019, Singapore.

Full text not available from this repository.

Abstract

We consider the finite-horizon continuous reinforcement learning problem. Our contribution is three-fold. First, we give a tractable algorithm based on optimistic value iteration for the problem. Second, we give a lower bound on regret of order $\Omega(T^{2/3})$ for any algorithm that discretizes the state space, improving the previous regret bound of $\Omega(T^{1/2})$ of Ortner and Ryabko [1] for the same problem. Third, under the assumption that the rewards and transitions are Hölder continuous, we show that the upper bound on the discretization error is $\text{const} \cdot L n^{-\alpha} T$. Finally, we give some simple experiments to validate our propositions.
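The ideas named in the abstract (state-space discretization, a bonus-driven optimistic value iteration, and a Hölder-type discretization error) can be illustrated with a minimal sketch. This is not the authors' algorithm: the full text is not available here, so everything below is an assumption made for illustration. In particular, the state space is assumed to be the interval [0, 1], `n` is read as the number of equal-width intervals, `L` and `alpha` as the Hölder constant and exponent, and the bonus `bonus_scale / sqrt(count)` is a generic count-based choice. All function names are hypothetical.

```python
import numpy as np


def discretize(state, n):
    """Map a state in [0, 1] to the index of one of n equal-width intervals."""
    return min(int(state * n), n - 1)


def holder_error_bound(L, alpha, n, T):
    """Evaluate L * n**(-alpha) * T, the abstract's bound without its constant.

    Assumed reading: L is the Hoelder constant, alpha the exponent,
    n the number of intervals, and T the number of steps.
    """
    return L * n ** (-alpha) * T


def optimistic_value_iteration(r_hat, p_hat, counts, H, bonus_scale=1.0):
    """Finite-horizon backward induction with a count-based optimism bonus.

    r_hat:  (S, A) estimated mean rewards, assumed to lie in [0, 1]
    p_hat:  (S, A, S) estimated transition probabilities
    counts: (S, A) visit counts of each discretized state-action pair
    H:      horizon length
    """
    S, A = r_hat.shape
    # Generic bonus (an assumption): shrinks as a pair is visited more often.
    bonus = bonus_scale / np.sqrt(np.maximum(counts, 1))
    V = np.zeros((H + 1, S))           # V[H] = 0 is the terminal value
    policy = np.zeros((H, S), dtype=int)
    for h in range(H - 1, -1, -1):
        # (S, A, S) @ (S,) gives the expected next-step value, shape (S, A).
        Q = r_hat + bonus + p_hat @ V[h + 1]
        # With rewards in [0, 1], no value can exceed the remaining steps.
        Q = np.minimum(Q, H - h)
        policy[h] = Q.argmax(axis=1)
        V[h] = Q.max(axis=1)
    return V, policy
```

In such a sketch, a continuous state would first be passed through `discretize`, and the empirical estimates `r_hat`, `p_hat`, and `counts` would be indexed by the resulting interval. A finer grid (larger `n`) shrinks `holder_error_bound` but leaves more state-action pairs to explore, which is the trade-off that a regret analysis of discretization-based methods has to balance.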

Item Type: Conference or Workshop Item (Paper)
Uncontrolled Keywords: Bonus, Continuous State Space, Finite Horizon, Markov Decision Process (MDP), Regret, Reinforcement Learning
Subjects: Computer science
Divisions: Department of Computer Science & Engineering
Depositing User: Team Library
Date Deposited: 05 Sep 2019 08:59
Last Modified: 05 Sep 2019 08:59
URI: http://raiith.iith.ac.in/id/eprint/6123
Publisher URL: http://doi.org/10.1109/ICoIAS.2019.00018