An Information-Theoretic Analysis of Return Maximization in Reinforcement Learning

URI http://harp.lib.hiroshima-u.ac.jp/hiroshima-cu/metadata/12385
Title
An Information-Theoretic Analysis of Return Maximization in Reinforcement Learning
Author
Name IWATA Kazunori
Reading イワタ カズノリ
Alternate name 岩田 一貴
Subject
Reinforcement learning
Stochastic sequential decision process
Information theory
Asymptotic equipartition property
Abstract

We present a general analysis of return maximization in reinforcement learning. The analysis does not require the stochastic sequential decision processes of reinforcement learning to be Markovian, stationary, or ergodic. Instead, it assumes the asymptotic equipartition property, which is fundamental to information theory, and thereby offers a substantially different view from that in the literature. As our main results, we show that return maximization is achieved by the overlap of the typical and best sequence sets, and we present a class of stochastic sequential decision processes that satisfies the necessary condition for return maximization. We also describe several examples of best sequences, in terms of return maximization, within this class.

Description Peer Reviewed
Journal Title
Neural Networks
Volume
24
Issue
10
Spage
1074
Epage
1081
Published Date
2011-12
Publisher
Elsevier
ISSN
0893-6080
NCID
AA10680676
AA11540311
DOI
10.1016/j.neunet.2011.05.002
Language
eng
NIIType
Journal Article
Text Version
Author's version
Rights
Copyright © 2011 Elsevier Ltd. All rights reserved
This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/
Set
hiroshima-cu