An Information-Theoretic Analysis of Return Maximization in Reinforcement Learning

URI http://harp.lib.hiroshima-u.ac.jp/hiroshima-cu/metadata/12385
Title
An Information-Theoretic Analysis of Return Maximization in Reinforcement Learning
Author
Name IWATA Kazunori
Reading (kana) イワタ カズノリ
Alternative name 岩田 一貴
Keywords
Reinforcement learning
Stochastic sequential decision process
Information theory
Asymptotic equipartition property
Abstract

We present a general analysis of return maximization in reinforcement learning. This analysis does not require assumptions of Markovianity, stationarity, and ergodicity for the stochastic sequential decision processes of reinforcement learning. Instead, our analysis assumes the asymptotic equipartition property fundamental to information theory, providing a substantially different view from that in the literature. As our main results, we show that return maximization is achieved by the overlap of typical and best sequence sets, and we present a class of stochastic sequential decision processes with the necessary condition for return maximization. We also describe several examples of best sequences in terms of return maximization in the class of stochastic sequential decision processes, which satisfy the necessary condition.

Journal name
Neural Networks
Volume
24
Issue
10
Start page
1074
End page
1081
Publication date
2011-12
Publisher
Elsevier
ISSN
08936080
NCID
AA10680676
AA11540311
DOI
10.1016/j.neunet.2011.05.002
Text language
English
Resource type
Journal article
Author version flag
Author version
Rights
Copyright © 2011 Elsevier Ltd. All rights reserved
This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/
Category
hiroshima-cu