Extending the Peak Bandwidth of Parameters for Softmax Selection in Reinforcement Learning

URI http://harp.lib.hiroshima-u.ac.jp/hiroshima-cu/metadata/12387
Title
Extending the Peak Bandwidth of Parameters for Softmax Selection in Reinforcement Learning
Author
Name IWATA Kazunori
Reading (kana) イワタ カズノリ
Alternate name 岩田 一貴
Subject
Asymptotic equipartition property (AEP)
parameter bandwidth
reinforcement learning (RL)
softmax selection
Abstract

Softmax selection is one of the most popular
methods for action selection in reinforcement learning. Although
various recently proposed methods may be more effective with
full parameter tuning, implementing a complicated method
that requires the tuning of many parameters can be difficult.
Thus, softmax selection is still worth revisiting, considering the
cost savings of its implementation and tuning. In fact, this
method works adequately in practice with only one parameter
appropriately set for the environment. The aim of this paper
is to improve the variable setting of this method to extend
the bandwidth of good parameters, thereby reducing the cost
of implementation and parameter tuning. To achieve this, we
take advantage of the asymptotic equipartition property in a
Markov decision process to extend the peak bandwidth of softmax
selection. Using a variety of episodic tasks, we show that our
setting is effective in extending the bandwidth and that it yields a
better policy in terms of stability. The bandwidth is quantitatively
assessed in a series of statistical tests.
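
For orientation, the following is a minimal sketch of softmax (Boltzmann) selection in its standard textbook form, where a single temperature parameter controls how strongly higher-valued actions are favored. It does not reproduce the parameter setting proposed in the paper, which is not described in this record; the function names `softmax_probabilities` and `softmax_select` and the `temperature` argument are illustrative assumptions.

```python
import math
import random


def softmax_probabilities(action_values, temperature):
    """Return softmax selection probabilities for a list of action values.

    `temperature` is the single tunable parameter: small values make the
    choice nearly greedy, large values make it nearly uniform.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [q / temperature for q in action_values]
    # Subtract the maximum before exponentiating to avoid overflow.
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def softmax_select(action_values, temperature, rng=random):
    """Sample an action index according to the softmax probabilities."""
    probs = softmax_probabilities(action_values, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

With estimated action values `[1.0, 2.0, 3.0]`, a call such as `softmax_select([1.0, 2.0, 3.0], temperature=0.5)` would pick the third action most often, and how sensitive performance is to that one `temperature` value is the tuning cost the abstract refers to.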

Description Peer Reviewed
Journal Title
IEEE Transactions on Neural Networks and Learning Systems
Volume
28
Issue
8
Spage
1865
Epage
1877
Published Date
2016-05-11
Publisher
IEEE
ISSN
2162-237X
NCID
AA1255553X
DOI
10.1109/TNNLS.2016.2558295
PubMed ID
27187974
Language
eng
NIIType
Journal Article
Text Version
Author's version
Rights
© 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/
Set
hiroshima-cu