A Redundancy-Based Measure of Dissimilarity among Probability Distributions for Hierarchical Clustering Criteria

URI http://harp.lib.hiroshima-u.ac.jp/hiroshima-cu/metadata/6464
Title
A Redundancy-Based Measure of Dissimilarity among Probability Distributions for Hierarchical Clustering Criteria
Authors
Name IWATA Kazunori
Reading イワタ カズノリ
Alternate name 岩田 一貴
Name HAYASHI Akira
Reading ハヤシ アキラ
Alternate name 林 朗
Keywords
Clustering
mixture model
dissimilarity measure
information theory
Ward’s method
Abstract

We introduce a novel dissimilarity measure into a probabilistic clustering task to properly measure dissimilarity among multiple clusters when each cluster is characterized by a subpopulation in the mixture model. This measure is called the redundancy-based dissimilarity among probability distributions. From the viewpoints of both source coding and statistical hypothesis testing, we shed light on several theoretical reasons why the redundancy-based dissimilarity among probability distributions is a reasonable measure of dissimilarity among clusters. We also elucidate a principle common to the redundancy-based dissimilarity measure and Ward's method in terms of hierarchical clustering criteria. Moreover, we show several related theorems that are significant for clustering tasks. In the experiments, the properties of the redundancy-based dissimilarity measure are examined in comparison with several other measures.

Peer review
Journal name
IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume
30
Issue
1
Start page
76
End page
88
Publication date
2008-01
Publisher
IEEE
ISSN
0162-8828
Language of text
English
Resource type
Journal article
Author version flag
Publisher's version
Rights
©2008 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Former URI
Category
hiroshima-cu