A Redundancy-Based Measure of Dissimilarity among Probability Distributions for Hierarchical Clustering Criteria

URI http://harp.lib.hiroshima-u.ac.jp/hiroshima-cu/metadata/6464
Title
A Redundancy-Based Measure of Dissimilarity among Probability Distributions for Hierarchical Clustering Criteria
Author
Name IWATA Kazunori
Reading (kana) イワタ カズノリ
Alternate name 岩田 一貴
Name HAYASHI Akira
Reading (kana) ハヤシ アキラ
Alternate name 林 朗
Subject
Clustering
mixture model
dissimilarity measure
information theory
Ward’s method
Abstract

We introduce a novel dissimilarity measure into a probabilistic clustering task to properly measure dissimilarity among multiple clusters when each cluster is characterized by a subpopulation in the mixture model. We call this measure the redundancy-based dissimilarity among probability distributions. From the viewpoints of both source coding and statistical hypothesis testing, we shed light on several theoretical reasons why the redundancy-based dissimilarity among probability distributions is a reasonable measure of dissimilarity among clusters. We also elucidate a principle shared by the redundancy-based dissimilarity measure and Ward's method in terms of hierarchical clustering criteria. Moreover, we present several related theorems that are significant for clustering tasks. In the experiments, we examine the properties of the redundancy-based dissimilarity measure in comparison with several other measures.

Description Peer Reviewed
Journal Title
IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume
30
Issue
1
Start Page
76
End Page
88
Published Date
2008-01
Publisher
IEEE
ISSN
0162-8828
Language
eng
NIIType
Journal Article
Text Version
Publisher's version
Rights
©2008 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Set
hiroshima-cu