A wide class of Bayesian nonparametric priors leads to the representation of the distribution of the observable variables as a mixture density with an infinite number of components. Such a representation induces a clustering structure in the data. However, due to label switching, cluster identification is not straightforward a posteriori and some post-processing of the MCMC output is usually required. Alternatively, observations can be mapped on a weighted undirected graph, where each node represents a sample item and edge weights are given by the posterior pairwise similarities. It is shown how, after building a particular random walk on such a graph, it is possible to apply a community detection algorithm, known as map equation, leading to the minimisation of the expected description length of the partition. A relevant feature of this method is that it allows for the quantification of the posterior uncertainty of the classification.
Bayesian nonparametric clustering as a community detection problem
Tonellato S. F.
2020-01-01
Abstract
A wide class of Bayesian nonparametric priors leads to the representation of the distribution of the observable variables as a mixture density with an infinite number of components. Such a representation induces a clustering structure in the data. However, due to label switching, cluster identification is not straightforward a posteriori and some post-processing of the MCMC output is usually required. Alternatively, observations can be mapped on a weighted undirected graph, where each node represents a sample item and edge weights are given by the posterior pairwise similarities. It is shown how, after building a particular random walk on such a graph, it is possible to apply a community detection algorithm, known as map equation, leading to the minimisation of the expected description length of the partition. A relevant feature of this method is that it allows for the quantification of the posterior uncertainty of the classification.File | Dimensione | Formato | |
---|---|---|---|
1-s2.0-S0167947320301353-main.pdf
non disponibili
Descrizione: Articolo principale
Tipologia:
Versione dell'editore
Licenza:
Accesso chiuso-personale
Dimensione
1.09 MB
Formato
Adobe PDF
|
1.09 MB | Adobe PDF | Visualizza/Apri |
I documenti in ARCA sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.