The nonparametric formulation of density-based clustering, known as modal cluster- ing, draws a correspondence between groups and the attraction domains of the modes of the density function underlying the data. Its probabilistic foundation allows for a natural, yet not trivial, generalization of the approach to the matrix-valued setting, increasingly widespread, for example, in longitudinal and multivariate spatio-temporal studies. In this work we introduce nonparametric estimators of matrix-variate distribu- tions based on kernel methods, and analyze their asymptotic properties. Additionally, we propose a generalization of the mean-shift procedure for the identification of the modes of the estimated density. Given the intrinsic high dimensionality of matrix- variate data, we discuss some locally adaptive solutions to handle the problem. We test the procedure via extensive simulations, also with respect to some competitors, and illustrate its performance through two high-dimensional real data applications.
Modal clustering of matrix-variate data
Federico Ferraccioli;Giovanna Menardi
2023-01-01
Abstract
The nonparametric formulation of density-based clustering, known as modal cluster- ing, draws a correspondence between groups and the attraction domains of the modes of the density function underlying the data. Its probabilistic foundation allows for a natural, yet not trivial, generalization of the approach to the matrix-valued setting, increasingly widespread, for example, in longitudinal and multivariate spatio-temporal studies. In this work we introduce nonparametric estimators of matrix-variate distribu- tions based on kernel methods, and analyze their asymptotic properties. Additionally, we propose a generalization of the mean-shift procedure for the identification of the modes of the estimated density. Given the intrinsic high dimensionality of matrix- variate data, we discuss some locally adaptive solutions to handle the problem. We test the procedure via extensive simulations, also with respect to some competitors, and illustrate its performance through two high-dimensional real data applications.File | Dimensione | Formato | |
---|---|---|---|
FM_ADAC2022.pdf
non disponibili
Tipologia:
Versione dell'editore
Licenza:
Accesso chiuso-personale
Dimensione
4.72 MB
Formato
Adobe PDF
|
4.72 MB | Adobe PDF | Visualizza/Apri |
I documenti in ARCA sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.