The nonparametric formulation of density-based clustering, known as modal cluster- ing, draws a correspondence between groups and the attraction domains of the modes of the density function underlying the data. Its probabilistic foundation allows for a natural, yet not trivial, generalization of the approach to the matrix-valued setting, increasingly widespread, for example, in longitudinal and multivariate spatio-temporal studies. In this work we introduce nonparametric estimators of matrix-variate distribu- tions based on kernel methods, and analyze their asymptotic properties. Additionally, we propose a generalization of the mean-shift procedure for the identification of the modes of the estimated density. Given the intrinsic high dimensionality of matrix- variate data, we discuss some locally adaptive solutions to handle the problem. We test the procedure via extensive simulations, also with respect to some competitors, and illustrate its performance through two high-dimensional real data applications.

Modal clustering of matrix-variate data

Federico Ferraccioli;Giovanna Menardi
2023-01-01

Abstract

The nonparametric formulation of density-based clustering, known as modal cluster- ing, draws a correspondence between groups and the attraction domains of the modes of the density function underlying the data. Its probabilistic foundation allows for a natural, yet not trivial, generalization of the approach to the matrix-valued setting, increasingly widespread, for example, in longitudinal and multivariate spatio-temporal studies. In this work we introduce nonparametric estimators of matrix-variate distribu- tions based on kernel methods, and analyze their asymptotic properties. Additionally, we propose a generalization of the mean-shift procedure for the identification of the modes of the estimated density. Given the intrinsic high dimensionality of matrix- variate data, we discuss some locally adaptive solutions to handle the problem. We test the procedure via extensive simulations, also with respect to some competitors, and illustrate its performance through two high-dimensional real data applications.
File in questo prodotto:
File Dimensione Formato  
FM_ADAC2022.pdf

non disponibili

Tipologia: Versione dell'editore
Licenza: Accesso chiuso-personale
Dimensione 4.72 MB
Formato Adobe PDF
4.72 MB Adobe PDF   Visualizza/Apri

I documenti in ARCA sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/10278/5082743
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 1
  • ???jsp.display-item.citation.isi??? 1
social impact