Accelerated structured matrix factorization

Lorenzo Schiavon;Bernardo Nipoti;Antonio Canale

2023-01-01

Abstract

Matrix factorization exploits the idea that, in complex high-dimensional data, the actual signal typically lies in lower-dimensional structures. These lower-dimensional objects provide useful insight, and sparse structures make them easier to interpret. Sparsity also acts as a regularizer and thus helps avoid over-fitting. By exploiting Bayesian shrinkage priors, we devise a computationally convenient approach for high-dimensional matrix factorization. The dependence between row and column entities is modeled by inducing flexible sparse patterns within factors. Available external information is accounted for in such a way that structures are allowed but not imposed. Inspired by boosting algorithms, we pair the proposed approach with a numerical strategy relying on the sequential inclusion and estimation of low-rank contributions, with a data-driven stopping rule. Practical advantages of the proposed approach are demonstrated by means of a simulation study and the analysis of soccer heatmaps obtained from new-generation tracking data. Supplemental materials are available online.

Publication year	2023
Journal title	JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS
Volume	33
DOI	https://dx.doi.org/10.1080/10618600.2023.2301072
Publication type	2.1 Journal article

Files in this item:

File	Size	Format
2024 - JCGS - Schiavon,Nipoti,Canale (3).pdf (not available; type: publisher's version; license: publisher's copyright)	2.16 MB	Adobe PDF

Documents in ARCA are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10278/5046026
