A comparison of scalable estimation methods for large-scale logistic regression models with crossed random effects

Cristiano Varin
2024-01-01

Abstract

Parameter estimation for generalized linear models with crossed random effects in large-scale settings is hampered by numerical difficulties. This contribution focuses on logistic regression with crossed random intercepts and investigates the properties of two estimation methods for which scalable software implementations exist, namely the all-row-column method and penalized quasi-likelihood. The results of a simulation study in sparse settings inspired by e-commerce data, with sample sizes up to 10^6, suggest that the all-row-column method is preferable to penalized quasi-likelihood.
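
To make the model class named in the abstract concrete, the following is a minimal sketch of how sparse binary data with crossed random intercepts could be simulated. It is not the authors' simulation design: the function name `simulate_crossed_logistic`, the single covariate, the normal random intercepts, the uniform sampling of row-column pairs, and all parameter values are illustrative assumptions.

```python
import math
import random


def simulate_crossed_logistic(n_rows, n_cols, n_obs, beta0, beta1,
                              sigma_a, sigma_b, seed=0):
    """Simulate sparse binary data with crossed random intercepts.

    Hypothetical setup: each observation pairs one row unit (e.g. a customer)
    with one column unit (e.g. an item), and only n_obs of the
    n_rows * n_cols possible pairs are drawn, so the layout is sparse.
    """
    rng = random.Random(seed)
    # Random intercepts, assumed independent and normal with mean zero.
    a = [rng.gauss(0.0, sigma_a) for _ in range(n_rows)]
    b = [rng.gauss(0.0, sigma_b) for _ in range(n_cols)]
    data = []
    for _ in range(n_obs):
        # Pairs are drawn uniformly with replacement, so repeats can occur.
        i = rng.randrange(n_rows)
        j = rng.randrange(n_cols)
        x = rng.gauss(0.0, 1.0)  # one standard normal covariate (assumption)
        # Linear predictor: fixed effects plus the two crossed intercepts.
        eta = beta0 + beta1 * x + a[i] + b[j]
        p = 1.0 / (1.0 + math.exp(-eta))  # inverse logit
        y = 1 if rng.random() < p else 0
        data.append((i, j, x, y))
    return data


# Illustrative values only: 2000 observed pairs out of 500000 possible ones.
sample = simulate_crossed_logistic(n_rows=1000, n_cols=500, n_obs=2000,
                                   beta0=-1.0, beta1=0.5,
                                   sigma_a=1.0, sigma_b=0.5)
```

Each record is a tuple `(row index, column index, covariate, response)`. Because the row and column intercepts are crossed rather than nested, responses sharing a row or a column are dependent, which is the source of the numerical difficulty the abstract refers to.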
2024
Proceedings of the Statistics and Data Science 2024 Conference
Files in this item:
File: bellio-varin-proceedings2.pdf
Access: open access
Type: Pre-print
License: Free access (no restrictions)
Size: 265.08 kB
Format: Adobe PDF

Documents in ARCA are protected by copyright, with all rights reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10278/5082302