Estimating High-Dimensional Regression Models with Bootstrap group Penalties

Mameli, V.; Slanzi, D.; Poli, I.
2019

Abstract

Many research problems are now addressed by analysing datasets characterized by a very large number of variables and a relatively limited number of observations, especially when the data are generated by experimentation. Most classical statistical procedures for regression analysis are inadequate for such datasets, as they were developed under the assumption that the number of observations is larger than the number of variables. In this work, we propose a new penalization procedure for variable selection in regression models based on Bootstrap group Penalties (BgP). This new family of penalization methods extends the bootstrap version of the LASSO approach by taking into account the grouping structure that may be present or introduced in the model. We develop a simulation study to compare the performance of this new approach with that of several existing group penalization methods in terms of both prediction accuracy and variable selection quality. The results of this study show that the new procedure outperforms the other penalization procedures considered.
New Statistical Developments in Data Science

Use this identifier to cite or link to this document: http://hdl.handle.net/10278/3708952