The survival of start-ups in time of crisis. A machine learning approach to measure innovation

Massimiliano Nuccio
2019

Abstract

This paper shows how data science can improve empirical research in economics by leveraging large datasets and extracting information that is otherwise unsuitable for a traditional econometric approach. As a test-bed for our framework, machine learning algorithms allow us to create a new holistic measure of innovation built on a 2012 Italian law aimed at boosting new high-tech firms. We adopt this measure to analyse the impact of innovativeness on a large population of Italian firms that entered the market at the beginning of the 2008 global crisis. The methodological contribution is organised in three steps. First, we train seven supervised learning algorithms to recognise innovative firms on 2013 firmographics data and select the combination of those with the best predictive power. Second, we apply this combination to the 2008 dataset and predict which firms would have been labelled as innovative according to the definition in the law. Finally, we adopt this new indicator as a regressor in a survival model to explain firms' ability to remain in the market after 2008. Results suggest that innovative firms are more likely to survive than the rest of the sample, but the survival premium is likely to depend on location.
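The three steps in the abstract can be illustrated with a toy sketch. It is not the paper's implementation. All data are synthetic, the feature names (`rd_share`, `graduates`) and every numeric parameter are invented for illustration, three simple rules stand in for the seven supervised algorithms, a single best model is picked instead of a combination, and a Kaplan-Meier estimate by predicted group replaces the survival regression.

```python
import random
from statistics import mean

random.seed(0)  # reproducible synthetic data

def make_firms(n, with_survival=False):
    """Synthetic firms with two hypothetical features and an innovative flag."""
    firms = []
    for _ in range(n):
        y = 1 if random.random() < 0.3 else 0
        rd_share = random.gauss(0.6 if y else 0.3, 0.15)    # assumed feature
        graduates = random.gauss(0.5 if y else 0.35, 0.2)   # assumed feature
        firm = {"x": (rd_share, graduates), "y": y}
        if with_survival:
            # Synthetic lifetime in years, censored after 5 years of observation.
            life = random.expovariate(0.15 if y else 0.3)
            firm["time"], firm["exited"] = min(life, 5.0), life < 5.0
        firms.append(firm)
    return firms

def fit_stump(train, j):
    """Threshold rule on feature j, cut at the midpoint of the class means."""
    m1 = mean(f["x"][j] for f in train if f["y"] == 1)
    m0 = mean(f["x"][j] for f in train if f["y"] == 0)
    cut, sign = (m1 + m0) / 2, (1 if m1 >= m0 else -1)
    return lambda x: 1 if sign * (x[j] - cut) > 0 else 0

def fit_centroid(train):
    """Nearest-centroid rule over both features."""
    c = {k: tuple(mean(f["x"][j] for f in train if f["y"] == k) for j in range(2))
         for k in (0, 1)}
    dist = lambda x, k: sum((a - b) ** 2 for a, b in zip(x, c[k]))
    return lambda x: 1 if dist(x, 1) < dist(x, 0) else 0

def accuracy(model, firms):
    return mean(1 if model(f["x"]) == f["y"] else 0 for f in firms)

def kaplan_meier(firms, horizon):
    """Kaplan-Meier estimate of the probability of surviving past `horizon`."""
    surv = 1.0
    exit_times = sorted({f["time"] for f in firms
                         if f["exited"] and f["time"] <= horizon})
    for t in exit_times:
        at_risk = sum(1 for f in firms if f["time"] >= t)
        exits = sum(1 for f in firms if f["exited"] and f["time"] == t)
        surv *= 1 - exits / at_risk
    return surv

# Step 1: train candidate classifiers on the labelled (2013-like) sample
# and keep the one with the best hold-out accuracy.
labelled_2013 = make_firms(600)
train, valid = labelled_2013[:400], labelled_2013[400:]
candidates = {
    "stump_rd": fit_stump(train, 0),
    "stump_graduates": fit_stump(train, 1),
    "centroid": fit_centroid(train),
}
scores = {name: accuracy(model, valid) for name, model in candidates.items()}
best_name = max(scores, key=scores.get)
best = candidates[best_name]

# Step 2: predict the innovative label for the unlabelled (2008-like) cohort.
cohort_2008 = make_firms(1000, with_survival=True)
for f in cohort_2008:
    f["pred"] = best(f["x"])

# Step 3: compare estimated 5-year survival between the predicted groups.
groups = {g: [f for f in cohort_2008 if f["pred"] == g] for g in (0, 1)}
survival_5y = {g: kaplan_meier(groups[g], 5.0) for g in (0, 1)}
print(best_name, scores[best_name], survival_5y)
```

The point of the sketch is the data flow: the classifier is fitted only where labels exist, and its prediction then serves as the group indicator in the survival analysis of the earlier cohort. Any survival gap it prints comes from the synthetic data generator, not from the paper's results.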
Files in this item:
1911.01073v1.pdf (open access)
Type: Pre-print
License: Free access (view only)
Size: 954.53 kB
Format: Adobe PDF

Documents in ARCA are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: http://hdl.handle.net/10278/3722497