Scraping innovativeness from corporate websites: Empirical evidence on Italian manufacturing SMEs
Crosato, Lisa;
2024-01-01
Abstract
Research in innovation studies usually relies on financial statements, surveys, or patents as primary data sources, although these sources have limitations when applied to Small and Medium Enterprises (SMEs). Our paper explores whether the HTML code of a company’s website can serve as a further data source to better inform innovation policies, under the assumption that how HTML is employed in crafting a corporate website provides insights into the company’s innovation capabilities. In particular, we leverage HTML tags and their associations to show empirically that the websites of innovative SMEs differ from those of non-innovative SMEs in both size and coding practices. Our findings, based on a sample of Italian companies, indicate that features of the HTML code of corporate websites reflect unobservable characteristics related to the skills and creativity present in businesses.

File | Size | Format |
---|---|---|
postprint TFSC.pdf (embargoed until 05/08/2026; type: post-print document; license: free access, view only) | 9.38 MB | Adobe PDF |
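As a rough illustration of the kind of features the abstract refers to (website size and HTML tag associations), the sketch below counts tags and parent-child tag pairs in a single page using Python's standard `html.parser`. It is not the authors' pipeline: the class `TagProfiler`, the function `profile_html`, the sample markup, and the choice of parent-child pairs as a stand-in for "associations" are all assumptions made for this example.

```python
from collections import Counter
from html.parser import HTMLParser


class TagProfiler(HTMLParser):
    """Hypothetical helper: collects tag counts and parent-child tag pairs."""

    # Void elements never have a closing tag, so they are counted but not stacked.
    VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
            "link", "meta", "source", "track", "wbr"}

    def __init__(self):
        super().__init__()
        self.tag_counts = Counter()   # how often each tag appears
        self.pair_counts = Counter()  # how often tag B is opened directly inside tag A
        self._stack = []              # currently open (non-void) tags

    def handle_starttag(self, tag, attrs):
        self.tag_counts[tag] += 1
        if self._stack:
            self.pair_counts[(self._stack[-1], tag)] += 1
        if tag not in self.VOID:
            self._stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; ignore stray closing tags.
        if tag in self._stack:
            while self._stack and self._stack.pop() != tag:
                pass


def profile_html(html):
    """Return simple size and tag-association features for one HTML document."""
    parser = TagProfiler()
    parser.feed(html)
    parser.close()
    return {
        "n_tags": sum(parser.tag_counts.values()),
        "n_distinct_tags": len(parser.tag_counts),
        "tag_counts": parser.tag_counts,
        "pair_counts": parser.pair_counts,
    }


# Assumed sample page, for illustration only.
sample = "<html><body><div><p>Hi</p><img src='a.png'><p>Bye</p></div></body></html>"
features = profile_html(sample)
print(features["n_tags"], features["n_distinct_tags"])
print(features["pair_counts"].most_common(3))
```

Features like these could then be compared between groups of websites; which features and which association measure the paper actually uses is not stated in the abstract.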
Documents in ARCA are protected by copyright and all rights are reserved, unless otherwise indicated.