The Visible Map and the Hidden Structure of Economics. Stress-testing the JEL Classification System

Massimiliano Nuccio
2018

Abstract

This paper is a first, preliminary attempt to illustrate the potential of topic modeling as an information retrieval system that helps to reduce information overload in the sciences, and in economics in particular. Some of the motives for using automated information retrieval tools in economics stem from the changing structure of the discipline itself. We argue that the standard classification system in economics, the Journal of Economic Literature (JEL) codes, developed over a hundred years ago by the American Economic Association, can readily help to detect the major faults of unsupervised techniques and may suggest how to correct them. With this aim in mind, we take the corpus of some 1,500 “exemplary” documents that the American Economic Association indicates for each JEL classification in its “JEL codes guide” (https://www.aeaweb.org/jel/guide/jel.php) and apply to it the topic-modeling technique known as Latent Dirichlet Allocation (LDA). LDA discovers the hidden (latent) thematic structure of large document archives by detecting probabilistic regularities, that is, trends in the language of the texts and recurring themes in the form of co-occurring words. The ambition is to propose and interpret measures of (dis)similarity between the JEL codes and the LDA topics resulting from the analysis.
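
As an illustration of what a (dis)similarity measure between JEL codes and LDA topics could look like, the following is a minimal sketch and not the measure used in the paper, which the abstract does not specify. It assumes that an LDA model has already produced a topic distribution for each document, averages those distributions per JEL code, and compares two codes with the Jensen-Shannon divergence. The function names, the toy distributions and the choice of divergence are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def mean_topic_profile(doc_topics, labels):
    """Average the topic distributions of all documents sharing a label."""
    sums = {}
    counts = defaultdict(int)
    for dist, label in zip(doc_topics, labels):
        if label not in sums:
            sums[label] = [0.0] * len(dist)
        sums[label] = [s + p for s, p in zip(sums[label], dist)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def js_divergence(p, q):
    """Jensen-Shannon divergence in base 2, bounded between 0 and 1."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability contribute nothing to the sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical document-topic distributions (three topics), standing in
# for the output of a fitted LDA model; they are not data from the paper.
doc_topics = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],
    [0.2, 0.1, 0.7],
]
# Top-level JEL codes attached to each document (illustrative labels).
labels = ["E", "E", "Q", "Q"]

profiles = mean_topic_profile(doc_topics, labels)
d_eq = js_divergence(profiles["E"], profiles["Q"])
```

A value of `d_eq` close to 0 would indicate that the two codes draw on nearly the same topics, while a value close to 1 would indicate that their documents load on almost disjoint topics.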
STOREP 2018 - Whatever Has Happened to Political Economy?

Documents in ARCA are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: http://hdl.handle.net/10278/3728811