Pack light on the move: Exploitation and exploration in a dynamic environment

LI CALZI, Marco
2013-01-01

Abstract

This paper revisits a recent study by Posen and Levinthal (Man Sci 58:587–601, 2012) on the exploration/exploitation tradeoff in a multi-armed bandit problem where the reward probabilities undergo random shocks. We show that their analysis suffers from two shortcomings: it assumes that learning is based on stale evidence, and it overlooks the steady state. We let the learning rule endogenously discard stale evidence, and we carry out the long-run analysis. The comparative study demonstrates that some of their conclusions must be qualified.
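To make the setting concrete, the following is a minimal sketch of a bandit whose reward probabilities are hit by random shocks, together with a learner that can down-weight old evidence. It is not the model of either paper: the epsilon-greedy choice rule, the fixed exponential `discount` (an exogenous stand-in for the endogenous discarding rule mentioned in the abstract), and all names and parameter values (`simulate`, `shock_prob`, `epsilon`, `n_arms`) are assumptions made only for illustration.

```python
import random


def simulate(shock_prob, discount, periods=2000, n_arms=10, epsilon=0.1, seed=0):
    """Average reward of an epsilon-greedy learner on a Bernoulli bandit with shocks.

    discount = 1.0 keeps all past evidence (plain sample averages);
    discount < 1.0 down-weights older outcomes, so stale evidence fades.
    All parameters are hypothetical choices for this sketch.
    """
    rng = random.Random(seed)
    probs = [rng.random() for _ in range(n_arms)]  # true reward probabilities
    wins = [0.0] * n_arms   # (discounted) number of successes per arm
    pulls = [0.0] * n_arms  # (discounted) number of pulls per arm
    total = 0
    for _ in range(periods):
        # Random shock: each arm's probability is redrawn with prob. shock_prob.
        probs = [rng.random() if rng.random() < shock_prob else p for p in probs]
        # Belief about each arm: discounted success rate, 0.5 if untried.
        beliefs = [w / n if n > 0 else 0.5 for w, n in zip(wins, pulls)]
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore a random arm
        else:
            arm = max(range(n_arms), key=beliefs.__getitem__)  # exploit best belief
        reward = 1 if rng.random() < probs[arm] else 0
        total += reward
        # Fade all old evidence, then record the new outcome.
        wins = [discount * w for w in wins]
        pulls = [discount * n for n in pulls]
        wins[arm] += reward
        pulls[arm] += 1
    return total / periods
```

Calling `simulate(0.02, 1.0)` and `simulate(0.02, 0.9)` compares a learner that keeps every past outcome with one that lets old outcomes fade under the same shock process. Which of the two performs better depends on the assumed parameters, so the sketch should not be read as reproducing any result of the paper.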
2013
Artificial Economics and Self Organization
Files in this item:
Pack-Light.pdf (open access)

Type: Post-print document
License: Free access (view only)
Size: 346.08 kB
Format: Adobe PDF

Documents in ARCA are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10278/38119