Longitudinal clustering of athletes’ careers under informative missing data patterns
Mattia Stival; Mauro Bernardi
2022-01-01
Abstract
In this work we analyze the career evolution of 369 Italian male middle-distance runners, born in 1988, considering their seasonal best performances in the 800, 1500, and 5000 meters races during the period 2006–2019. In this context, clustering of trajectories makes it possible to identify the possible career paths of an athlete, a relevant aspect for coaches who aim to plan the future and track the progress of their athletes. However, unlike in other disciplines, the presence of missing values for middle-distance athletes is a critical aspect, as they are potentially correlated with performance. On the one hand, drop-in and drop-out phenomena implicitly lead to a different development in each athlete’s career history. On the other hand, middle-distance athletes can compete in different races, an aspect that is typically related to their personal attitudes. We propose a Bayesian clustering model in which both the observed trends and the presence of missing data inform the clustering structure. Observed trends in each race are described by group-specific state space models, which capture longitudinal dependence across performances of the same athlete. Information on missing values is included by means of two distinct group-dependent processes: the first describes the drop-in and drop-out phenomena in the sample; the second describes the athletes’ actual participation in competitions, as an index of their different attitudes. Our findings suggest that athletes who are more likely to participate in different types of races perform better over the years.