Measuring and Detecting Errors in Occupational Coding: an Analysis of SHARE Data

BELLONI, Michele; BRUGIAVINI, Agar; MESCHI, Elena Francesca
2016-01-01

Abstract

This article studies coding errors in occupational data, as the quality of this data is important but often neglected. In particular, we recoded open-ended questions on occupation for last and current job in the Dutch sample of the "Survey of Health, Ageing and Retirement in Europe" (SHARE) using a high-quality software program for ex-post coding (CASCOT software). Taking CASCOT coding as our benchmark, our results suggest that the incidence of coding errors in SHARE is high, even when the comparison is made at the level of one-digit occupational codes (28% for last job and 30% for current job). This finding highlights the complexity of occupational coding and suggests that processing errors due to miscoding should be taken into account when undertaking statistical analyses or writing econometric models. Our analysis suggests strategies to alleviate such coding errors, and we propose a set of equations that can predict error. These equations may complement coding software and improve the quality of occupational coding.
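The comparison described in the abstract, where the original codes are checked against a benchmark coding at the one-digit level, can be illustrated with a minimal sketch. It assumes both sources store each occupation as a string of digits whose leading digit identifies the broadest group; the function name `disagreement_rate` and the toy codes are hypothetical and are not taken from the article or from SHARE.

```python
from typing import Sequence


def disagreement_rate(original: Sequence[str],
                      benchmark: Sequence[str],
                      digits: int = 1) -> float:
    """Share of cases whose codes differ in their first `digits` characters.

    Assumption: codes are hierarchical digit strings, so truncating a code
    to its leading digits yields a coarser occupational group.
    """
    if len(original) != len(benchmark):
        raise ValueError("both codings must cover the same respondents")
    if not original:
        raise ValueError("at least one coded response is required")
    mismatches = sum(
        a.strip()[:digits] != b.strip()[:digits]
        for a, b in zip(original, benchmark)
    )
    return mismatches / len(original)


# Hypothetical toy data, for illustration only (not SHARE records).
survey_codes = ["2310", "5220", "7124", "4115", "9132"]
benchmark_codes = ["2320", "5220", "3112", "4115", "5121"]

rate_1 = disagreement_rate(survey_codes, benchmark_codes, digits=1)
rate_4 = disagreement_rate(survey_codes, benchmark_codes, digits=4)
print(f"one-digit disagreement:  {rate_1:.0%}")
print(f"four-digit disagreement: {rate_4:.0%}")
```

Truncating before comparing means that a disagreement at the one-digit level always implies one at finer levels, which is why a high one-digit error rate is the more striking result.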
Files in this item:
[Journal of Official Statistics] Measuring and Detecting Errors in Occupational Coding_ an Analysis of SHARE Data.pdf

Access: open access
Type: post-print document
License: free access (no restrictions)
Size: 429.44 kB
Format: Adobe PDF

Documents in ARCA are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10278/3681958