A selective view of climatological data and likelihood estimation

Bevilacqua, M;
2022

Abstract

This article gives a narrative overview of what constitutes climatological data and their typical features, with a focus on aspects relevant to statistical modeling. We restrict the discussion to univariate spatial fields and focus on maximum likelihood estimation. To address the problem of enormous datasets, we study three common approximation schemes for Gaussian and non-Gaussian distributions: tapering, direct misspecification, and composite likelihood. We focus particularly on the so-called 'sinh-arcsinh distribution', obtained through a specific transformation of the Gaussian distribution. Because its marginal distributions are flexible (possibly skewed and/or heavy-tailed), it has a wide range of applications. One appealing property of the transformation involved is the existence of an explicit inverse transformation, which makes likelihood-based methods straightforward. We describe a simulation study illustrating the effects of the different approximation schemes. To the best of our knowledge, a direct comparison of tapering, direct misspecification, and composite likelihood has never been made previously, and we show that direct misspecification is inferior. In some metrics, composite likelihood has a minor advantage over tapering. We use the estimation approaches to model a high-resolution global climate change field. All simulation code is available as a Docker container and is thus fully reproducible. Additionally, the present article describes where and how to get various climate datasets. © 2022 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license.
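As an illustration of why an explicit inverse makes likelihood-based methods straightforward, the following is a minimal sketch of the sinh-arcsinh transformation. It assumes the Jones and Pewsey parameterization, with a skewness parameter `eps` and a tail parameter `delta`; the article itself may use a different parameterization. The function names are hypothetical, and the log-likelihood treats observations as independent, so it ignores the spatial dependence that the article models.

```python
import math

def sas_forward(z, eps=0.0, delta=1.0):
    """Map a standard Gaussian value z to a sinh-arcsinh value (assumed parameterization)."""
    return math.sinh((math.asinh(z) + eps) / delta)

def sas_inverse(x, eps=0.0, delta=1.0):
    """Explicit inverse: recover the underlying Gaussian value from x."""
    return math.sinh(delta * math.asinh(x) - eps)

def sas_logpdf(x, eps=0.0, delta=1.0):
    """Marginal log-density of x by change of variables from N(0, 1)."""
    u = delta * math.asinh(x) - eps
    z = math.sinh(u)  # same value as sas_inverse(x, eps, delta)
    # log |dz/dx| = log(delta) + log(cosh(u)) - 0.5 * log(1 + x^2)
    log_jac = math.log(delta) + math.log(math.cosh(u)) - 0.5 * math.log1p(x * x)
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * z * z + log_jac

def sas_loglik(xs, eps, delta):
    """Log-likelihood under independence (illustration only, no spatial dependence)."""
    return sum(sas_logpdf(x, eps, delta) for x in xs)
```

With `eps = 0` and `delta = 1` the transformation is the identity, so the density reduces to the standard Gaussian. Because the inverse is available in closed form, each density evaluation needs only elementary functions and no numerical root finding.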
Files in this record:
File: 1-s2.0-S2211675322000045-main (2).pdf
Access: open access
Type: Publisher's version
License: Creative Commons
Size: 1.2 MB
Format: Adobe PDF

Documents in ARCA are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10278/5000873