High resolution mapping of nitrogen dioxide and particulate matter in Great Britain (2003–2021) with multi-stage data reconstruction and ensemble machine learning methods
Mistry, Malcolm;
2024-01-01
Abstract
In this contribution, we applied a multi-stage machine learning (ML) framework to map daily values of nitrogen dioxide (NO2) and particulate matter (PM10 and PM2.5) at a 1 km² resolution over Great Britain for the period 2003-2021. The process combined ground monitoring observations, satellite-derived products, climate reanalyses and chemical transport model datasets, and traffic and land-use data. Each feature was harmonized to 1 km resolution and extracted at monitoring sites. Models used single and ensemble-based algorithms featuring random forests (RF), extreme gradient boosting (XGB), light gradient boosting machine (LGBM), as well as lasso and ridge regression. The various stages focused on augmenting PM2.5 using co-occurring PM10 values, gap-filling aerosol optical depth and columnar NO2 data obtained from satellite instruments, and finally training an ensemble model and predicting daily values across the whole geographical domain (2003-2021). Results show good ensemble model performance, assessed through a ten-fold monitor-based cross-validation procedure, with an average R² of 0.690 (range 0.611-0.792) for NO2, 0.704 (0.609-0.786) for PM10, and 0.802 (0.746-0.888) for PM2.5. Reconstructed pollution levels decreased markedly within the study period, with a stronger reduction in the last eight years. The pollutants exhibited different spatial patterns: while NO2 rose in close proximity to high-traffic areas, PM varied at a larger scale. The resulting 1 km² spatially resolved daily datasets allow for linkage with health data across Great Britain over nearly two decades, thus contributing to extensive, extended, and detailed research on the long- and short-term health effects of air pollution.
File | Size | Format | |
---|---|---|---|
1-s2.0-S1309104224002496-main.pdf (open access; type: post-print document; license: Creative Commons) | 5.29 MB | Adobe PDF | |
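
The abstract reports performance from a ten-fold monitor-based cross-validation, meaning that folds are formed from monitoring sites and not from individual daily records, so a site's observations never appear in both the training and the test set of the same fold. The following is a minimal sketch of that splitting idea in plain Python, not the authors' implementation: the function names `monitor_folds` and `r_squared`, the round-robin fold assignment, and the toy site identifiers are all assumptions made for illustration, and the abstract does not state which definition of R² was used (the coefficient of determination is shown here as one common choice).

```python
import random
from collections import defaultdict


def monitor_folds(monitor_ids, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs; each monitor's rows land in exactly one test fold.

    Hypothetical helper: monitor_ids holds one site identifier per data row.
    """
    monitors = sorted(set(monitor_ids))
    if k > len(monitors):
        raise ValueError("k cannot exceed the number of distinct monitors")
    random.Random(seed).shuffle(monitors)
    # Deal the shuffled monitors round-robin into k folds (an illustrative choice).
    fold_of = {m: i % k for i, m in enumerate(monitors)}
    rows_by_fold = defaultdict(list)
    for row, m in enumerate(monitor_ids):
        rows_by_fold[fold_of[m]].append(row)
    for fold in range(k):
        test_idx = rows_by_fold[fold]
        train_idx = [r for f in range(k) if f != fold for r in rows_by_fold[f]]
        yield train_idx, test_idx


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Toy data: 20 hypothetical monitors with 3 daily readings each.
monitor_ids = [f"site{m:02d}" for m in range(20) for _ in range(3)]
folds = list(monitor_folds(monitor_ids, k=10))
```

In a full workflow, a model would be fitted on the rows in `train_idx`, evaluated with `r_squared` on the rows in `test_idx`, and the per-fold scores averaged. Splitting by site in this way estimates how well predictions transfer to locations without a monitor, which is the situation faced when predicting across the whole 1 km grid.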