Air pollution is a global environmental and health issue that is closely linked to climate change. A thorough understanding of the complex nonlinear phenomena that govern the spatiotemporal variability of air pollution is still lacking, although this knowledge is essential for defining effective strategies to safeguard public health and environmental sustainability, and to counteract climate change.
In recent decades, machine learning models (MLMs) have shown great potential in air pollution research owing to their ability to describe complex nonlinear phenomena. Moreover, thanks to interpretability methods developed in the field of Explainable Artificial Intelligence (XAI), MLM results can be interpreted to assess the impact of individual factors, and of their interrelationships, on the model output. These methods also provide visual representations that facilitate the comprehension of such complex phenomena.
In this study, the XGBoost algorithm and the SHAP (SHapley Additive exPlanations) method have been employed to explore the influence of several driving factors on air quality index (AQI) variability, namely air pollutants, including surface ozone (O3) and fine particulate matter (PM2.5), and meteorological parameters.
Based on air pollutant and meteorological data acquired at different types of air quality monitoring stations over the 2018-2022 period, an XGBoost MLM has been developed to simulate the temporal pattern of the AQI, obtaining good model performance. Subsequently, the SHAP method has been employed to explore the importance of each driving factor and its relationship with the model output. Special focus is given to the interaction effects among driving factors on the AQI.
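
As a purely illustrative aside, the idea behind Shapley-value attribution can be sketched on a toy function. The sketch below is not the study's pipeline: the function `toy_aqi`, its coefficients, the feature names and the baseline values are all hypothetical, and the exact enumeration over feature coalitions stands in for the far more efficient tree-based estimation that SHAP applies to XGBoost models. It only shows how the gap between a prediction and a reference prediction is split among driving factors, and how an interaction term is shared between the factors involved.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of `model` at point `x` relative to `baseline`.

    Features outside a coalition are replaced by their baseline value.
    The cost grows exponentially with the number of features, so this is
    only practical for toy examples.
    """
    names = list(x)
    n = len(names)

    def value(coalition):
        # Evaluate the model with coalition features taken from x
        # and all remaining features taken from the baseline.
        point = {k: (x[k] if k in coalition else baseline[k]) for k in names}
        return model(point)

    phi = {}
    for i in names:
        others = [k for k in names if k != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi[i] = total
    return phi


def toy_aqi(p):
    # Hypothetical response: two additive terms plus an O3 x temperature
    # interaction. The coefficients are invented for illustration only.
    return 0.5 * p["o3"] + 1.0 * p["pm25"] + 0.02 * p["o3"] * p["temp"]


# Hypothetical observation and reference (baseline) conditions.
x = {"o3": 80.0, "pm25": 30.0, "temp": 25.0}
baseline = {"o3": 40.0, "pm25": 10.0, "temp": 15.0}

phi = shapley_values(toy_aqi, x, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:.2f}")
print(f"sum of contributions: {sum(phi.values()):.2f}")
print(f"prediction gap:       {toy_aqi(x) - toy_aqi(baseline):.2f}")
```

In this toy case the contributions add up to the difference between the prediction and the baseline prediction, which is the additivity property that SHAP relies on. The purely additive PM2.5 term is attributed entirely to PM2.5, whereas the effect of the interaction term is divided between O3 and temperature.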