Mortality modelling has advanced significantly in recent years, partly thanks to developments in machine learning and data science. The introduction of the transformer model into mortality forecasting by Wang et al. (2024) has greatly improved the accuracy of mortality forecasts, at least in comparison to parametric mortality models such as Li and Lee and to more recent benchmarks. The transformer is an architecture that is highly effective at processing long sequences of information.
We propose an innovative approach that integrates the dependence between countries' mortality directly into the transformer model. The approach relies on a similarity matrix, constructed with the method proposed in Gouthon and Milhaud (2025), that combines mortality data with exogenous variables that affect mortality, such as GDP or alcohol consumption. This matrix quantifies the degree of similarity between countries. In the proposed transformer architecture, the encoder processes the temporal dependence within each series, and the decoder addresses inter-country dependence via the similarity matrix. Hence, the proposed model simultaneously captures the long-term temporal dependencies specific to each country and the structural correlations between countries.
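To make the role of the similarity matrix concrete, the following is a minimal NumPy sketch of one way such a matrix could steer attention across countries. It is an illustration under stated assumptions, not the architecture of this paper: the Gaussian kernel is a generic stand-in for the construction of Gouthon and Milhaud (2025), the log-similarity attention bias is one possible coupling mechanism, and the names `similarity_matrix`, `cross_country_attention`, `bandwidth`, `features`, and `h` are hypothetical, with random numbers in place of real data.

```python
import numpy as np

def similarity_matrix(features, bandwidth=1.0):
    """Gaussian-kernel similarity between countries.

    Assumption: a generic kernel used for illustration only; it is NOT the
    construction of Gouthon and Milhaud (2025).
    """
    # Standardize each column so mortality summaries and covariates are comparable.
    z = (features - features.mean(axis=0)) / (features.std(axis=0) + 1e-8)
    # Squared Euclidean distance between every pair of countries.
    d2 = ((z[:, None, :] - z[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * bandwidth ** 2))

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_country_attention(h, sim):
    """Scaled dot-product attention over countries, biased by log-similarity.

    Assumption: adding log(sim) to the scores is one possible way to inject
    the similarity matrix; the paper's decoder may use it differently.
    """
    d = h.shape[-1]
    scores = h @ h.T / np.sqrt(d)                      # (n_countries, n_countries)
    weights = softmax(scores + np.log(sim + 1e-8), axis=-1)
    return weights @ h, weights

rng = np.random.default_rng(0)
n_countries, n_features, d_model = 4, 3, 8
# Hypothetical per-country features, e.g. a mortality summary, GDP, alcohol consumption.
features = rng.normal(size=(n_countries, n_features))
# Hypothetical per-country representations, standing in for encoder outputs.
h = rng.normal(size=(n_countries, d_model))

sim = similarity_matrix(features)
out, weights = cross_country_attention(h, sim)
```

In this sketch, each row of `weights` is a probability distribution over countries, so each country's output representation is a mixture that favours the countries it most resembles.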
This new approach performs well in a comprehensive comparative study against classical benchmarks such as Li and Lee, and against more recent deep learning models such as the long short-term memory network, the convolutional neural network, and the basic transformer model.