Statistical Methods for Safe Artificial Intelligence

Paolo Giudici

¹ Department of Economics, University of Pavia, Pavia, 27100, Italy

Academic Editor: Antonio Di Crescenzo

Published: 04 June 2026 by MDPI in The 2nd International Online Conference on Mathematics and Applications, session Statistics and Operational Research

Abstract:

Making reliable predictions is a crucial task in many Artificial Intelligence problems.
In mathematical terms, a prediction can be formulated as a classification problem, where each input is assigned to a class, or as a regression task, where we search for a suitable function that fits the data.
These two classes of prediction models serve different purposes and are useful in their own right: classification is applied to medical diagnosis, credit rating, and test grading, while regression is used to predict blood pressure, house prices, or energy consumption.
Current metrics, such as MSE (Mean Squared Error) for regression and AUC (Area Under the Curve) for classification, appear largely unrelated and problem-specific.
We propose a novel family of metrics, grounded in Cramér and energy distances, that is applicable to all types of response variables (continuous, ordinal, and nominal) and extendable to multivariate settings. We show the effectiveness and versatility of our metrics in a range of real applications, from finance and health care to human resource management. We also show that the metrics can be extended to assess not only accuracy but also explainability and robustness, in a joint AI assessment framework. To derive an integrated metric, we consider alternative aggregation schemes, ranging from weighted means to decision-theoretic methods.
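
For orientation, the two distances named in the abstract can be sketched for one-dimensional samples as follows. This is only an illustration of the standard definitions, not the proposed family of metrics, whose exact form the abstract does not give. The function names and the toy values are assumptions introduced here. The sketch uses the empirical (V-statistic) forms, for which the squared energy distance equals twice the Cramér distance in one dimension.

```python
from itertools import product


def mean_abs_diff(a, b):
    """Average of |u - v| over all pairs (u, v) with u in a and v in b."""
    return sum(abs(u - v) for u, v in product(a, b)) / (len(a) * len(b))


def energy_distance_sq(x, y):
    """Squared energy distance between two samples:
    2 E|X - Y| - E|X - X'| - E|Y - Y'|, with expectations taken over all pairs."""
    return 2 * mean_abs_diff(x, y) - mean_abs_diff(x, x) - mean_abs_diff(y, y)


def cramer_distance(x, y):
    """Integral of the squared difference between the two empirical CDFs."""
    grid = sorted(set(x) | set(y))
    total = 0.0
    # Both empirical CDFs are constant on each interval [left, right).
    for left, right in zip(grid, grid[1:]):
        f = sum(v <= left for v in x) / len(x)
        g = sum(v <= left for v in y) / len(y)
        total += (f - g) ** 2 * (right - left)
    return total


if __name__ == "__main__":
    # Hypothetical observed and predicted values, for illustration only.
    observed = [1.0, 2.0, 3.5, 4.0]
    predicted = [1.2, 2.1, 3.0, 4.4]
    print("Cramér distance:", cramer_distance(observed, predicted))
    print("Squared energy distance:", energy_distance_sq(observed, predicted))
```

Both quantities compare whole distributions rather than individual prediction errors, and both are zero when the two samples coincide. This is one reason such distances are natural candidates for metrics that do not depend on the type of response variable.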

Keywords: AI risk metrics, AI governance, Accuracy, Explainability, Robustness
