A comparative analysis of metric combinations in face verification with machine learning techniques
* 1, 2 , 3 , 3
1  Department of Computer Science and Artificial Intelligence, University of Alicante, Alicante, Spain
2  ValgrAI – Valencian Graduate School and Research Network for Artificial Intelligence, Comunidad Valenciana, Spain
3  Institute for Computer Research, University of Alicante, Alicante, Spain
Academic Editor: Francesco Dell'olio

Abstract:

Face verification, a critical task in computer vision with significant implications for security, surveillance, and biometric applications, involves determining whether two facial images represent the same individual, even when captured under varying conditions such as changes in lighting, pose, or facial expression. Despite recent advances in the field, achieving high accuracy in face verification remains challenging, especially in scenarios involving occlusions or poor image quality. Improving the methods used to compare facial embeddings has become a key area of research for developing more robust and reliable face verification systems. Traditionally, metrics such as L1, L2, and cosine similarity have been employed to compare facial embeddings in computer vision. However, when used in isolation, each metric has inherent limitations, particularly in its ability to generalize across complex and diverse datasets. This study explores the effectiveness of combining various metrics to enhance the comparison of facial embeddings. We aim to improve accuracy by leveraging state-of-the-art CNN-based face verification methods, including AdaFace and ArcFace, as well as advanced vision transformer-based approaches such as Swin transformers. To achieve this, we developed a range of metric combinations using machine learning techniques, including Logistic Regression, k-Nearest Neighbors, Support Vector Machines, LightGBM, and XGBoost. The CASIA-WebFace dataset was used for training the metric-combining models, and the BUPT-BalancedFace dataset was used for evaluation, ensuring balanced comparisons across demographic groups. The experimental results showed that, while cosine similarity outperformed the L1 and L2 metrics, combining multiple metrics was more effective than relying on a single metric for both CNN-based and vision transformer-based face verification methods. CNN-based models were more effective than transformer-based ones.
The combined strategies resulted in models that achieved a better balance among recall, precision, and F1-score. In particular, the accuracy of these models increased by 1.1% compared to the best models that used a single metric.
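
As a rough illustration of the metric-combination idea described in the abstract, the following minimal sketch computes the L1 distance, L2 distance, and cosine similarity for a pair of embeddings and feeds the three values to a logistic regression that predicts whether the pair shows the same person. It is not the authors' pipeline. The embeddings are synthetic random vectors standing in for the output of a face model such as AdaFace or ArcFace, the logistic regression is a small hand-written gradient-descent version rather than a library implementation, and all function names (`pair_features`, `make_pair`, `train_logistic`, and so on) are hypothetical.

```python
import math
import random


def pair_features(a, b):
    """Return [L1 distance, L2 distance, cosine similarity] for two embeddings."""
    l1 = sum(abs(x - y) for x, y in zip(a, b))
    l2 = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return [l1, l2, dot / (norm_a * norm_b)]


def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def make_pair(rng, same, dim=16, noise=0.1):
    """Synthetic stand-in for a face pair (assumption: no real face model is used)."""
    base = [rng.gauss(0, 1) for _ in range(dim)]
    other = base if same else [rng.gauss(0, 1) for _ in range(dim)]
    a = [x + rng.gauss(0, noise) for x in base]
    b = [x + rng.gauss(0, noise) for x in other]
    return normalize(a), normalize(b)


def make_dataset(rng, n_pairs):
    """Build metric features X and same/different labels y, alternating classes."""
    X, y = [], []
    for i in range(n_pairs):
        same = i % 2 == 0
        a, b = make_pair(rng, same)
        X.append(pair_features(a, b))
        y.append(1 if same else 0)
    return X, y


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, b, x):
    """Probability that the pair described by metric features x is a match."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logistic(X, y, lr=0.1, epochs=500):
    """Batch gradient descent on the log loss; returns (weights, bias)."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


rng = random.Random(0)  # fixed seed so the sketch is reproducible
X_train, y_train = make_dataset(rng, 200)
X_test, y_test = make_dataset(rng, 100)

w, b = train_logistic(X_train, y_train)
predictions = [1 if predict_proba(w, b, x) >= 0.5 else 0 for x in X_test]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
print(f"held-out accuracy on synthetic pairs: {accuracy:.2f}")
```

The learned weights show how much each metric contributes to the decision. In a real setting, the same three-value feature vector could be passed to any of the other combiners named in the abstract in place of the hand-written logistic regression.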

Keywords: face verification; computer vision; facial embeddings; machine learning; metric combination