Using machine learning (ML) algorithms based on voice disorders to identify Parkinson's disease

¹ Department of Electronics and Communication Engineering, Potti Sriramulu Chalavadi Mallikarjuna Rao College of Engineering and Technology (Autonomous)

Academic Editor: Eugenio Vocaturo

Published: 02 December 2024 by MDPI in The 5th International Electronic Conference on Applied Sciences session Computing and Artificial Intelligence

Abstract:

Parkinson’s disease, first described by James Parkinson, is a prevalent neurological disorder of the central nervous system characterized by motor and cognitive impairments, including tremors, impaired movement, and speech difficulties. It affects approximately 10 million people worldwide, according to the WHO. Early diagnosis is critical, as delayed detection may lead to irreversible damage. Because the disease degrades the motor control involved in speech production, voice measurements are a valuable diagnostic signal. This work presents a machine learning-based approach for detecting Parkinson’s disease from speech features. The dataset, obtained from the UCI Machine Learning Repository, consists of biomedical voice measurements derived from speech recordings: 195 recordings (147 from individuals with Parkinson’s and 48 from healthy controls), each described by 21 features. Five classification algorithms were implemented: K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Logistic Regression, AdaBoost, and Random Forest. Each was evaluated using accuracy, precision, recall, F1-score, and ROC-AUC. In the experiments, KNN performed best, achieving 98.31% accuracy, perfect precision for the normal class (1.00), and perfect recall for Parkinson’s cases (1.00), meaning that no Parkinson’s case in the evaluation was missed. Its F1-score of 0.98 indicates a strong balance between precision and recall. Although AdaBoost matched KNN in accuracy, its slightly lower recall for Parkinson’s cases (0.97) makes KNN the preferred choice. KNN is therefore proposed as the most reliable of the evaluated models for Parkinson’s disease classification.
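To make the reported metrics concrete, the following minimal Python sketch shows how accuracy, per-class precision and recall, and F1 follow from a binary confusion matrix. The abstract does not report a confusion matrix or the size of the test split, so the counts used here (59 test samples: 44 Parkinson’s and 15 healthy, with one healthy sample misclassified) are a hypothetical example chosen only because they are arithmetically consistent with the stated figures. The `ConfusionMatrix` class and its method names are illustrative and not part of the original work.

```python
from dataclasses import dataclass


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


@dataclass(frozen=True)
class ConfusionMatrix:
    """Binary confusion matrix with Parkinson's as the positive class."""

    tp: int  # Parkinson's samples correctly flagged
    fn: int  # Parkinson's samples missed
    fp: int  # healthy samples flagged as Parkinson's
    tn: int  # healthy samples correctly cleared

    def accuracy(self) -> float:
        total = self.tp + self.fn + self.fp + self.tn
        return _ratio(self.tp + self.tn, total)

    def precision(self, positive: bool = True) -> float:
        # Of the samples predicted as this class, how many truly belong to it?
        hit, false_alarm = (self.tp, self.fp) if positive else (self.tn, self.fn)
        return _ratio(hit, hit + false_alarm)

    def recall(self, positive: bool = True) -> float:
        # Of the samples that truly belong to this class, how many were found?
        hit, miss = (self.tp, self.fn) if positive else (self.tn, self.fp)
        return _ratio(hit, hit + miss)

    def f1(self, positive: bool = True) -> float:
        # Harmonic mean of precision and recall for the chosen class.
        p, r = self.precision(positive), self.recall(positive)
        return _ratio(2 * p * r, p + r) if (p + r) else 0.0


# HYPOTHETICAL counts (not taken from the paper): 59 test samples, one error,
# where a healthy sample is predicted as Parkinson's.
cm = ConfusionMatrix(tp=44, fn=0, fp=1, tn=14)

# Unweighted mean of the two per-class F1 scores.
macro_f1 = (cm.f1(positive=True) + cm.f1(positive=False)) / 2

print(f"accuracy:               {cm.accuracy():.4f}")
print(f"precision (normal):     {cm.precision(positive=False):.2f}")
print(f"recall (Parkinson's):   {cm.recall(positive=True):.2f}")
print(f"macro-averaged F1:      {macro_f1:.2f}")
```

Under these assumed counts, a single false positive leaves recall for Parkinson’s cases and precision for the normal class untouched, which is why both can equal 1.00 while accuracy stays below 100%.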

Keywords: Parkinson’s Disease, Speech Recordings, Machine Learning, K-Nearest Neighbors (KNN), AdaBoost, Dysphonia.
