Development of a Predictive Model for Mild Cognitive Impairment in Parkinson's Disease with Normal Cognition by Combining Kernel-based Machine Learning and C5.0

Haewon Byeon

doi:10.3390/ASEC2021-11147

Abstract:

It has been reported that mild cognitive impairment (MCI), known as the preclinical phase of dementia, may last up to seven years, and that appropriate therapeutic interventions in the MCI stage can delay the progression to dementia by approximately five years. As a result, many studies have focused on detecting MCI, an intermediate stage between normal aging and Alzheimer's dementia, as early as possible. Because longitudinal studies have reported that patients with Parkinson's disease frequently suffer from cognitive impairment, recent studies have paid more attention to mild cognitive impairment in Parkinson's disease (PDMCI) as well as Alzheimer's MCI. Although PDMCI occurs frequently in patients with Parkinson's disease, its characteristics are much less well understood than those of Alzheimer's MCI and vascular MCI. A number of previous studies have reported that the most critical characteristic of PDMCI is executive function impairment due to frontal lobe dysfunction, which is found at an early stage. However, PDMCI is hard to detect from the degree of executive function alone because early-stage MCI due to Alzheimer's disease or vascular dementia also shows executive function impairment. In particular, since Parkinson's disease progresses slowly and symptoms appear little by little, patients and caregivers can mistake the cognitive problems caused by PDMCI for the cognitive frailty of normal aging, which makes early diagnosis difficult. MCI is diagnosed based on an interview, evaluation of cognitive function through standardized neuropsychological tests, and brain imaging. However, brain imaging is of limited use for early diagnosis: although it can detect cerebrovascular disease and brain atrophy, it can do so only when these changes are very advanced. Therefore, neuropsychological tests that evaluate cognitive function are known to be effective screening tests for detecting MCI early.
Meanwhile, studies in the medical field have in recent years steadily used data mining to predict the risk probability or high-risk groups of a disease. However, it is challenging to predict diseases accurately with a single machine learning algorithm (a single learner). For example, the artificial neural network technique offers high prediction accuracy but cannot explain the derived results. The decision tree technique, in contrast, allows clinicians to easily interpret its results, but it is exposed to a higher overfitting risk than other machine learning algorithms such as the support vector machine (SVM), and its results and accuracy can change depending on the type and order of the input variables. To overcome these limitations, hybrid models combining an artificial neural network and a decision tree have recently been used to obtain higher predictive and explanatory power than a single machine learning model. This study developed a PDMCI predictive model that considers health behaviors, environmental factors, medical history, physical function, depression, and cognitive level, using a hybrid model combining C-SVM and C5.0, and provides baseline data for the prevention and early management of Parkinson's dementia. After approval by the Research Ethics Review Committee of the National Biobank of Korea, this study analyzed 185 patients with Parkinson's disease (75 with normal cognition and 110 with PDMCI). It used 48 variables (diagnostic data) as explanatory variables, including motor symptoms of Parkinson's disease, non-motor symptoms of Parkinson's disease, and sleep disorders. To develop the PDMCI predictive model, this study chose “C5.0”, implemented by Kuhn et al. (2013), for the decision tree algorithm and “kernel-based machine learning (kernlab)”, implemented by Karatzoglou et al. (2016), for the SVM.
The kernlab package provides a linear kernel function (vanilladot) as well as a polynomial kernel function (polydot) and a radial basis kernel function (RBFdot), the latter two of which enable nonlinear SVM analysis. This study developed seven machine learning models: three hybrid models built by blending (polydot+C5.0, vanilladot+C5.0, and RBFdot+C5.0) and four single machine learning models (polydot, vanilladot, RBFdot, and C5.0). The predictive performance of these models was compared using 10-fold cross-validation. Among the seven models, RBFdot+C5.0 showed the best performance in predicting PDMCI in Parkinson's disease patients with normal cognition (AUC=0.88). Based on these results, it is necessary to develop a customized screening program for the early detection of PDMCI in Parkinson's disease patients with normal cognition.
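
For readers unfamiliar with the three kernels and the evaluation protocol named above, the following is a minimal illustrative sketch in plain Python. It is not the study's code, which used the R packages kernlab and C5.0. The function names mirror kernlab's kernel names (kernlab spells the radial basis kernel `rbfdot`), and the formulas follow the standard definitions of the linear, polynomial, and Gaussian radial basis kernels. The default parameter values, the helper names `k_fold_indices` and `auc`, and the fold-assignment scheme are assumptions made for illustration only; the abstract does not state the hyperparameters or how the folds were drawn.

```python
import math
import random


def vanilladot(x, y):
    """Linear kernel: the plain inner product <x, y>."""
    return sum(a * b for a, b in zip(x, y))


def polydot(x, y, degree=2, scale=1.0, offset=1.0):
    """Polynomial kernel: (scale * <x, y> + offset) ** degree.

    The default values here are illustrative assumptions.
    """
    return (scale * vanilladot(x, y) + offset) ** degree


def rbfdot(x, y, sigma=1.0):
    """Gaussian radial basis kernel: exp(-sigma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sigma * sq_dist)


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k near-equal folds.

    Hypothetical helper: each fold serves once as the held-out test set
    while the remaining k-1 folds are used for training.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as one half.
    Labels are 1 (e.g. PDMCI) and 0 (e.g. normal cognition).
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

With 185 patients and `k=10`, this fold scheme yields five folds of 19 patients and five of 18. In a full pipeline, one would fit each of the seven models on nine folds, score the held-out fold, and compare the models by AUC on the held-out scores.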