An explainable machine learning approach for Alzheimer’s disease classification
The early diagnosis of Alzheimer’s disease (AD) presents a significant challenge due to the subtle biomarker changes often overlooked. Machine learning (ML) models offer a promising tool for identifying individuals at risk of AD. However, current research tends to prioritize ML accuracy while neglecting the crucial aspect of model explainability. The diverse nature of AD data and the limited dataset size introduce additional challenges, primarily related to high dimensionality. In this study, we leveraged a dataset obtained from the National Alzheimer’s Coordinating Center, comprising 169,408 records and 1024 features. After applying various steps to reduce the feature space. Notably, support vector machine (SVM) models trained on the selected features exhibited high performance when tested on an external dataset. SVM achieved a high F1 score of 98.9% for binary classification (distinguishing between NC and AD) and 90.7% for multiclass classification. Furthermore, SVM was able to predict AD progression over a 4-year period, with F1 scores reached 88% for binary task and 72.8% for multiclass task. To enhance model explainability, we employed two rule-extraction approaches: class rule mining and stable and interpretable rule set for classification model. These approaches generated human-understandable rules to assist domain experts in comprehending the key factors involved in AD development. We further validated these rules using SHAP and LIME models, underscoring the significance of factors such as MEMORY, JUDGMENT, COMMUN, and ORIENT in determining AD risk. Our experimental outcomes also shed light on the crucial role of the Clinical Dementia Rating tool in predicting AD.
History
Author affiliation
College of Life Sciences/Cardiovascular SciencesVersion
- VoR (Version of Record)