TF-FusNet: A Novel Framework for Parkinson’s Disease Detection via Time-Frequency Domain Fusion
Speech recognition technology is an important tool for helping patients with Parkinson’s Disease (PD) speech disorders manage the disease, and it can improve both the objectivity of doctors’ diagnoses and patients’ quality of life. However, existing research does not comprehensively consider time-frequency domain information. To address this problem, we propose the Time-Frequency Fusion Network (TF-FusNet), a model that captures both long-range temporal dependencies and frequency-domain representations of speech. First, we design a time-frequency domain feature fusion approach that combines temporal information with frequency-domain representations for speech classification. Second, we extract five different speech cepstral coefficient features to provide multiple perspectives on the disease. Third, we design a decision layer based on a voting strategy to produce the final speech classification result. Experiments on the London King’s College Mobile Device Voice Recording (MDVR-KCL) and the Italian Parkinson’s Voice Database (IPVS) datasets show that TF-FusNet achieves classification accuracies of 98.45% and 99.63%, respectively, outperforming other state-of-the-art (SOTA) models.
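The voting-based decision layer can be illustrated with a minimal sketch. The abstract does not state the voting rule, so hard majority voting is assumed here. The branch names (`feature_1` to `feature_5`) and the label encoding (1 = PD, 0 = healthy control) are hypothetical placeholders, not the paper's implementation.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label chosen by the most branches.

    Assumption: hard majority voting. Ties go to the label that
    appears first in `predictions`.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


# Hypothetical outputs of five per-feature classifiers for one recording
# (1 = PD, 0 = healthy control); names and values are illustrative only.
branch_predictions = {
    "feature_1": 1,
    "feature_2": 1,
    "feature_3": 0,
    "feature_4": 1,
    "feature_5": 0,
}

final_label = majority_vote(list(branch_predictions.values()))
```

With an odd number of branches and two classes, a hard majority vote cannot tie, which is one practical reason to combine five feature-specific predictions.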
History
Author affiliation
College of Science & Engineering Comp' & Math' Sciences
Source
The 13th CCF International Conference on Natural Language Processing and Chinese Computing (NLPCC 2024). Hangzhou, China, November 1-3, 2024.
Version
- AM (Accepted Manuscript)