Fingerspelling Recognition by 12-Layer CNN with Stochastic Pooling
Fingerspelling is a method of spelling words via hand movements. This study proposes a novel fingerspelling recognition system. Our dataset contains 1320 fingerspelling images. The method is based on a convolutional neural network (CNN) with a 12-layer backbone. In particular, stochastic pooling (SP) is used to mitigate the problems caused by max pooling and average pooling. In addition, an improved 20-way data augmentation method is proposed to reduce overfitting. We dub the method CNNSP. CNNSP achieves a micro-averaged F1 (MAF) score of 90.04 ± 0.82%. In contrast, the MAFs of l2-pooling, average pooling, and max pooling are 86.21 ± 1.12%, 87.54 ± 1.39%, and 89.07 ± 0.78%, respectively. CNNSP attains better results than eight state-of-the-art fingerspelling recognition methods, and SP outperforms l2-pooling, average pooling, and max pooling.
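For readers unfamiliar with stochastic pooling, the following is a minimal sketch of the usual formulation, not the authors' implementation: within each pooling window, one activation is sampled with probability proportional to its value. The function name `stochastic_pool2d`, the 2 x 2 non-overlapping window, and the assumption of non-negative (for example, post-ReLU) activations are illustrative choices that the abstract does not specify.

```python
import numpy as np

def stochastic_pool2d(x, size=2, rng=None):
    """Stochastic pooling over non-overlapping size x size windows of a 2-D map.

    Assumes x holds non-negative activations (hypothetical setup, e.g. after ReLU).
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = x.shape
    out_h, out_w = h // size, w // size
    out = np.zeros((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            window = x[i * size:(i + 1) * size, j * size:(j + 1) * size].ravel()
            total = window.sum()
            if total <= 0:
                continue  # all-zero window: output stays 0
            probs = window / total  # p_k = a_k / sum of activations in the window
            out[i, j] = rng.choice(window, p=probs)  # sample one activation
    return out
```

Unlike max pooling, which always keeps the largest activation, and average pooling, which dilutes strong activations with weak ones, this sampling keeps large activations most of the time while still letting smaller ones through occasionally.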
Funding
Hope Foundation for Cancer Research, UK (RM60G0680)
Royal Society International Exchanges Cost Share Award, UK (RP202G0230)
Medical Research Council Confidence in Concept Award, UK (MC_PC_17171)
Global Challenges Research Fund (GCRF), UK (P202PF11)
Sino-UK Industrial Fund, UK (RP202G0289)
British Heart Foundation Accelerator Award, UK (AA/18/3/34220)
History
Citation
Mobile Netw Appl (2022). https://doi.org/10.1007/s11036-021-01900-8
Author affiliation
School of Computing and Mathematical Sciences, University of Leicester
Version
- AM (Accepted Manuscript)