DUAL: A Dual-Stage Approach for Facial Expression Recognition Based on Contrastive Learning
journal contribution
posted on 2025-10-13, 11:27 authored by A Zhu, X Jia, L Yang, Huiyu Zhou, W Su
<p dir="ltr">Facial Expression Recognition (FER) remains a challenging task in computer vision. Recent works have shown excellent overall recognition accuracy, but their accuracy decreases significantly when recognizing similar expressions. This is due to inter-class homogeneity and intra-class heterogeneity. To address these issues, we propose a novel dual-stage network called DUAL, inspired by contrastive learning. First, we increase the distance between negative samples while reducing the distance between positive ones. This is achieved by dynamically updating pairs of comparison samples. Second, we introduce a two-stage network architecture. The first stage uses two branches to extract image features and facial keypoint features. These branches interact to learn coarse-grained features through mutual guidance. The second stage focuses on fine-grained features using scale-specific residual blocks. This allows the model to identify facial regions that are critical for recognizing expressions. We conducted extensive experiments on multiple datasets. The results show that DUAL surpasses state-of-the-art models in terms of performance. Additionally, the model shows high accuracy even in noisy conditions, highlighting its robustness.</p>
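The contrastive idea in the abstract (pull positive pairs together, push negative pairs apart) can be illustrated with a generic margin-based pairwise loss. This is a minimal sketch and not the authors' implementation: the function names, the `margin` parameter, and the choice of Euclidean distance are all assumptions made for illustration, and the dynamic updating of comparison pairs described in the abstract is not modelled here.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pairwise_contrastive_loss(features, labels, margin=1.0):
    """Mean margin-based contrastive loss over all unordered pairs.

    Hypothetical helper, not taken from the DUAL paper.
    Same-label (positive) pairs are penalised by their squared distance,
    so minimising the loss pulls them together. Different-label (negative)
    pairs are penalised only while they are closer than `margin`, so
    minimising the loss pushes them apart.
    """
    total, count = 0.0, 0
    n = len(features)
    for i in range(n):
        for j in range(i + 1, n):
            d = euclidean(features[i], features[j])
            if labels[i] == labels[j]:
                total += d ** 2
            else:
                total += max(0.0, margin - d) ** 2
            count += 1
    return total / count if count else 0.0
```

In this sketch, two embeddings of the same expression that sit far apart produce a large loss, while embeddings of different expressions contribute nothing once they are at least `margin` apart.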
Funding
Key Research and Development Program of Gansu Province under Grant 24YFGA004 and in part by the National Natural Science Foundation of China (NSFC) under Grant 62462053.
Author affiliation
University of Leicester
College of Science & Engineering
Comp' & Math' Sciences