Posted on 2021-03-09, 11:04, authored by M Ding, N Li, J Song, R Zhang, X Zhang, Huiyu Zhou
In recent years, owing to their mobility and wide coverage, unmanned aerial vehicles (UAVs) have been widely applied in surveillance systems. Human action recognition in UAV video is essential for understanding surveillance video. However, existing action recognition methods are computationally heavy, which makes them hard to deploy in real applications. In this paper, a lightweight action recognition method for UAV video (LARMUV) is proposed. The method is based on TSN and adopts MobileNetV3 as its backbone, which greatly reduces the amount of computation and the number of parameters. A self-attention mechanism is adopted to capture the temporal structure among different frames. For the loss function, Focal Loss is used to put more focus on hard, misclassified examples. Finally, knowledge distillation is employed to enhance the performance of the model by transferring knowledge from a larger teacher model to the student model. Experimental results on HMDB51, UCF101 and a UAV dataset show that the method achieves competitive performance compared with baseline methods while running in real time.
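The two loss terms named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no hyperparameters, so the focusing parameter `gamma`, the softmax `temperature`, and all function names below are assumptions chosen for illustration, and the distillation term uses the common temperature-scaled KL divergence form, which may differ from the one used in the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Standard cross-entropy for one example."""
    return -math.log(softmax(logits)[target])


def focal_loss(logits, target, gamma=2.0):
    """Focal loss for one example: -(1 - p_t)^gamma * log(p_t).

    gamma=2.0 is an assumed value, not one taken from the paper.
    Well-classified examples (p_t near 1) are down-weighted, so
    hard, misclassified examples dominate the total loss.
    """
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """Temperature-scaled KL(teacher || student) on softened outputs.

    temperature=4.0 is an assumed value; the T^2 factor keeps the
    gradient scale comparable across temperatures.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * math.log(pt / ps)
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0
    )
    return temperature ** 2 * kl


# Hypothetical logits for a 3-class problem.
logits = [5.0, 0.0, 0.0]
easy_ratio = focal_loss(logits, 0) / cross_entropy(logits, 0)  # target is the confident class
hard_ratio = focal_loss(logits, 1) / cross_entropy(logits, 1)  # target is a low-probability class
```

With these hypothetical logits, `easy_ratio` is far smaller than `hard_ratio`, which is the down-weighting of easy examples that the abstract describes. In training, a distillation term of this kind is typically added to the classification loss with a weighting coefficient, which the abstract does not specify.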
Author affiliation
School of Informatics
Source
2020 IEEE 3rd International Conference on Electronics and Communication Engineering (ICECE), 14-16 Dec. 2020, Xi'An, China
Version
AM (Accepted Manuscript)
Published in
2020 IEEE 3rd International Conference on Electronics and Communication Engineering (ICECE)
Publisher
Institute of Electrical and Electronics Engineers (IEEE)