Texture Classification Using Pair-wise Difference Pooling Based Bilinear Convolutional Neural Networks
Journal contribution posted on 2020-09-10, 12:16, authored by Xinghui Dong, Huiyu Zhou, Junyu Dong
Texture is normally represented by aggregating local features based on the assumption of spatial homogeneity. Effective texture features remain a research focus even though both hand-crafted and deep learning approaches have been extensively investigated. Motivated by the success of Bilinear Convolutional Neural Networks (BCNNs) in fine-grained image recognition, we propose combining the BCNN with Pair-wise Difference Pooling (PDP), i.e. BCNN-PDP, for texture classification. The BCNN-PDP is built on top of a set of feature maps extracted at a convolutional layer of a pre-trained CNN. Compared with the outer product used by the original BCNN feature set, the pair-wise difference not only captures the pair-wise relationship between two sets of features but also encodes the difference between each pair of features. Considering the importance of gradient data to the representation of image structures, we further generalise the BCNN-PDP feature set to two sets of feature maps computed from the original image and its gradient magnitude map respectively, i.e. the Fused BCNN-PDP (F-BCNN-PDP) feature set. In addition, the BCNN-PDP can be applied to two different CNNs, in which case it is referred to as the Asymmetric BCNN-PDP (A-BCNN-PDP). The three PDP-based BCNN feature sets can also be extracted at multiple scales. Since the dimensionality of the BCNN feature vectors is very high, we propose a new, simple Block-wise PCA (BPCA) method in order to derive more compact feature vectors. The proposed methods are tested on seven different datasets along with 21 baseline feature sets. The results show that the proposed feature sets are superior, or at least comparable, to their counterparts across different datasets.
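To make the contrast between outer-product pooling and pair-wise difference pooling concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not give the exact PDP formula, so the sketch assumes the absolute difference between every pair of channels, averaged over spatial locations. The function names and the list-of-rows feature layout are hypothetical and chosen only for illustration.

```python
from typing import List

# One row per spatial location, one column per channel (assumed layout).
Features = List[List[float]]


def outer_product_pooling(a: Features, b: Features) -> List[float]:
    """Average over locations of the outer product of a[l] and b[l], flattened."""
    n, ca, cb = len(a), len(a[0]), len(b[0])
    pooled = [0.0] * (ca * cb)
    for xa, xb in zip(a, b):
        for i in range(ca):
            for j in range(cb):
                pooled[i * cb + j] += xa[i] * xb[j] / n
    return pooled


def pairwise_difference_pooling(a: Features, b: Features) -> List[float]:
    """Average over locations of |a[l][i] - b[l][j]|, flattened.

    Assumption: the absolute value is used here so that the pooled
    statistic does not reduce to a difference of channel means.
    """
    n, ca, cb = len(a), len(a[0]), len(b[0])
    pooled = [0.0] * (ca * cb)
    for xa, xb in zip(a, b):
        for i in range(ca):
            for j in range(cb):
                pooled[i * cb + j] += abs(xa[i] - xb[j]) / n
    return pooled


# Symmetric case: both inputs are the same feature maps
# (2 locations, 2 channels).
feats = [[1.0, 2.0], [3.0, 4.0]]
print(outer_product_pooling(feats, feats))        # [5.0, 7.0, 7.0, 10.0]
print(pairwise_difference_pooling(feats, feats))  # [0.0, 1.0, 1.0, 0.0]
```

Passing two different feature sets with the same number of locations, for example maps from the image and from its gradient magnitude map, or maps from two different CNNs, would correspond loosely to the fused and asymmetric variants. In each case the output length is the product of the two channel counts, which illustrates why the abstract describes the feature vectors as very high-dimensional.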
Citation: IEEE Transactions on Image Processing, vol. 29, pp. 8776-8790. https://doi.org/10.1109/TIP.2020.3019185
Author affiliation: School of Informatics
Version: AM (Accepted Manuscript)