Small Sample Image Segmentation By Coupling Convolutions and Transformers

journal contribution
Posted on 2024-01-04, 16:20; authored by H Qi, Huiyu Zhou, J Dong, X Dong

Compared with natural image segmentation, small sample image segmentation tasks, such as medical image segmentation and defect detection, have received less attention. Recent studies have combined Convolutional Neural Networks (CNNs) and Transformers in serial or interleaved architectures to incorporate long-range dependencies into the features extracted by CNNs. In this study, we argue that these architectures limit what the combination of CNNs and Transformers can achieve. To this end, we propose a dual-stream small sample image segmentation network, the Interactive Coupling of Convolutions and Transformers Based UNet (ICCT-UNet), motivated by the success of the UNet in small sample image segmentation. Within this network, a CNN stream runs in parallel with a Transformer stream, and the two streams exchange features inside each block through the proposed Window-Based Multi-head Cross-Attention (W-MHCA) mechanism. To derive an overall segmentation, the features learned by both streams are further fused using a Residual Fusion Module (RFM). Experimental results show that the ICCT-UNet outperforms, or at least performs comparably to, its counterparts on eight sets of medical and defect images. We attribute these promising results to the effective combination of local and global features achieved by the proposed interactive coupling method.
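The feature exchange described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the internals of W-MHCA or the RFM, so several simplifications are assumed here. Tokens form a 1-D sequence rather than 2-D image windows, the query/key/value projections are the identity rather than learned weights, the exchange is taken to run in both directions, and `fuse` is a plain element-wise sum standing in for the RFM. All function names (`cross_attend`, `windowed_cross_attention`, `coupled_block`, `fuse`) are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, context):
    """Scaled dot-product attention: each query attends to every context vector.
    ASSUMPTION: identity Q/K/V projections (a real layer would learn them)."""
    dim = len(queries[0])
    attended = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in context]
        weights = softmax(scores)
        attended.append([sum(w * v[d] for w, v in zip(weights, context))
                         for d in range(dim)])
    return attended


def windowed_cross_attention(target, source, window, heads):
    """Update `target` tokens with information from `source` tokens.
    Attention is restricted to non-overlapping windows, and channels are
    split evenly across heads. ASSUMPTION: 1-D windows, residual add."""
    n, dim = len(target), len(target[0])
    assert len(source) == n and n % window == 0 and dim % heads == 0
    head_dim = dim // heads
    updated = []
    for start in range(0, n, window):
        t_win = target[start:start + window]
        s_win = source[start:start + window]
        per_head = []
        for h in range(heads):
            lo, hi = h * head_dim, (h + 1) * head_dim
            per_head.append(cross_attend([t[lo:hi] for t in t_win],
                                         [s[lo:hi] for s in s_win]))
        for i, token in enumerate(t_win):
            # Concatenate the head outputs, then add them to the input token.
            attended = [x for head in per_head for x in head[i]]
            updated.append([a + b for a, b in zip(token, attended)])
    return updated


def coupled_block(cnn_tokens, transformer_tokens, window=2, heads=1):
    """One dual-stream block. ASSUMPTION: the exchange is bidirectional."""
    new_cnn = windowed_cross_attention(cnn_tokens, transformer_tokens, window, heads)
    new_transformer = windowed_cross_attention(transformer_tokens, cnn_tokens, window, heads)
    return new_cnn, new_transformer


def fuse(cnn_tokens, transformer_tokens):
    """Placeholder for the RFM (ASSUMPTION): element-wise sum of both streams."""
    return [[a + b for a, b in zip(c, t)]
            for c, t in zip(cnn_tokens, transformer_tokens)]


# Toy usage: four tokens with two channels each, in two windows of two tokens.
cnn = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [3.0, 1.0]]
trans = [[1.0, 1.0], [1.0, 1.0], [5.0, 5.0], [5.0, 5.0]]
cnn_out, trans_out = coupled_block(cnn, trans)
fused = fuse(cnn_out, trans_out)
```

Restricting attention to windows keeps the cost of each exchange proportional to the window size rather than to the full token count, which is the usual motivation for window-based attention. Whether the paper's windows shift or overlap is not stated in the abstract.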

Funding

National Natural Science Foundation of China (Funder ID: 10.13039/501100001809; Grant Number: 42176196)

Young Taishan Scholars Program (Grant Number: tsqn201909060)

History

Author affiliation

School of Computing and Mathematical Sciences, University of Leicester

Version

  • AM (Accepted Manuscript)

Published in

IEEE Transactions on Circuits and Systems for Video Technology

Publisher

Institute of Electrical and Electronics Engineers

ISSN

1051-8215

Copyright date

2023

Available date

2024-01-04

Language

en
