Distance-based Weighted Transformer Network for image completion

Shamsolmoali, P; Zareapoor, M; Zhou, Huiyu; Li, X; Lu, Y

journal contribution

posted on 2023-12-11, 16:38 authored by P Shamsolmoali, M Zareapoor, Huiyu Zhou, X Li, Y Lu

The challenge of image generation has been effectively modeled as a problem of structure priors or transformation. However, existing models perform unsatisfactorily in understanding global input image structures because of particular inherent features (for example, a local inductive prior). Recent studies have shown that self-attention is an efficient modeling technique for image completion problems. In this paper, we propose a new architecture that relies on a Distance-based Weighted Transformer (DWT) to better understand the relationships between an image’s components. In our model, we leverage the strengths of both Convolutional Neural Networks (CNNs) and DWT blocks to enhance the image completion process. Specifically, CNNs are used to augment the local texture information of coarse priors, and DWT blocks are used to recover certain coarse textures and coherent visual structures. Unlike current approaches that generally use CNNs to create feature maps, we use the DWT to encode global dependencies and compute distance-based weighted feature maps, which substantially reduces the problem of visual ambiguities. Meanwhile, to better produce repeated textures, we introduce Residual Fast Fourier Convolution (Res-FFC) blocks to combine the encoder’s skip features with the coarse features provided by our generator. Furthermore, a simple yet effective technique is proposed to normalize the non-zero values of convolutions and to fine-tune the network layers for regularization of the gradient norms, providing an efficient training stabilizer. Extensive quantitative and qualitative experiments on three challenging datasets demonstrate the superiority of our proposed model compared to existing approaches.
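The abstract does not spell out how a Res-FFC block is built, so the following is only a minimal NumPy sketch of the general idea behind a residual Fourier convolution, not the authors' implementation. It assumes that the skip and coarse features are fused by simple addition, that the frequency-domain step is a 1x1 channel mixing over stacked real and imaginary parts followed by a ReLU, and that the weights are supplied by the caller; the names `spectral_transform`, `res_ffc_block`, and `weight` are hypothetical.

```python
import numpy as np


def spectral_transform(x, weight):
    """Mix channels in the frequency domain, giving every output pixel
    a receptive field that covers the whole feature map.

    x:      real array of shape (C, H, W)
    weight: real array of shape (2C, 2C); a hypothetical 1x1 convolution
            applied to the stacked real and imaginary parts
    """
    c, h, w = x.shape
    # Real 2-D FFT over the spatial axes: (C, H, W // 2 + 1), complex.
    freq = np.fft.rfft2(x, axes=(-2, -1), norm="ortho")
    # Stack real and imaginary parts along the channel axis: (2C, H, W // 2 + 1).
    stacked = np.concatenate([freq.real, freq.imag], axis=0)
    # 1x1 convolution = the same channel-mixing matrix at every frequency.
    mixed = np.einsum("oc,chw->ohw", weight, stacked)
    mixed = np.maximum(mixed, 0.0)  # ReLU non-linearity
    # Rebuild the complex spectrum and return to the spatial domain.
    spectrum = mixed[:c] + 1j * mixed[c:]
    return np.fft.irfft2(spectrum, s=(h, w), axes=(-2, -1), norm="ortho")


def res_ffc_block(skip, coarse, weight):
    """Fuse encoder skip features with coarse generator features, then add
    a global, frequency-domain correction on a residual path."""
    fused = skip + coarse  # assumption: fusion by element-wise addition
    return fused + spectral_transform(fused, weight)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    skip = rng.standard_normal((4, 8, 8))
    coarse = rng.standard_normal((4, 8, 8))
    weight = 0.1 * rng.standard_normal((8, 8))
    out = res_ffc_block(skip, coarse, weight)
    print(out.shape)
```

Because a single frequency coefficient depends on every spatial location, operating in the Fourier domain is one plausible way to propagate periodic patterns across a masked region, which is consistent with the abstract's stated goal of producing repeated textures.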

Funding

National Key Research and Development Program of China under Grant No. 2020AAA0107903

History

Author affiliation

School of Computing and Mathematical Sciences, University of Leicester

Version

AM (Accepted Manuscript)

Published in

Pattern Recognition

Volume

147

Publisher

Elsevier

issn

0031-3203

Copyright date

2023

Available date

2024-11-22

Publisher DOI

https://doi.org/10.1016/j.patcog.2023.110120

Language

en

Publisher version

https://doi.org/10.1016/j.patcog.2023.110120
