University of Leicester

VTAE: Variational transformer autoencoder with manifolds learning

journal contribution
posted on 2023-09-04, 15:20 authored by P Shamsolmoali, M Zareapoor, Huiyu Zhou, D Tao, X Li

Deep generative models have demonstrated successful applications in learning non-linear data distributions through a number of latent variables and these models use a non-linear function (generator) to map latent samples into the data space. On the other hand, the non-linearity of the generator implies that the latent space shows an unsatisfactory projection of the data space, which results in poor representation learning. This weak projection, however, can be addressed by a Riemannian metric, and we show that geodesics computation and accurate interpolations between data samples on the Riemannian manifold can substantially improve the performance of deep generative models. In this paper, a Variational spatial-Transformer AutoEncoder (VTAE) is proposed to minimize geodesics on a Riemannian manifold and improve representation learning. In particular, we carefully design the variational autoencoder with an encoded spatial-Transformer to explicitly expand the latent variable model to data on a Riemannian manifold, and obtain global context modelling. Moreover, to have smooth and plausible interpolations while traversing between two different objects' latent representations, we propose a geodesic interpolation network different from the existing models that use linear interpolation with inferior performance. Experiments on benchmarks show that our proposed model can improve predictive accuracy and versatility over a range of computer vision tasks, including image interpolations, and reconstructions.
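
The abstract's contrast between linear and geodesic interpolation can be illustrated with a toy sketch. This is not the VTAE model or its geodesic interpolation network: `toy_generator`, `flat_generator`, `linear_path` and `bent_path` are hypothetical names introduced here, and the "generator" is a hand-written function rather than a trained decoder. The sketch only shows that, once a latent curve is measured by its length in data space, a straight line in latent space need not be the shortest path.

```python
import math


def toy_generator(z):
    """Hypothetical non-linear generator: 2-D latent point -> 3-D data point."""
    x, y = z
    bump = 3.0 * math.exp(-(x * x + y * y))  # a "hill" centred at the latent origin
    return (x, y, bump)


def flat_generator(z):
    """Hypothetical linear generator, used as a sanity check."""
    return (z[0], z[1], 0.0)


def curve_length(latent_points, generator=toy_generator):
    """Length of a discretised latent curve, measured in data space."""
    data_points = [generator(z) for z in latent_points]
    return sum(math.dist(a, b) for a, b in zip(data_points, data_points[1:]))


def bent_path(z0, z1, offset, steps=200):
    """Straight line from z0 to z1 plus a sideways bend that vanishes at both ends."""
    path = []
    for i in range(steps):
        t = i / (steps - 1)
        s = math.sin(math.pi * t)
        path.append(((1 - t) * z0[0] + t * z1[0] + s * offset[0],
                     (1 - t) * z0[1] + t * z1[1] + s * offset[1]))
    return path


def linear_path(z0, z1, steps=200):
    """Plain linear interpolation in latent space (a bend of zero)."""
    return bent_path(z0, z1, (0.0, 0.0), steps)


z0, z1 = (-2.0, 0.0), (2.0, 0.0)
straight = curve_length(linear_path(z0, z1))            # climbs over the hill
detour = curve_length(bent_path(z0, z1, (0.0, 1.5)))    # walks around the hill
print(f"linear interpolation: {straight:.3f}, bent path: {detour:.3f}")
```

In this toy setting the bent path is shorter in data space than the linear one, because the straight latent line is forced over the hill. A geodesic would be the curve that minimizes this length; finding it (for example by optimizing the intermediate points) is left out of the sketch.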

History

Author affiliation

School of Computing and Mathematical Sciences, University of Leicester

Version

  • AM (Accepted Manuscript)

Published in

IEEE Transactions on Image Processing

Volume

32

Pagination

4486 - 4500

Publisher

Institute of Electrical and Electronics Engineers

issn

1941-0042

Copyright date

2023

Available date

2023-09-04

Language

en
