posted on 2023-09-04, 15:20, authored by P. Shamsolmoali, M. Zareapoor, Huiyu Zhou, D. Tao, X. Li
<p>Deep generative models have demonstrated successful applications in learning non-linear data distributions through a number of latent variables, and these models use a non-linear function (generator) to map latent samples into the data space. On the other hand, the non-linearity of the generator implies that the latent space shows an unsatisfactory projection of the data space, which results in poor representation learning. This weak projection, however, can be addressed by a Riemannian metric, and we show that geodesics computation and accurate interpolations between data samples on the Riemannian manifold can substantially improve the performance of deep generative models. In this paper, a Variational spatial-Transformer AutoEncoder (VTAE) is proposed to minimize geodesics on a Riemannian manifold and improve representation learning. In particular, we carefully design the variational autoencoder with an encoded spatial-Transformer to explicitly expand the latent variable model to data on a Riemannian manifold, and obtain global context modelling. Moreover, to have smooth and plausible interpolations while traversing between two different objects' latent representations, we propose a geodesic interpolation network, different from the existing models that use linear interpolation with inferior performance. Experiments on benchmarks show that our proposed model can improve predictive accuracy and versatility over a range of computer vision tasks, including image interpolations and reconstructions.</p>
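The abstract contrasts the common practice of linear interpolation in latent space with interpolation along geodesics on the manifold. The paper's geodesic interpolation network is learned, so as a rough, hedged illustration of why the choice of path matters, the NumPy sketch below compares plain linear interpolation with spherical linear interpolation (slerp), a simple geodesic proxy on a hypersphere; slerp here is a stand-in, not the paper's actual method, and the latent dimension and vectors are arbitrary examples.

```python
import numpy as np

def lerp(z0, z1, t):
    """Linear interpolation between two latent vectors (the baseline scheme)."""
    return (1 - t) * z0 + t * z1

def slerp(z0, z1, t):
    """Spherical linear interpolation: follows a great-circle arc, a simple
    proxy for a geodesic when latent codes lie near a hypersphere."""
    u0 = z0 / np.linalg.norm(z0)
    u1 = z1 / np.linalg.norm(z1)
    omega = np.arccos(np.clip(np.dot(u0, u1), -1.0, 1.0))
    if np.isclose(omega, 0.0):
        # Nearly parallel vectors: fall back to linear interpolation.
        return lerp(z0, z1, t)
    return (np.sin((1 - t) * omega) * z0 + np.sin(t * omega) * z1) / np.sin(omega)

# Example: a 5-step path between two random 8-dimensional latent codes.
rng = np.random.default_rng(0)
z0, z1 = rng.standard_normal(8), rng.standard_normal(8)
path = [slerp(z0, z1, t) for t in np.linspace(0.0, 1.0, 5)]
```

Both schemes recover the endpoints at t = 0 and t = 1; the difference is the curve traced in between, which is where a metric-aware (geodesic) path can avoid low-density regions of the latent space that a straight line may cross.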
Author affiliation: School of Computing and Mathematical Sciences, University of Leicester