Version 2 2023-12-12, 12:51
Version 1 2023-12-11, 17:19
journal contribution
posted on 2023-12-12, 12:51, authored by Daqi Liu, Miroslaw Bober, Josef Kittler
Scene graph generation aims to interpret an input image by explicitly modelling the objects contained therein and their relationships. In existing methods, the problem is predominantly solved by message passing neural network models. Unfortunately, in such models, the variational distributions generally ignore the structural dependencies among the output variables, and most of the scoring functions only consider pairwise dependencies. This can lead to inconsistent interpretations. In this article, we propose a novel neural belief propagation method that seeks to replace the traditional mean field approximation with a structural Bethe approximation. To find a better bias-variance trade-off, higher-order dependencies among three or more output variables are also incorporated into the relevant scoring function. The proposed method achieves state-of-the-art performance on various popular scene graph generation benchmarks.
Funding
U.K. Defence Science and Technology Laboratory
10.13039/501100000266-Engineering and Physical Sciences Research Council
10.13039/100014036-Multidisciplinary University Research Initiative (Grant Number: EP/R018456/1)
History
Author affiliation
School of Psychology and Vision Science, University of Leicester
Version
VoR (Version of Record)
Published in
IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume
45
Issue
8
Pagination
10161 - 10172
Publisher
Institute of Electrical and Electronics Engineers (IEEE)