Neural Belief Propagation for Scene Graph Generation
Scene graph generation aims to interpret an input image by explicitly modelling the objects it contains and their relationships. Existing methods predominantly solve the problem with message-passing neural network models. Unfortunately, in such models the variational distributions generally ignore the structural dependencies among the output variables, and most scoring functions consider only pairwise dependencies, which can lead to inconsistent interpretations. In this article, we propose a novel neural belief propagation method that seeks to replace the traditional mean-field approximation with a structural Bethe approximation. To achieve a better bias-variance trade-off, higher-order dependencies among three or more output variables are also incorporated into the relevant scoring function. The proposed method achieves state-of-the-art performance on various popular scene graph generation benchmarks.
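To make the contrast concrete, the sketch below (a hypothetical toy example, not the paper's model) runs sum-product belief propagation on a small three-node chain MRF. On a tree the BP beliefs are exact marginals; applying the same message updates to a loopy graph yields the Bethe approximation that the abstract contrasts with mean-field inference. All potentials and variable names here are invented for illustration.

```python
import numpy as np

# Toy chain MRF x1 - x2 - x3 with binary states (illustrative values only).
phi = [np.array([0.7, 0.3]),   # unary potentials phi_i(x_i)
       np.array([0.4, 0.6]),
       np.array([0.5, 0.5])]
psi = np.array([[1.2, 0.4],    # shared pairwise potential psi(x_i, x_j)
                [0.4, 1.2]])

# Sum-product messages into x2 from its two neighbours.
m12 = psi.T @ phi[0]           # m12(x2) = sum_{x1} phi1(x1) psi(x1, x2)
m32 = psi.T @ phi[2]           # m32(x2) = sum_{x3} phi3(x3) psi(x3, x2)

belief2 = phi[1] * m12 * m32   # BP belief b2(x2) up to normalisation
belief2 /= belief2.sum()

# Brute-force marginal of x2 from the full joint, for comparison.
joint = np.einsum('a,b,c,ab,bc->abc', phi[0], phi[1], phi[2], psi, psi)
exact2 = joint.sum(axis=(0, 2))
exact2 /= exact2.sum()

print(np.allclose(belief2, exact2))  # True: BP is exact on a tree
```

On graphs with cycles the same updates are iterated to a fixed point, whose beliefs minimise the Bethe free energy rather than the mean-field free energy; this is the sense in which loopy BP retains pairwise structural dependencies that a fully factorised mean-field posterior discards.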
Funding
U.K. Defence Science and Technology Laboratory
Engineering and Physical Sciences Research Council (10.13039/501100000266)
Multidisciplinary University Research Initiative (10.13039/100014036; Grant Number: EP/R018456/1)
History
Author affiliation
School of Psychology and Vision Science, University of Leicester
Version
- VoR (Version of Record)