Probabilistic Modeling of Semantic Ambiguity for Scene Graph Generation
- URL: http://arxiv.org/abs/2103.05271v2
- Date: Wed, 10 Mar 2021 05:20:48 GMT
- Title: Probabilistic Modeling of Semantic Ambiguity for Scene Graph Generation
- Authors: Gengcong Yang, Jingyi Zhang, Yong Zhang, Baoyuan Wu, Yujiu Yang
- Abstract summary: We argue that visual relationships are often semantically ambiguous.
The ambiguity naturally leads to the issue of implicit multi-label, motivating the need for diverse predictions.
In this work, we propose a novel plug-and-play Probabilistic Uncertainty Modeling (PUM) module.
- Score: 38.30703975408238
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To generate "accurate" scene graphs, almost all existing methods predict
pairwise relationships in a deterministic manner. However, we argue that visual
relationships are often semantically ambiguous. Specifically, inspired by
linguistic knowledge, we classify the ambiguity into three types: Synonymy
Ambiguity, Hyponymy Ambiguity, and Multi-view Ambiguity. The ambiguity
naturally leads to the issue of implicit multi-label, motivating the
need for diverse predictions. In this work, we propose a novel plug-and-play
Probabilistic Uncertainty Modeling (PUM) module. It models each union region as
a Gaussian distribution, whose variance measures the uncertainty of the
corresponding visual content. Compared to the conventional deterministic
methods, such uncertainty modeling brings stochasticity of feature
representation, which naturally enables diverse predictions. As a byproduct,
PUM also manages to cover more fine-grained relationships and thus alleviates
the issue of bias towards frequent relationships. Extensive experiments on the
large-scale Visual Genome benchmark show that combining PUM with newly proposed
ResCAGCN can achieve state-of-the-art performances, especially under the mean
recall metric. Furthermore, we prove the universal effectiveness of PUM by
plugging it into some existing models and provide insightful analysis of its
ability to generate diverse yet plausible visual relationships.
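
The abstract's central idea, representing a union region as a Gaussian whose variance encodes uncertainty and then sampling from it to obtain diverse predictions, can be illustrated with a small sketch. This is not the authors' PUM implementation: the log-variance parameterization, the linear classifier `weights`, and all numeric values are hypothetical and chosen only for illustration.

```python
import math
import random


def sample_region_embeddings(mu, log_var, num_samples, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I), one list per sample."""
    sigma = [math.exp(0.5 * lv) for lv in log_var]
    return [
        [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]
        for _ in range(num_samples)
    ]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_predicates(mu, log_var, weights, num_samples=5, seed=0):
    """Classify each sampled embedding with a (hypothetical) linear layer."""
    rng = random.Random(seed)
    predictions = []
    for z in sample_region_embeddings(mu, log_var, num_samples, rng):
        logits = [sum(w * x for w, x in zip(row, z)) for row in weights]
        probs = softmax(logits)
        predictions.append(max(range(len(probs)), key=probs.__getitem__))
    return predictions


# Hypothetical toy setup: a 2-D embedding and three predicate classes.
mu = [0.2, -0.1]
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]

# Near-zero variance: every sample stays close to the mean.
confident = predict_predicates(mu, [-20.0, -20.0], weights)

# Large variance: samples spread out, so the predicted class varies.
ambiguous = predict_predicates(mu, [2.0, 2.0], weights, num_samples=50)
```

With a very small variance the samples collapse onto the mean and the model behaves like a deterministic predictor. A larger variance spreads the samples across decision boundaries, which is the mechanism the abstract credits for diverse predictions.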
Related papers
- CONTESTS: a Framework for Consistency Testing of Span Probabilities in Language Models [16.436592723426305]
It is unclear whether language models produce the same value for different ways of assigning joint probabilities to word spans.
Our work introduces a novel framework, ConTestS, involving statistical tests to assess score consistency across interchangeable completion and conditioning orders.
arXiv Detail & Related papers (2024-09-30T06:24:43Z)
- Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval [139.21955930418815]
Cross-modal Retrieval methods build similarity relations between vision and language modalities by jointly learning a common representation space.
However, the predictions are often unreliable due to aleatoric uncertainty, which is induced by low-quality data, e.g., corrupt images, fast-paced videos, and non-detailed texts.
We propose a novel Prototype-based Aleatoric Uncertainty Quantification (PAU) framework to provide trustworthy predictions by quantifying the uncertainty arising from the inherent data ambiguity.
arXiv Detail & Related papers (2023-09-29T09:41:19Z)
- Uncertainty-Aware Pedestrian Trajectory Prediction via Distributional Diffusion [26.715578412088327]
We present a model-agnostic uncertainty-aware pedestrian trajectory prediction framework.
Unlike previous studies, we translate the predictive stochasticity to explicit distributions, allowing it to generate plausible future trajectories.
Our framework is compatible with different neural net architectures.
arXiv Detail & Related papers (2023-03-15T04:58:43Z)
- Bayesian Networks for the robust and unbiased prediction of depression and its symptoms utilizing speech and multimodal data [65.28160163774274]
We apply a Bayesian framework to capture the relationships between depression, depression symptoms, and features derived from speech, facial expression and cognitive game data collected at thymia.
arXiv Detail & Related papers (2022-11-09T14:48:13Z)
- Multivariate Probabilistic Regression with Natural Gradient Boosting [63.58097881421937]
We propose a Natural Gradient Boosting (NGBoost) approach based on nonparametrically modeling the conditional parameters of the multivariate predictive distribution.
Our method is robust, works out-of-the-box without extensive tuning, is modular with respect to the assumed target distribution, and performs competitively in comparison to existing approaches.
arXiv Detail & Related papers (2021-06-07T17:44:49Z)
- Learning Disentangled Representations with Latent Variation Predictability [102.4163768995288]
This paper defines the variation predictability of latent disentangled representations.
Within an adversarial generation process, we encourage variation predictability by maximizing the mutual information between latent variations and corresponding image pairs.
We develop an evaluation metric that does not rely on the ground-truth generative factors to measure the disentanglement of latent representations.
arXiv Detail & Related papers (2020-07-25T08:54:26Z)
- Ambiguity in Sequential Data: Predicting Uncertain Futures with Recurrent Models [110.82452096672182]
We propose an extension of the Multiple Hypothesis Prediction (MHP) model to handle ambiguous predictions with sequential data.
We also introduce a novel metric for ambiguous problems, which is better suited to account for uncertainties.
arXiv Detail & Related papers (2020-03-10T09:15:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.