Unbiased Scene Graph Generation in Videos
- URL: http://arxiv.org/abs/2304.00733v3
- Date: Thu, 29 Jun 2023 23:52:24 GMT
- Title: Unbiased Scene Graph Generation in Videos
- Authors: Sayak Nag, Kyle Min, Subarna Tripathi, Amit K. Roy-Chowdhury
- Abstract summary: We introduce TEMPURA: TEmporal consistency and Memory-guided UnceRtainty Attenuation for unbiased dynamic SGG.
TEMPURA enforces object-level temporal consistency via transformer-based sequence modeling and learns to synthesize unbiased relationship representations.
Our method achieves significant performance gains (up to 10% in some cases) over existing methods.
- Score: 36.889659781604564
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The task of dynamic scene graph generation (SGG) from videos is complicated
and challenging due to the inherent dynamics of a scene, temporal fluctuation
of model predictions, and the long-tailed distribution of the visual
relationships in addition to the already existing challenges in image-based
SGG. Existing methods for dynamic SGG have primarily focused on capturing
spatio-temporal context using complex architectures without addressing the
challenges mentioned above, especially the long-tailed distribution of
relationships. This often leads to the generation of biased scene graphs. To
address these challenges, we introduce a new framework called TEMPURA: TEmporal
consistency and Memory Prototype guided UnceRtainty Attenuation for unbiased
dynamic SGG. TEMPURA employs object-level temporal consistencies via
transformer-based sequence modeling, learns to synthesize unbiased relationship
representations using memory-guided training, and attenuates the predictive
uncertainty of visual relations using a Gaussian Mixture Model (GMM). Extensive
experiments demonstrate that our method achieves significant performance gains
(up to 10% in some cases) over existing methods, highlighting its superiority
in generating more unbiased scene graphs.
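The abstract names GMM-based uncertainty attenuation only at a high level. As a minimal sketch of the underlying idea (not TEMPURA's actual implementation; the function name, shapes, and toy numbers are illustrative assumptions), a Gaussian-mixture prediction head's total predictive variance can be decomposed into aleatoric and epistemic parts via the law of total variance:

```python
import numpy as np

def gmm_predictive_uncertainty(pi, mu, sigma2):
    """Decompose the predictive uncertainty of a Gaussian mixture head.

    pi:     (K,)   mixture weights, summing to 1
    mu:     (K, D) component means (e.g. relationship scores)
    sigma2: (K, D) component variances

    Returns the mixture mean plus the aleatoric and epistemic parts
    of the total predictive variance (law of total variance).
    """
    pi = pi[:, None]                       # (K, 1) for broadcasting
    mean = (pi * mu).sum(axis=0)           # mixture mean, shape (D,)
    aleatoric = (pi * sigma2).sum(axis=0)  # expected within-component variance
    epistemic = (pi * (mu - mean) ** 2).sum(axis=0)  # spread of component means
    return mean, aleatoric, epistemic

# Toy example: K=2 mixture components over D=3 predicate scores.
pi = np.array([0.6, 0.4])
mu = np.array([[1.0, 0.0, -1.0],
               [0.5, 0.2, -0.5]])
sigma2 = np.array([[0.1, 0.2, 0.1],
                   [0.3, 0.1, 0.2]])
mean, aleatoric, epistemic = gmm_predictive_uncertainty(pi, mu, sigma2)
total = aleatoric + epistemic  # total predictive variance
```

Attenuating predictions by such a variance estimate (e.g. down-weighting the loss of high-uncertainty samples) is one standard way a GMM head can temper noisy relationship labels; the paper should be consulted for the exact formulation.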
Related papers
- State Space Models on Temporal Graphs: A First-Principles Study [30.531930200222423]
Research on deep graph learning has shifted from static graphs to temporal graphs in response to real-world complex systems that exhibit dynamic behaviors.
Sequence models such as RNNs or Transformers have long been the predominant backbone networks for modeling such temporal graphs.
We develop GraphSSM, a graph state space model for modeling the dynamics of temporal graphs.
arXiv Detail & Related papers (2024-06-03T02:56:11Z)
- Towards Lifelong Scene Graph Generation with Knowledge-aware In-context Prompt Learning [24.98058940030532]
Scene graph generation (SGG) endeavors to predict visual relationships between pairs of objects within an image.
This work seeks to address the pitfall inherent in a suite of prior relationship predictions.
Motivated by the achievements of in-context learning in pretrained language models, our approach imbues the model with the capability to predict relationships.
arXiv Detail & Related papers (2024-01-26T03:43:22Z)
- FloCoDe: Unbiased Dynamic Scene Graph Generation with Temporal Consistency and Correlation Debiasing [14.50214193838818]
FloCoDe: Flow-aware Temporal Consistency and Correlation Debiasing with uncertainty attenuation for unbiased dynamic scene graphs.
We propose correlation debiasing and a correlation-based loss to learn unbiased relation representations for long-tailed classes.
arXiv Detail & Related papers (2023-10-24T14:59:51Z)
- Local-Global Information Interaction Debiasing for Dynamic Scene Graph Generation [51.92419880088668]
We propose a novel DynSGG model based on multi-task learning, DynSGG-MTL, which introduces the local interaction information and global human-action interaction information.
Long-term human actions supervise the model to generate multiple scene graphs that conform to the global constraints, preventing it from failing to learn the tail predicates.
arXiv Detail & Related papers (2023-08-10T01:24:25Z)
- Transform-Equivariant Consistency Learning for Temporal Sentence Grounding [66.10949751429781]
We introduce a novel Equivariant Consistency Regulation Learning framework to learn more discriminative representations for each video.
Our motivation is that the temporal boundary of the query-guided activity should be predicted consistently.
In particular, we devise a self-supervised consistency loss module to enhance the completeness and smoothness of the augmented video.
arXiv Detail & Related papers (2023-05-06T19:29:28Z)
- EasyDGL: Encode, Train and Interpret for Continuous-time Dynamic Graph Learning [114.72818205974285]
This paper aims to design an easy-to-use pipeline (termed EasyDGL) composed of three key modules with both strong fitting ability and interpretability.
EasyDGL can effectively quantify the predictive power of the frequency content that a model learns from the evolving graph data.
arXiv Detail & Related papers (2023-03-22T06:35:08Z) - CAME: Context-aware Mixture-of-Experts for Unbiased Scene Graph
Generation [10.724516317292926]
We present a simple yet effective method called Context-Aware Mixture-of-Experts (CAME) to improve model diversity and alleviate the bias of the scene graph generator.
We conducted extensive experiments on three tasks on the Visual Genome dataset, showing that CAME achieves superior performance over previous methods.
arXiv Detail & Related papers (2022-08-15T10:39:55Z)
- Multivariate Time Series Forecasting with Dynamic Graph Neural ODEs [65.18780403244178]
We propose a continuous model to forecast Multivariate Time series with dynamic Graph neural Ordinary Differential Equations (MTGODE).
Specifically, we first abstract multivariate time series into dynamic graphs with time-evolving node features and unknown graph structures.
Then, we design and solve a neural ODE to complement missing graph topologies and unify both spatial and temporal message passing.
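As a toy illustration of the general idea of unifying spatial message passing with continuous-time integration (a generic explicit-Euler sketch under assumed shapes and a made-up weight, not the MTGODE model itself):

```python
import numpy as np

def graph_ode_step(x, adj, weight, dt):
    """One explicit-Euler step of a toy graph neural ODE:
    dx/dt = A_hat @ x @ W, where A_hat is the row-normalized adjacency.
    Spatial message passing (A_hat @ x) and temporal evolution (the
    Euler update) happen in a single continuous-time dynamic.
    """
    deg = adj.sum(axis=1, keepdims=True)
    a_hat = adj / np.maximum(deg, 1.0)    # row-normalized adjacency
    return x + dt * (a_hat @ x @ weight)  # Euler update

# Toy dynamic graph: 3 nodes, 2-dim features, self-loops included.
adj = np.array([[1., 1., 0.],
                [1., 1., 1.],
                [0., 1., 1.]])
x = np.eye(3, 2)          # initial node features
w = 0.1 * np.eye(2)       # tiny fixed "message" weight (learned in practice)
for _ in range(5):        # integrate t over [0, 0.5] in fixed steps
    x = graph_ode_step(x, adj, w, dt=0.1)
```

A real neural ODE would use an adaptive solver and learned weights; the fixed-step loop here only shows how one update combines both kinds of message passing.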
arXiv Detail & Related papers (2022-02-17T02:17:31Z)
- TCL: Transformer-based Dynamic Graph Modelling via Contrastive Learning [87.38675639186405]
We propose a novel graph neural network approach, called TCL, which deals with the dynamically-evolving graph in a continuous-time fashion.
To the best of our knowledge, this is the first attempt to apply contrastive learning to representation learning on dynamic graphs.
arXiv Detail & Related papers (2021-05-17T15:33:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.