Unbiased Scene Graph Generation in Videos
- URL: http://arxiv.org/abs/2304.00733v3
- Date: Thu, 29 Jun 2023 23:52:24 GMT
- Title: Unbiased Scene Graph Generation in Videos
- Authors: Sayak Nag, Kyle Min, Subarna Tripathi, Amit K. Roy-Chowdhury
- Abstract summary: We introduce TEMPURA: TEmporal consistency and Memory-guided UnceRtainty Attenuation for unbiased dynamic SGG.
TEMPURA enforces object-level temporal consistency via transformer-based sequence modeling and learns to synthesize unbiased relationship representations.
Our method achieves significant performance gains (up to 10% in some cases) over existing methods.
- Score: 36.889659781604564
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The task of dynamic scene graph generation (SGG) from videos is complicated
and challenging due to the inherent dynamics of a scene, temporal fluctuation
of model predictions, and the long-tailed distribution of the visual
relationships in addition to the already existing challenges in image-based
SGG. Existing methods for dynamic SGG have primarily focused on capturing
spatio-temporal context using complex architectures without addressing the
challenges mentioned above, especially the long-tailed distribution of
relationships. This often leads to the generation of biased scene graphs. To
address these challenges, we introduce a new framework called TEMPURA: TEmporal
consistency and Memory Prototype guided UnceRtainty Attenuation for unbiased
dynamic SGG. TEMPURA employs object-level temporal consistencies via
transformer-based sequence modeling, learns to synthesize unbiased relationship
representations using memory-guided training, and attenuates the predictive
uncertainty of visual relations using a Gaussian Mixture Model (GMM). Extensive
experiments demonstrate that our method achieves significant performance gains
(up to 10% in some cases) over existing methods, highlighting its superiority
in generating more unbiased scene graphs.
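The abstract names GMM-based uncertainty attenuation only at a high level. As a minimal sketch of the underlying idea (not TEMPURA's actual implementation; the function name, shapes, and toy numbers are illustrative assumptions), a Gaussian-mixture prediction head's total predictive variance can be decomposed into aleatoric and epistemic parts via the law of total variance:

```python
import numpy as np

def gmm_predictive_uncertainty(pi, mu, sigma2):
    """Decompose the predictive uncertainty of a Gaussian mixture head.

    pi:     (K,)   mixture weights, summing to 1
    mu:     (K, D) component means (e.g. relationship scores)
    sigma2: (K, D) component variances

    Returns the mixture mean plus the aleatoric and epistemic parts
    of the total predictive variance (law of total variance).
    """
    pi = pi[:, None]                       # (K, 1) for broadcasting
    mean = (pi * mu).sum(axis=0)           # mixture mean, shape (D,)
    aleatoric = (pi * sigma2).sum(axis=0)  # expected within-component variance
    epistemic = (pi * (mu - mean) ** 2).sum(axis=0)  # spread of component means
    return mean, aleatoric, epistemic

# Toy example: K=2 mixture components over D=3 predicate scores.
pi = np.array([0.6, 0.4])
mu = np.array([[1.0, 0.0, -1.0],
               [0.5, 0.2, -0.5]])
sigma2 = np.array([[0.1, 0.2, 0.1],
                   [0.3, 0.1, 0.2]])
mean, aleatoric, epistemic = gmm_predictive_uncertainty(pi, mu, sigma2)
total = aleatoric + epistemic  # total predictive variance
```

Attenuating predictions by such a variance estimate (e.g. down-weighting the loss of high-uncertainty samples) is one standard way a GMM head can temper noisy relationship labels; the paper should be consulted for the exact formulation.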
Related papers
- State Space Models on Temporal Graphs: A First-Principles Study [30.531930200222423]
Research on deep graph learning has shifted from static graphs to temporal graphs in response to real-world complex systems that exhibit dynamic behaviors.
Sequence models such as RNNs or Transformers have long been the predominant backbone networks for modeling such temporal graphs.
We develop GraphSSM, a graph state space model for modeling the dynamics of temporal graphs.
arXiv Detail & Related papers (2024-06-03T02:56:11Z)
- Towards Lifelong Scene Graph Generation with Knowledge-aware In-context Prompt Learning [24.98058940030532]
Scene graph generation (SGG) endeavors to predict visual relationships between pairs of objects within an image.
This work seeks to address the pitfall inherent in a suite of prior relationship predictions.
Motivated by the achievements of in-context learning in pretrained language models, our approach imbues the model with the capability to predict relationships.
arXiv Detail & Related papers (2024-01-26T03:43:22Z)
- FloCoDe: Unbiased Dynamic Scene Graph Generation with Temporal Consistency and Correlation Debiasing [14.50214193838818]
FloCoDe: Flow-aware Temporal Consistency and Correlation Debiasing with uncertainty attenuation for unbiased dynamic scene graphs.
We propose correlation debiasing and a correlation-based loss to learn unbiased relation representations for long-tailed classes.
arXiv Detail & Related papers (2023-10-24T14:59:51Z)
- Local-Global Information Interaction Debiasing for Dynamic Scene Graph Generation [51.92419880088668]
We propose a novel DynSGG model based on multi-task learning, DynSGG-MTL, which introduces the local interaction information and global human-action interaction information.
Long-term human actions supervise the model to generate multiple scene graphs that conform to the global constraints, preventing it from failing to learn the tail predicates.
arXiv Detail & Related papers (2023-08-10T01:24:25Z)
- Transform-Equivariant Consistency Learning for Temporal Sentence Grounding [66.10949751429781]
We introduce a novel Equivariant Consistency Regulation Learning framework to learn more discriminative representations for each video.
Our motivation is that the temporal boundary of the query-guided activity should be predicted consistently.
In particular, we devise a self-supervised consistency loss module to enhance the completeness and smoothness of the augmented video.
arXiv Detail & Related papers (2023-05-06T19:29:28Z)
- EasyDGL: Encode, Train and Interpret for Continuous-time Dynamic Graph Learning [114.72818205974285]
This paper aims to design an easy-to-use pipeline (termed EasyDGL) composed of three key modules with both strong fitting ability and interpretability.
EasyDGL can effectively quantify the predictive power of the frequency content that a model learns from the evolving graph data.
arXiv Detail & Related papers (2023-03-22T06:35:08Z) - CAME: Context-aware Mixture-of-Experts for Unbiased Scene Graph
Generation [10.724516317292926]
We present a simple yet effective method called Context-Aware Mixture-of-Experts (CAME) to improve model diversity and alleviate the bias of the scene graph generator.
We conducted extensive experiments on three tasks on the Visual Genome dataset, showing that CAME achieves superior performance over previous methods.
arXiv Detail & Related papers (2022-08-15T10:39:55Z)
- Multivariate Time Series Forecasting with Dynamic Graph Neural ODEs [65.18780403244178]
We propose a continuous model to forecast Multivariate Time series with dynamic Graph neural Ordinary Differential Equations (MTGODE).
Specifically, we first abstract multivariate time series into dynamic graphs with time-evolving node features and unknown graph structures.
Then, we design and solve a neural ODE to complement missing graph topologies and unify both spatial and temporal message passing.
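As a toy illustration of the general idea of unifying spatial message passing with continuous-time integration (a generic explicit-Euler sketch under assumed shapes and a made-up weight, not the MTGODE model itself):

```python
import numpy as np

def graph_ode_step(x, adj, weight, dt):
    """One explicit-Euler step of a toy graph neural ODE:
    dx/dt = A_hat @ x @ W, where A_hat is the row-normalized adjacency.
    Spatial message passing (A_hat @ x) and temporal evolution (the
    Euler update) happen in a single continuous-time dynamic.
    """
    deg = adj.sum(axis=1, keepdims=True)
    a_hat = adj / np.maximum(deg, 1.0)    # row-normalized adjacency
    return x + dt * (a_hat @ x @ weight)  # Euler update

# Toy dynamic graph: 3 nodes, 2-dim features, self-loops included.
adj = np.array([[1., 1., 0.],
                [1., 1., 1.],
                [0., 1., 1.]])
x = np.eye(3, 2)          # initial node features
w = 0.1 * np.eye(2)       # tiny fixed "message" weight (learned in practice)
for _ in range(5):        # integrate t over [0, 0.5] in fixed steps
    x = graph_ode_step(x, adj, w, dt=0.1)
```

A real neural ODE would use an adaptive solver and learned weights; the fixed-step loop here only shows how one update combines both kinds of message passing.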
arXiv Detail & Related papers (2022-02-17T02:17:31Z)
- TCL: Transformer-based Dynamic Graph Modelling via Contrastive Learning [87.38675639186405]
We propose a novel graph neural network approach, called TCL, which deals with the dynamically-evolving graph in a continuous-time fashion.
To the best of our knowledge, this is the first attempt to apply contrastive learning to representation learning on dynamic graphs.
arXiv Detail & Related papers (2021-05-17T15:33:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.