FDSG: Forecasting Dynamic Scene Graphs
- URL: http://arxiv.org/abs/2506.01487v2
- Date: Fri, 18 Jul 2025 13:05:43 GMT
- Title: FDSG: Forecasting Dynamic Scene Graphs
- Authors: Yi Yang, Yuren Cong, Hao Cheng, Bodo Rosenhahn, Michael Ying Yang
- Abstract summary: We propose a novel framework that predicts future entity labels, bounding boxes, and relationships for unobserved frames. A temporal aggregation module further refines predictions by integrating forecasted and observed information via cross-attention. Experiments on Action Genome show that FDSG outperforms state-of-the-art methods on dynamic scene graph generation, scene graph anticipation, and scene graph forecasting.
- Score: 41.18167591493808
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dynamic scene graph generation extends scene graph generation from images to videos by modeling entity relationships and their temporal evolution. However, existing methods either generate scene graphs from observed frames without explicitly modeling temporal dynamics, or predict only relationships while assuming static entity labels and locations. These limitations hinder effective extrapolation of both entity and relationship dynamics, restricting video scene understanding. We propose Forecasting Dynamic Scene Graphs (FDSG), a novel framework that predicts future entity labels, bounding boxes, and relationships for unobserved frames, while also generating scene graphs for observed frames. Our scene graph forecast module leverages query decomposition and neural stochastic differential equations to model entity and relationship dynamics. A temporal aggregation module further refines predictions by integrating forecasted and observed information via cross-attention. To benchmark FDSG, we introduce Scene Graph Forecasting, a new task for full future scene graph prediction. Experiments on Action Genome show that FDSG outperforms state-of-the-art methods on dynamic scene graph generation, scene graph anticipation, and scene graph forecasting. Code will be released upon publication.
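The abstract names two mechanisms: a forecast module that propagates decomposed entity/relationship queries forward in time with a neural stochastic differential equation, and a temporal aggregation module that refines forecasts by cross-attending to observed frames. Below is a minimal sketch of these two ideas in PyTorch, not the authors' released code: the module names, tensor shapes, and the Euler-Maruyama integrator are illustrative assumptions.

```python
# Illustrative sketch (assumptions, not the FDSG implementation):
# forecast scene-graph queries with a neural SDE, then fuse forecasts
# with observed-frame queries via cross-attention.
import torch
import torch.nn as nn

class QuerySDE(nn.Module):
    """Neural SDE over query embeddings: dz = f(z, t) dt + g(z, t) dW."""
    def __init__(self, dim: int):
        super().__init__()
        self.drift = nn.Sequential(nn.Linear(dim + 1, dim), nn.Tanh(), nn.Linear(dim, dim))
        self.diffusion = nn.Sequential(nn.Linear(dim + 1, dim), nn.Softplus())

    def step(self, z: torch.Tensor, t: float, dt: float) -> torch.Tensor:
        # One Euler-Maruyama step: z <- z + f(z,t)*dt + g(z,t)*sqrt(dt)*eps
        tt = torch.full_like(z[..., :1], t)
        zt = torch.cat([z, tt], dim=-1)
        noise = torch.randn_like(z) * dt ** 0.5
        return z + self.drift(zt) * dt + self.diffusion(zt) * noise

class TemporalAggregator(nn.Module):
    """Refine forecasted queries by cross-attending to observed-frame queries."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, forecast: torch.Tensor, observed: torch.Tensor) -> torch.Tensor:
        fused, _ = self.attn(query=forecast, key=observed, value=observed)
        return self.norm(forecast + fused)

# Usage: roll queries forward to an unobserved frame, then fuse with observations.
dim, n_queries = 256, 20
sde, agg = QuerySDE(dim), TemporalAggregator(dim)
z = torch.randn(1, n_queries, dim)          # queries from the last observed frame
for k in range(5):                          # integrate over the forecast horizon
    z = sde.step(z, t=k * 0.1, dt=0.1)
observed = torch.randn(1, n_queries, dim)   # encoded observed-frame queries
refined = agg(z, observed)                  # input to label/box/relation heads
```

The stochastic diffusion term lets the forecaster represent uncertainty over future entities and relations; averaging several sampled trajectories would yield a point estimate for evaluation.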
Related papers
- Towards Unbiased and Robust Spatio-Temporal Scene Graph Generation and Anticipation [10.678727237318503]
Real-world visual relationships often exhibit a long-tailed distribution, causing existing methods to produce biased scene graphs. We propose Impar, a novel training framework that leverages loss masking and curriculum learning to mitigate biased scene graph generation. Our curriculum-driven mask generation strategy further empowers the model to adaptively adjust its bias mitigation strategy over time, enabling more balanced and robust estimations.
arXiv Detail & Related papers (2024-11-20T06:15:28Z)
- From Pixels to Graphs: Open-Vocabulary Scene Graph Generation with Vision-Language Models [81.92098140232638]
Scene graph generation (SGG) aims to parse a visual scene into an intermediate graph representation for downstream reasoning tasks.
Existing methods struggle to generate scene graphs with novel visual relation concepts.
We introduce a new open-vocabulary SGG framework based on sequence generation.
arXiv Detail & Related papers (2024-04-01T04:21:01Z)
- Local-Global Information Interaction Debiasing for Dynamic Scene Graph Generation [51.92419880088668]
We propose a novel DynSGG model based on multi-task learning, DynSGG-MTL, which incorporates local interaction information and global human-action interaction information.
Long-term human actions supervise the model to generate multiple scene graphs that conform to global constraints and prevent it from failing to learn tail predicates.
arXiv Detail & Related papers (2023-08-10T01:24:25Z)
- Cross-Modality Time-Variant Relation Learning for Generating Dynamic Scene Graphs [16.760066844287046]
We propose a Time-variant Relation-aware TRansformer (TR$^2$) to model the temporal change of relations in dynamic scene graphs.
We show that TR$^2$ significantly outperforms previous state-of-the-art methods under two different settings.
arXiv Detail & Related papers (2023-05-15T10:30:38Z)
- Scene Graph Modification as Incremental Structure Expanding [61.84291817776118]
We focus on scene graph modification (SGM), where the system is required to learn how to update an existing scene graph based on a natural language query.
We frame SGM as a graph expansion task by introducing incremental structure expanding (ISE).
We construct a challenging dataset that contains more complicated queries and larger scene graphs than existing datasets.
arXiv Detail & Related papers (2022-09-15T16:26:14Z)
- Exploiting Long-Term Dependencies for Generating Dynamic Scene Graphs [15.614710220461353]
We show that capturing long-term dependencies is the key to effective generation of dynamic scene graphs.
Experimental results demonstrate that our Dynamic Scene Graph Detection Transformer (DSG-DETR) outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-12-18T03:02:11Z)
- Graph-to-3D: End-to-End Generation and Manipulation of 3D Scenes Using Scene Graphs [85.54212143154986]
Controllable scene synthesis consists of generating 3D information that satisfies underlying specifications.
Scene graphs are representations of a scene composed of objects (nodes) and inter-object relationships (edges).
We propose the first work that directly generates shapes from a scene graph in an end-to-end manner.
arXiv Detail & Related papers (2021-08-19T17:59:07Z)
- Unconditional Scene Graph Generation [72.53624470737712]
We develop a deep auto-regressive model called SceneGraphGen which can learn the probability distribution over labelled and directed graphs.
We show that the scene graphs generated by SceneGraphGen are diverse and follow the semantic patterns of real-world scenes.
arXiv Detail & Related papers (2021-08-12T17:57:16Z)