Adaptive Visual Scene Understanding: Incremental Scene Graph Generation
- URL: http://arxiv.org/abs/2310.01636v2
- Date: Wed, 11 Oct 2023 02:02:48 GMT
- Title: Adaptive Visual Scene Understanding: Incremental Scene Graph Generation
- Authors: Naitik Khandelwal, Xiao Liu and Mengmi Zhang
- Abstract summary: Scene graph generation (SGG) involves analyzing images to extract meaningful information about objects and their relationships.
To address the lack of continual learning methodologies in SGG, we introduce the comprehensive Continual ScenE Graph Generation dataset.
- Score: 20.255178648494756
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Scene graph generation (SGG) involves analyzing images to extract meaningful
information about objects and their relationships. Given the dynamic nature of
the visual world, it becomes crucial for AI systems to detect new objects and
establish their new relationships with existing objects. To address the lack of
continual learning methodologies in SGG, we introduce the comprehensive
Continual ScenE Graph Generation (CSEGG) dataset along with 3 learning
scenarios and 8 evaluation metrics. Our research investigates the continual
learning performances of existing SGG methods on the retention of previous
object entities and relationships as they learn new ones. Moreover, we also
explore how continual object detection enhances generalization in classifying
known relationships on unknown objects. We conduct extensive experiments
benchmarking and analyzing the classical two-stage SGG methods and the most
recent transformer-based SGG methods in continual learning settings, and gain
valuable insights into the CSEGG problem. We invite the research community to
explore this emerging field of study.
Related papers
- Scene Graph Generation Strategy with Co-occurrence Knowledge and Learnable Term Frequency [3.351553095054309]
Scene graph generation (SGG) represents the relationships between objects in an image as a graph structure.
Previous studies have failed to reflect the co-occurrence of objects during SGG generation.
We propose CooK, which reflects the Co-occurrence Knowledge between objects, and the learnable term frequency-inverse document frequency.
arXiv Detail & Related papers (2024-05-21T09:56:48Z) - Towards Lifelong Scene Graph Generation with Knowledge-ware In-context
Prompt Learning [24.98058940030532]
Scene graph generation (SGG) endeavors to predict visual relationships between pairs of objects within an image.
This work seeks to address the pitfall inherent in a suite of prior relationship predictions.
Motivated by the achievements of in-context learning in pretrained language models, our approach imbues the model with the capability to predict relationships.
arXiv Detail & Related papers (2024-01-26T03:43:22Z) - Expanding Scene Graph Boundaries: Fully Open-vocabulary Scene Graph
Generation via Visual-Concept Alignment and Retention [74.42036028592705]
Scene Graph Generation (SGG) offers a structured representation critical in many computer vision applications.
We propose a unified framework named OvSGTR towards fully open vocabulary SGG from a holistic view.
For the more challenging settings of relation-involved open vocabulary SGG, the proposed approach integrates relation-aware pre-training utilizing image-caption data.
arXiv Detail & Related papers (2023-11-18T06:49:17Z) - Towards a Unified Transformer-based Framework for Scene Graph Generation
and Human-object Interaction Detection [116.21529970404653]
We introduce SG2HOI+, a unified one-step model based on the Transformer architecture.
Our approach employs two interactive hierarchical Transformers to seamlessly unify the tasks of SGG and HOI detection.
Our approach achieves competitive performance when compared to state-of-the-art HOI methods.
arXiv Detail & Related papers (2023-11-03T07:25:57Z) - Semantic Scene Graph Generation Based on an Edge Dual Scene Graph and
Message Passing Neural Network [3.9280441311534653]
Scene graph generation (SGG) captures the relationships between objects in an image and creates a structured graph-based representation.
Existing SGG methods have a limited ability to accurately predict detailed relationships.
A new approach to the modeling multiobject relationships, called edge dual scene graph generation (EdgeSGG), is proposed herein.
arXiv Detail & Related papers (2023-11-02T12:36:52Z) - Towards Open-vocabulary Scene Graph Generation with Prompt-based
Finetuning [84.39787427288525]
Scene graph generation (SGG) is a fundamental task aimed at detecting visual relations between objects in an image.
We introduce open-vocabulary scene graph generation, a novel, realistic and challenging setting in which a model is trained on a set of base object classes.
Our method can support inference over completely unseen object classes, which existing methods are incapable of handling.
arXiv Detail & Related papers (2022-08-17T09:05:38Z) - Recent Advances in Embedding Methods for Multi-Object Tracking: A Survey [71.10448142010422]
Multi-object tracking (MOT) aims to associate target objects across video frames in order to obtain entire moving trajectories.
Embedding methods play an essential role in object location estimation and temporal identity association in MOT.
We first conduct a comprehensive overview with in-depth analysis for embedding methods in MOT from seven different perspectives.
arXiv Detail & Related papers (2022-05-22T06:54:33Z) - Relation Regularized Scene Graph Generation [206.76762860019065]
Scene graph generation (SGG) is built on top of detected objects to predict object pairwise visual relations.
We propose a relation regularized network (R2-Net) which can predict whether there is a relationship between two objects.
Our R2-Net can effectively refine object labels and generate scene graphs.
arXiv Detail & Related papers (2022-02-22T11:36:49Z) - Scene Graph Generation: A Comprehensive Survey [35.80909746226258]
Scene graph has been the focus of research because of its powerful semantic representation and applications to scene understanding.
Scene Graph Generation (SGG) refers to the task of automatically mapping an image into a semantic structural scene graph.
We review 138 representative works that cover different input modalities, and systematically summarize existing methods of image-based SGG.
arXiv Detail & Related papers (2022-01-03T00:55:33Z) - Exploiting Scene Graphs for Human-Object Interaction Detection [81.49184987430333]
Human-Object Interaction (HOI) detection is a fundamental visual task aiming at localizing and recognizing interactions between humans and objects.
We propose a novel method to exploit this information, through the scene graph, for the Human-Object Interaction (SG2HOI) detection task.
Our method, SG2HOI, incorporates the SG information in two ways: (1) we embed a scene graph into a global context clue, serving as the scene-specific environmental context; and (2) we build a relation-aware message-passing module to gather relationships from objects' neighborhood and transfer them into interactions.
arXiv Detail & Related papers (2021-08-19T09:40:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.