Graph Edit Distance Reward: Learning to Edit Scene Graph
- URL: http://arxiv.org/abs/2008.06651v1
- Date: Sat, 15 Aug 2020 04:52:16 GMT
- Title: Graph Edit Distance Reward: Learning to Edit Scene Graph
- Authors: Lichang Chen, Guosheng Lin, Shijie Wang, Qingyao Wu
- Abstract summary: We propose a new method to edit the scene graph according to the user instructions, which has never been explored.
Specifically, to learn to edit scene graphs according to the semantics given by text, we propose a Graph Edit Distance Reward.
In the context of text-editing image retrieval, we validate the effectiveness of our method on the CSS and CRIR datasets.
- Score: 69.39048809061714
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Scene graphs, a vital tool for bridging the gap between the language domain and
the image domain, have been widely adopted in cross-modality tasks such as VQA. In
this paper, we propose a new method to edit the scene graph according to
user instructions, a problem that has not been explored before. Specifically, to
learn to edit scene graphs according to the semantics given by text, we propose a Graph
Edit Distance Reward, which is based on Policy Gradient and a Graph Matching
algorithm, to optimize a neural symbolic model. In the context of text-editing
image retrieval, we validate the effectiveness of our method on the CSS and CRIR
datasets. CRIR is a new synthetic dataset that we generated and
will publish soon for future use.
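The abstract does not spell out how the reward is computed, so the following is only a minimal sketch of the general idea: score an edited scene graph by its negative graph edit distance to a target graph, then feed that score into a REINFORCE-style surrogate loss. The unit edit costs, the brute-force node alignment (standing in for a graph matching algorithm), and the names `graph_edit_distance`, `ged_reward`, and `reinforce_loss` are all assumptions made for illustration, not the authors' implementation.

```python
import math
from itertools import permutations


def graph_edit_distance(nodes_a, edges_a, nodes_b, edges_b):
    """Unit-cost edit distance between two tiny labeled directed graphs.

    nodes_*: list of node labels (e.g. "red cube").
    edges_*: dict mapping (src_index, dst_index) to a relation label.
    Assumption: every edit (insert, delete, relabel) costs 1.
    Brute force over node alignments, so only usable for very small graphs.
    """
    n = max(len(nodes_a), len(nodes_b))
    # Pad the smaller graph with empty slots (None) so both have n slots.
    pad_a = list(nodes_a) + [None] * (n - len(nodes_a))
    pad_b = list(nodes_b) + [None] * (n - len(nodes_b))
    best = float("inf")
    for perm in permutations(range(n)):
        # perm[i] is the slot of graph B aligned with slot i of graph A.
        cost = sum(1 for i in range(n) if pad_a[i] != pad_b[perm[i]])
        # Carry A's edges over to B's indexing, then count mismatches.
        mapped = {(perm[s], perm[d]): rel for (s, d), rel in edges_a.items()}
        for key in set(mapped) | set(edges_b):
            if mapped.get(key) != edges_b.get(key):
                cost += 1
        best = min(best, cost)
    return best


def ged_reward(pred_nodes, pred_edges, target_nodes, target_edges):
    """Hypothetical reward: higher (closer to 0) when the edit is closer to the target."""
    return -graph_edit_distance(pred_nodes, pred_edges, target_nodes, target_edges)


def reinforce_loss(log_probs, reward, baseline=0.0):
    """REINFORCE surrogate loss for one sampled sequence of edit actions."""
    return -(reward - baseline) * sum(log_probs)


# Usage: the instruction asked for a green sphere, but the model kept it blue.
target = (["red cube", "green sphere"], {(0, 1): "left of"})
predicted = (["red cube", "blue sphere"], {(0, 1): "left of"})
reward = ged_reward(*predicted, *target)
loss = reinforce_loss([math.log(0.5), math.log(0.25)], reward)
```

Because the distance is not differentiable, it enters training only as a scalar weight on the log-probabilities of the sampled edit actions, which is the usual reason for pairing such a metric with a policy gradient.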
Related papers
- From Pixels to Graphs: Open-Vocabulary Scene Graph Generation with Vision-Language Models [81.92098140232638]
Scene graph generation (SGG) aims to parse a visual scene into an intermediate graph representation for downstream reasoning tasks.
Existing methods struggle to generate scene graphs with novel visual relation concepts.
We introduce a new open-vocabulary SGG framework based on sequence generation.
arXiv Detail & Related papers (2024-04-01T04:21:01Z)
- EPIC: Graph Augmentation with Edit Path Interpolation via Learnable Cost [12.191001329584502]
We propose EPIC (Edit Path Interpolation via learnable Cost), a novel interpolation-based method for augmenting graph datasets.
To interpolate between two graphs lying in an irregular domain, EPIC builds an edit path that represents the transformation process between two graphs via edit operations.
Our approach outperforms existing augmentation techniques in many tasks.
arXiv Detail & Related papers (2023-06-02T07:19:07Z)
- iEdit: Localised Text-guided Image Editing with Weak Supervision [53.082196061014734]
We propose a novel learning method for text-guided image editing.
It generates images conditioned on a source image and a textual edit prompt.
It shows favourable results against its counterparts in terms of image fidelity, CLIP alignment score and qualitatively for editing both generated and real images.
arXiv Detail & Related papers (2023-05-10T07:39:14Z)
- Diffusion-Based Scene Graph to Image Generation with Masked Contrastive Pre-Training [112.94542676251133]
We propose to learn scene graph embeddings by directly optimizing their alignment with images.
Specifically, we pre-train an encoder to extract both global and local information from scene graphs.
The resulting method, called SGDiff, allows for the semantic manipulation of generated images by modifying scene graph nodes and connections.
arXiv Detail & Related papers (2022-11-21T01:11:19Z)
- Scene Graph Modification as Incremental Structure Expanding [61.84291817776118]
We focus on scene graph modification (SGM), where the system is required to learn how to update an existing scene graph based on a natural language query.
We frame SGM as a graph expansion task by introducing incremental structure expanding (ISE).
We construct a challenging dataset that contains more complicated queries and larger scene graphs than existing datasets.
arXiv Detail & Related papers (2022-09-15T16:26:14Z)
- Learning to Generate Scene Graph from Natural Language Supervision [52.18175340725455]
We propose one of the first methods that learn from image-sentence pairs to extract a graphical representation of localized objects and their relationships within an image, known as scene graph.
We leverage an off-the-shelf object detector to identify and localize object instances, match labels of detected regions to concepts parsed from captions, and thus create "pseudo" labels for learning scene graph.
arXiv Detail & Related papers (2021-09-06T03:38:52Z)
- A Neural Edge-Editing Approach for Document-Level Relation Graph Extraction [9.449257113935461]
We treat relations in a document as a relation graph among entities.
The relation graph is iteratively constructed by editing edges of an initial graph.
The way to edit edges is to classify them in a close-first manner.
arXiv Detail & Related papers (2021-06-18T03:46:49Z)
- RTIC: Residual Learning for Text and Image Composition using Graph Convolutional Network [19.017377597937617]
We study the compositional learning of images and texts for image retrieval.
We introduce a novel method that combines the graph convolutional network (GCN) with existing composition methods.
arXiv Detail & Related papers (2021-04-07T09:41:52Z)
- Learning Graph Edit Distance by Graph Neural Networks [3.002973807612758]
We propose a new framework able to combine the advances on deep metric learning with traditional approximations of the graph edit distance.
Our method employs a message passing neural network to capture the graph structure and leverages this information in the distance computation.
arXiv Detail & Related papers (2020-08-17T21:49:59Z)