Scene Graph Modification Based on Natural Language Commands
- URL: http://arxiv.org/abs/2010.02591v1
- Date: Tue, 6 Oct 2020 10:01:19 GMT
- Title: Scene Graph Modification Based on Natural Language Commands
- Authors: Xuanli He, Quan Hung Tran, Gholamreza Haffari, Walter Chang, Trung
Bui, Zhe Lin, Franck Dernoncourt, Nhan Dam
- Abstract summary: Structured representations like graphs and parse trees play a crucial role in many Natural Language Processing systems.
In this paper, we explore the novel problem of graph modification, where systems need to learn how to update an existing graph given a new command from the user.
- Score: 90.0662899539489
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Structured representations like graphs and parse trees play a crucial role in
many Natural Language Processing systems. In recent years, advances in
multi-turn user interfaces have created the need to control and update
these structured representations given new sources of information. Although
there have been many efforts focusing on improving the performance of the
parsers that map text to graphs or parse trees, very few have explored the
problem of directly manipulating these representations. In this paper, we
explore the novel problem of graph modification, where a system must learn to
update an existing scene graph given a new command from the user. Our novel
models, based on a graph-based sparse transformer and cross-attention
information fusion, outperform previous systems adapted from the machine
translation and graph generation literature. We further contribute our large
graph modification datasets to the research community to encourage future
research on this new problem.
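The best-performing models pair a graph encoder with a text encoder and fuse the two streams through cross attention. Below is a minimal PyTorch sketch of such a fusion step; the module name, dimensions, and residual wiring are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    """Fuse graph-node encodings with command-token encodings via cross attention.

    Hypothetical sketch: the paper's full model (sparse graph transformer,
    decoder, etc.) is richer; only the information-fusion idea is shown.
    """

    def __init__(self, d_model: int = 256, n_heads: int = 4):
        super().__init__()
        # Each graph node attends over the tokens of the user's command.
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, node_states: torch.Tensor, text_states: torch.Tensor) -> torch.Tensor:
        # node_states: (batch, n_nodes, d_model) from the graph encoder
        # text_states: (batch, n_tokens, d_model) from the command encoder
        fused, _ = self.attn(node_states, text_states, text_states)
        return self.norm(node_states + fused)  # residual keeps the original node signal

# Toy usage: a 5-node scene graph and a 7-token command.
fusion = CrossAttentionFusion()
nodes, tokens = torch.randn(1, 5, 256), torch.randn(1, 7, 256)
print(fusion(nodes, tokens).shape)  # torch.Size([1, 5, 256])
```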
Related papers
- When Graph Data Meets Multimodal: A New Paradigm for Graph Understanding and Reasoning [54.84870836443311]
The paper presents a new paradigm for understanding and reasoning about graph data by integrating image encoding and multimodal technologies.
This approach enables the comprehension of graph data through an instruction-response format, utilizing GPT-4V's advanced capabilities.
The study evaluates this paradigm on various graph types, highlighting the model's strengths and weaknesses, particularly in Chinese OCR performance and complex reasoning tasks.
arXiv Detail & Related papers (2023-12-16T08:14:11Z)
- Deep Prompt Tuning for Graph Transformers [55.2480439325792]
Fine-tuning is resource-intensive and requires storing multiple copies of large models.
We propose a novel approach called deep graph prompt tuning as an alternative to fine-tuning.
By freezing the pre-trained parameters and only updating the added tokens, our approach reduces the number of free parameters and eliminates the need for multiple model copies.
arXiv Detail & Related papers (2023-09-18T20:12:17Z)
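The mechanism above, freezing the pre-trained weights and training only a small set of added tokens, is easy to illustrate. A hedged, generic sketch of prompt tuning follows, with a stand-in backbone and invented sizes rather than the paper's graph-transformer code.

```python
import torch
import torch.nn as nn

class PromptTunedEncoder(nn.Module):
    """Prepend trainable prompt tokens to the input of a frozen encoder."""

    def __init__(self, backbone: nn.Module, d_model: int, n_prompts: int = 10):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad = False  # frozen: no per-task copy of the big model
        # The added prompt tokens are the only free parameters.
        self.prompts = nn.Parameter(torch.randn(n_prompts, d_model) * 0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model), already-embedded inputs
        prompts = self.prompts.unsqueeze(0).expand(x.size(0), -1, -1)
        return self.backbone(torch.cat([prompts, x], dim=1))

# Toy usage with a stand-in backbone.
backbone = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(64, 4, batch_first=True), num_layers=2)
model = PromptTunedEncoder(backbone, d_model=64)
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(trainable)  # 640: just the 10 x 64 prompt tokens are updated
```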
- Scene Graph Modification as Incremental Structure Expanding [61.84291817776118]
We focus on scene graph modification (SGM), where the system must learn to update an existing scene graph based on a natural language query.
We frame SGM as a graph expansion task by introducing incremental structure expanding (ISE).
We construct a challenging dataset that contains more complicated queries and larger scene graphs than existing datasets.
arXiv Detail & Related papers (2022-09-15T16:26:14Z)
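ISE treats modification as growing the source graph step by step rather than regenerating it from scratch. The non-learned sketch below shows that expansion framing; in the actual model each action is predicted from the query and the partial graph, and this action vocabulary is hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class SceneGraph:
    nodes: list = field(default_factory=list)  # e.g. ["boy", "shirt"]
    edges: list = field(default_factory=list)  # (head_idx, label, tail_idx) triples

def expand(graph: SceneGraph, actions: list) -> SceneGraph:
    """Apply a sequence of expansion actions to a scene graph."""
    for action in actions:
        if action[0] == "ADD_NODE":
            graph.nodes.append(action[1])
        elif action[0] == "ADD_EDGE":
            _, head, label, tail = action
            graph.edges.append((head, label, tail))
    return graph

# A query like "the boy is wearing a blue shirt" expands the graph ("boy",)
# with two new nodes and two new edges instead of rebuilding it.
g = SceneGraph(nodes=["boy"])
g = expand(g, [("ADD_NODE", "shirt"), ("ADD_NODE", "blue"),
               ("ADD_EDGE", 0, "wearing", 1), ("ADD_EDGE", 1, "attribute", 2)])
print(g.nodes, g.edges)
```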
- GAP: A Graph-aware Language Model Framework for Knowledge Graph-to-Text Generation [3.593955557310285]
Recent improvements in KG-to-text generation are due to auxiliary pre-training tasks designed to give the fine-tuning task a boost in performance.
Here, we demonstrate that by fusing graph-aware elements into existing pre-trained language models, we are able to outperform state-of-the-art models and close the gap imposed by additional pre-training tasks.
arXiv Detail & Related papers (2022-04-13T23:53:37Z)
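One common "graph-aware element" that can be fused into a pre-trained encoder without extra pre-training is an attention mask restricting self-attention to graph neighbours. The sketch below builds such a mask; it illustrates the general idea and is not claimed to be GAP's exact formulation.

```python
import torch

def graph_attention_mask(n_items: int, edges: list) -> torch.Tensor:
    """Boolean mask where True marks pairs *blocked* from attending.

    Entities attend to themselves and their graph neighbours only; the
    mask can be passed to attention modules that accept a bool attn_mask.
    """
    blocked = torch.ones(n_items, n_items, dtype=torch.bool)
    blocked.fill_diagonal_(False)          # self-attention stays allowed
    for head, tail in edges:
        blocked[head, tail] = False        # neighbours may attend
        blocked[tail, head] = False
    return blocked

# KG entities 0:"Paris", 1:"France", 2:"Euro"; edges 0-1 and 1-2.
mask = graph_attention_mask(3, [(0, 1), (1, 2)])
print(mask[0, 2])  # tensor(True): no direct attention between Paris and Euro
```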
- Stage-wise Fine-tuning for Graph-to-Text Generation [25.379346921398326]
Graph-to-text generation has benefited from pre-trained language models (PLMs), achieving better performance than structured graph encoders.
We propose a structured graph-to-text model with a two-step fine-tuning mechanism that first fine-tunes the model on Wikipedia before adapting it to graph-to-text generation.
arXiv Detail & Related papers (2021-05-17T17:15:29Z)
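The two-step mechanism amounts to two fine-tuning passes in sequence over different data. A hedged sketch follows, assuming a Hugging Face-style model whose forward call returns a loss; the loader names and hyperparameters are invented.

```python
import torch

def fine_tune(model, dataloader, epochs: int, lr: float) -> None:
    """One generic fine-tuning pass (the batch format is task-specific)."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for batch in dataloader:
            loss = model(**batch).loss  # assumes HF-style model outputs
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

# Stage 1: adapt the PLM to target-domain text (e.g. Wikipedia sentences).
# fine_tune(model, wikipedia_loader, epochs=1, lr=5e-5)
# Stage 2: adapt the same model to graph-to-text generation.
# fine_tune(model, graph_to_text_loader, epochs=5, lr=3e-5)
```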
- GraphFormers: GNN-nested Transformers for Representation Learning on Textual Graph [53.70520466556453]
We propose GraphFormers, where layerwise GNN components are nested alongside the transformer blocks of language models.
With the proposed architecture, the text encoding and the graph aggregation are fused into an iterative workflow.
In addition, a progressive learning strategy is introduced, where the model is successively trained on manipulated and original data to reinforce its ability to integrate information from the graph.
arXiv Detail & Related papers (2021-05-06T12:20:41Z)
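The nesting idea: between consecutive transformer blocks, each node's token states exchange information with neighbouring nodes through a GNN-style aggregation. A minimal single-layer sketch is below; the mean-over-neighbours aggregation and the prepended graph token are simplifying assumptions, not GraphFormers' exact design.

```python
import torch
import torch.nn as nn

class GraphFormerLayer(nn.Module):
    """One nested layer: graph aggregation feeding a transformer block (sketch)."""

    def __init__(self, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        self.text_layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.graph_proj = nn.Linear(d_model, d_model)

    def forward(self, token_states: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # token_states: (n_nodes, seq_len, d_model); adj: row-normalized (n_nodes, n_nodes)
        node_vecs = token_states[:, 0]                    # first token summarizes each node
        neighbor_vecs = self.graph_proj(adj @ node_vecs)  # mean-style neighbour aggregation
        # Prepend the aggregated graph vector so text encoding sees the
        # neighbourhood, then drop it again after the transformer block.
        augmented = torch.cat([neighbor_vecs.unsqueeze(1), token_states], dim=1)
        return self.text_layer(augmented)[:, 1:]

# Toy textual graph: 3 nodes with 5 tokens each, connected in a chain 0-1-2.
adj = torch.tensor([[0.0, 1.0, 0.0], [0.5, 0.0, 0.5], [0.0, 1.0, 0.0]])
out = GraphFormerLayer()(torch.randn(3, 5, 64), adj)
print(out.shape)  # torch.Size([3, 5, 64])
```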
- Promoting Graph Awareness in Linearized Graph-to-Text Generation [72.83863719868364]
We study the ability of linearized models to encode local graph structures.
Our findings motivate solutions that enrich the quality of models' implicit graph encodings via graph denoising scaffolds.
We find that these denoising scaffolds lead to substantial improvements in downstream generation in low-resource settings.
arXiv Detail & Related papers (2020-12-31T18:17:57Z)
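A denoising scaffold is an auxiliary objective: corrupt the linearized graph and train the generator to reconstruct it, so the model internalizes local structure. A minimal sketch of one such corruption follows, with a made-up linearization format; the exact noising schemes in the paper may differ.

```python
import random

def mask_linearized_graph(tokens: list, mask_rate: float = 0.15,
                          mask_token: str = "<mask>") -> list:
    """Randomly mask tokens of a linearized graph for a denoising objective."""
    return [mask_token if random.random() < mask_rate else t for t in tokens]

random.seed(0)
linearized = "<H> boy <R> wearing <T> shirt".split()
corrupted = mask_linearized_graph(linearized, mask_rate=0.3)
print(corrupted)  # the model learns to reconstruct the original sequence
```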