Scene Graph Modification Based on Natural Language Commands
- URL: http://arxiv.org/abs/2010.02591v1
- Date: Tue, 6 Oct 2020 10:01:19 GMT
- Title: Scene Graph Modification Based on Natural Language Commands
- Authors: Xuanli He, Quan Hung Tran, Gholamreza Haffari, Walter Chang, Trung
Bui, Zhe Lin, Franck Dernoncourt, Nhan Dam
- Abstract summary: Structured representations like graphs and parse trees play a crucial role in many Natural Language Processing systems.
In this paper, we explore the novel problem of graph modification, where systems need to learn how to update an existing graph given a new command from the user.
- Score: 90.0662899539489
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Structured representations like graphs and parse trees play a crucial role in
many Natural Language Processing systems. In recent years, advances in
multi-turn user interfaces have created the need to control and update
these structured representations given new sources of information. Although
there have been many efforts focusing on improving the performance of the
parsers that map text to graphs or parse trees, very few have explored the
problem of directly manipulating these representations. In this paper, we
explore the novel problem of graph modification, where a system must learn to
update an existing scene graph given a new command from the user. Our novel
models, based on a graph-based sparse transformer and cross-attention
information fusion, outperform previous systems adapted from the machine
translation and graph generation literature. We further contribute our large
graph modification datasets to the research community to encourage future
research on this new problem.
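The best-performing models pair a graph encoder with a text encoder and fuse the two streams through cross attention. Below is a minimal PyTorch sketch of such a fusion step; the module name, dimensions, and residual wiring are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    """Fuse graph-node encodings with command-token encodings via cross attention.

    Hypothetical sketch: the paper's full model (sparse graph transformer,
    decoder, etc.) is richer; only the information-fusion idea is shown.
    """

    def __init__(self, d_model: int = 256, n_heads: int = 4):
        super().__init__()
        # Each graph node attends over the tokens of the user's command.
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, node_states: torch.Tensor, text_states: torch.Tensor) -> torch.Tensor:
        # node_states: (batch, n_nodes, d_model) from the graph encoder
        # text_states: (batch, n_tokens, d_model) from the command encoder
        fused, _ = self.attn(node_states, text_states, text_states)
        return self.norm(node_states + fused)  # residual keeps the original node signal

# Toy usage: a 5-node scene graph and a 7-token command.
fusion = CrossAttentionFusion()
nodes, tokens = torch.randn(1, 5, 256), torch.randn(1, 7, 256)
print(fusion(nodes, tokens).shape)  # torch.Size([1, 5, 256])
```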
Related papers
- When Graph Data Meets Multimodal: A New Paradigm for Graph Understanding and Reasoning [54.84870836443311]
The paper presents a new paradigm for understanding and reasoning about graph data by integrating image encoding and multimodal technologies.
This approach enables the comprehension of graph data through an instruction-response format, utilizing GPT-4V's advanced capabilities.
The study evaluates this paradigm on various graph types, highlighting the model's strengths and weaknesses, particularly in Chinese OCR performance and complex reasoning tasks.
arXiv Detail & Related papers (2023-12-16T08:14:11Z)
- Deep Prompt Tuning for Graph Transformers [55.2480439325792]
Fine-tuning is resource-intensive and requires storing multiple copies of large models.
We propose a novel approach called deep graph prompt tuning as an alternative to fine-tuning.
By freezing the pre-trained parameters and only updating the added tokens, our approach reduces the number of free parameters and eliminates the need for multiple model copies.
arXiv Detail & Related papers (2023-09-18T20:12:17Z)
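The mechanism above, freezing the pre-trained weights and training only a small set of added tokens, is easy to illustrate. A hedged, generic sketch of prompt tuning follows, with a stand-in backbone and invented sizes rather than the paper's graph-transformer code.

```python
import torch
import torch.nn as nn

class PromptTunedEncoder(nn.Module):
    """Prepend trainable prompt tokens to the input of a frozen encoder."""

    def __init__(self, backbone: nn.Module, d_model: int, n_prompts: int = 10):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad = False  # frozen: no per-task copy of the big model
        # The added prompt tokens are the only free parameters.
        self.prompts = nn.Parameter(torch.randn(n_prompts, d_model) * 0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model), already-embedded inputs
        prompts = self.prompts.unsqueeze(0).expand(x.size(0), -1, -1)
        return self.backbone(torch.cat([prompts, x], dim=1))

# Toy usage with a stand-in backbone.
backbone = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(64, 4, batch_first=True), num_layers=2)
model = PromptTunedEncoder(backbone, d_model=64)
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(trainable)  # 640: just the 10 x 64 prompt tokens are updated
```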
- Scene Graph Modification as Incremental Structure Expanding [61.84291817776118]
We focus on scene graph modification (SGM), where the system must learn to update an existing scene graph based on a natural language query.
We frame SGM as a graph expansion task by introducing incremental structure expanding (ISE).
We construct a challenging dataset that contains more complicated queries and larger scene graphs than existing datasets.
arXiv Detail & Related papers (2022-09-15T16:26:14Z)
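ISE treats modification as growing the source graph step by step rather than regenerating it from scratch. The non-learned sketch below shows that expansion framing; in the actual model each action is predicted from the query and the partial graph, and this action vocabulary is hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class SceneGraph:
    nodes: list = field(default_factory=list)  # e.g. ["boy", "shirt"]
    edges: list = field(default_factory=list)  # (head_idx, label, tail_idx) triples

def expand(graph: SceneGraph, actions: list) -> SceneGraph:
    """Apply a sequence of expansion actions to a scene graph."""
    for action in actions:
        if action[0] == "ADD_NODE":
            graph.nodes.append(action[1])
        elif action[0] == "ADD_EDGE":
            _, head, label, tail = action
            graph.edges.append((head, label, tail))
    return graph

# A query like "the boy is wearing a blue shirt" expands the graph ("boy",)
# with two new nodes and two new edges instead of rebuilding it.
g = SceneGraph(nodes=["boy"])
g = expand(g, [("ADD_NODE", "shirt"), ("ADD_NODE", "blue"),
               ("ADD_EDGE", 0, "wearing", 1), ("ADD_EDGE", 1, "attribute", 2)])
print(g.nodes, g.edges)
```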
- GAP: A Graph-aware Language Model Framework for Knowledge Graph-to-Text Generation [3.593955557310285]
Recent improvements in KG-to-text generation are due to auxiliary pre-training tasks designed to give the fine-tuning task a boost in performance.
Here, we demonstrate that by fusing graph-aware elements into existing pre-trained language models, we are able to outperform state-of-the-art models and close the gap imposed by additional pre-training tasks.
arXiv Detail & Related papers (2022-04-13T23:53:37Z)
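One common "graph-aware element" that can be fused into a pre-trained encoder without extra pre-training is an attention mask restricting self-attention to graph neighbours. The sketch below builds such a mask; it illustrates the general idea and is not claimed to be GAP's exact formulation.

```python
import torch

def graph_attention_mask(n_items: int, edges: list) -> torch.Tensor:
    """Boolean mask where True marks pairs *blocked* from attending.

    Entities attend to themselves and their graph neighbours only; the
    mask can be passed to attention modules that accept a bool attn_mask.
    """
    blocked = torch.ones(n_items, n_items, dtype=torch.bool)
    blocked.fill_diagonal_(False)          # self-attention stays allowed
    for head, tail in edges:
        blocked[head, tail] = False        # neighbours may attend
        blocked[tail, head] = False
    return blocked

# KG entities 0:"Paris", 1:"France", 2:"Euro"; edges 0-1 and 1-2.
mask = graph_attention_mask(3, [(0, 1), (1, 2)])
print(mask[0, 2])  # tensor(True): no direct attention between Paris and Euro
```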
- Stage-wise Fine-tuning for Graph-to-Text Generation [25.379346921398326]
Graph-to-text generation has benefited from pre-trained language models (PLMs), achieving better performance than structured graph encoders.
We propose a structured graph-to-text model with a two-step fine-tuning mechanism that first fine-tunes the model on Wikipedia before adapting it to graph-to-text generation.
arXiv Detail & Related papers (2021-05-17T17:15:29Z)
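The two-step mechanism amounts to two fine-tuning passes in sequence over different data. A hedged sketch follows, assuming a Hugging Face-style model whose forward call returns a loss; the loader names and hyperparameters are invented.

```python
import torch

def fine_tune(model, dataloader, epochs: int, lr: float) -> None:
    """One generic fine-tuning pass (the batch format is task-specific)."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for batch in dataloader:
            loss = model(**batch).loss  # assumes HF-style model outputs
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

# Stage 1: adapt the PLM to target-domain text (e.g. Wikipedia sentences).
# fine_tune(model, wikipedia_loader, epochs=1, lr=5e-5)
# Stage 2: adapt the same model to graph-to-text generation.
# fine_tune(model, graph_to_text_loader, epochs=5, lr=3e-5)
```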
- GraphFormers: GNN-nested Transformers for Representation Learning on Textual Graph [53.70520466556453]
We propose GraphFormers, where layerwise GNN components are nested alongside the transformer blocks of language models.
With the proposed architecture, the text encoding and the graph aggregation are fused into an iterative workflow.
In addition, a progressive learning strategy is introduced, where the model is successively trained on manipulated and original data to reinforce its ability to integrate information from the graph.
arXiv Detail & Related papers (2021-05-06T12:20:41Z)
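The nesting idea: between consecutive transformer blocks, each node's token states exchange information with neighbouring nodes through a GNN-style aggregation. A minimal single-layer sketch is below; the mean-over-neighbours aggregation and the prepended graph token are simplifying assumptions, not GraphFormers' exact design.

```python
import torch
import torch.nn as nn

class GraphFormerLayer(nn.Module):
    """One nested layer: graph aggregation feeding a transformer block (sketch)."""

    def __init__(self, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        self.text_layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.graph_proj = nn.Linear(d_model, d_model)

    def forward(self, token_states: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # token_states: (n_nodes, seq_len, d_model); adj: row-normalized (n_nodes, n_nodes)
        node_vecs = token_states[:, 0]                    # first token summarizes each node
        neighbor_vecs = self.graph_proj(adj @ node_vecs)  # mean-style neighbour aggregation
        # Prepend the aggregated graph vector so text encoding sees the
        # neighbourhood, then drop it again after the transformer block.
        augmented = torch.cat([neighbor_vecs.unsqueeze(1), token_states], dim=1)
        return self.text_layer(augmented)[:, 1:]

# Toy textual graph: 3 nodes with 5 tokens each, connected in a chain 0-1-2.
adj = torch.tensor([[0.0, 1.0, 0.0], [0.5, 0.0, 0.5], [0.0, 1.0, 0.0]])
out = GraphFormerLayer()(torch.randn(3, 5, 64), adj)
print(out.shape)  # torch.Size([3, 5, 64])
```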
- Promoting Graph Awareness in Linearized Graph-to-Text Generation [72.83863719868364]
We study the ability of linearized models to encode local graph structures.
Our findings motivate solutions that enrich the quality of models' implicit graph encodings via graph denoising scaffolds.
We find that these denoising scaffolds lead to substantial improvements in downstream generation in low-resource settings.
arXiv Detail & Related papers (2020-12-31T18:17:57Z)
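A denoising scaffold is an auxiliary objective: corrupt the linearized graph and train the generator to reconstruct it, so the model internalizes local structure. A minimal sketch of one such corruption follows, with a made-up linearization format; the exact noising schemes in the paper may differ.

```python
import random

def mask_linearized_graph(tokens: list, mask_rate: float = 0.15,
                          mask_token: str = "<mask>") -> list:
    """Randomly mask tokens of a linearized graph for a denoising objective."""
    return [mask_token if random.random() < mask_rate else t for t in tokens]

random.seed(0)
linearized = "<H> boy <R> wearing <T> shirt".split()
corrupted = mask_linearized_graph(linearized, mask_rate=0.3)
print(corrupted)  # the model learns to reconstruct the original sequence
```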