Related papers: Make Graph-based Referring Expression Comprehension Great Again through Expression-guided Dynamic Gating and Regression

URL: http://arxiv.org/abs/2409.03385v1
Date: Thu, 5 Sep 2024 09:44:43 GMT
Title: Make Graph-based Referring Expression Comprehension Great Again through Expression-guided Dynamic Gating and Regression
Authors: Jingcheng Ke, Dele Wang, Jun-Cheng Chen, I-Hong Jhuo, Chia-Wen Lin, Yen-Yu Lin
Abstract summary: We introduce a plug-and-adapt module guided by sub-expressions, called dynamic gate constraint (DGC), which can adaptively disable irrelevant proposals during reasoning. We also introduce an expression-guided regression strategy (EGR) to refine location prediction. Without any pre-training, the proposed graph-based method achieves better performance than the state-of-the-art (SOTA) transformer-based methods.
Score: 44.36417883611282
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: One common belief is that with complex models and pre-training on large-scale datasets, transformer-based methods for referring expression comprehension (REC) perform much better than existing graph-based methods. We observe that since most graph-based methods adopt an off-the-shelf detector to locate candidate objects (i.e., regions detected by the object detector), they face two challenges that result in subpar performance: (1) the presence of significant noise caused by numerous irrelevant objects during reasoning, and (2) inaccurate localization outcomes attributed to the provided detector. To address these issues, we introduce a plug-and-adapt module guided by sub-expressions, called dynamic gate constraint (DGC), which can adaptively disable irrelevant proposals and their connections in graphs during reasoning. We further introduce an expression-guided regression strategy (EGR) to refine location prediction. Extensive experimental results on the RefCOCO, RefCOCO+, RefCOCOg, Flickr30K, RefClef, and Ref-reasoning datasets demonstrate the effectiveness of the DGC module and the EGR strategy in consistently boosting the performance of various graph-based REC methods. Without any pre-training, the proposed graph-based method achieves better performance than the state-of-the-art (SOTA) transformer-based methods.
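The abstract describes DGC only at a high level, so the following is a minimal sketch of the general idea of gating a proposal graph, not the authors' module. It assumes a hypothetical per-proposal relevance score against one sub-expression and a hard threshold; the function name `gate_graph`, the threshold value, and the sample scores are all illustrative assumptions.

```python
from typing import List, Tuple


def gate_graph(
    relevance: List[float],
    adjacency: List[List[int]],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> Tuple[List[int], List[List[int]]]:
    """Return (gates, gated_adjacency).

    gates[i] is 1 if proposal i is kept and 0 if it is disabled.
    An edge survives only if both of its endpoints are kept.
    """
    # Hard gate: keep a proposal only if its relevance reaches the threshold.
    gates = [1 if score >= threshold else 0 for score in relevance]
    n = len(gates)
    # Disable every connection that touches a gated-off proposal.
    gated = [
        [adjacency[i][j] * gates[i] * gates[j] for j in range(n)]
        for i in range(n)
    ]
    return gates, gated


# Hypothetical scores for four detector proposals against one sub-expression.
relevance = [0.9, 0.1, 0.7, 0.2]
adjacency = [
    [0, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 1, 1, 0],
]
gates, gated = gate_graph(relevance, adjacency)
```

In this toy graph only proposals 0 and 2 pass the gate, so the single edge between them is the only connection left for later reasoning steps. A learned, differentiable gate would likely replace the hard threshold in practice; that choice is also an assumption here.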

Related papers

Graph-R1: Towards Agentic GraphRAG Framework via End-to-end Reinforcement Learning [20.05893083101089]
Graph-R1 is an agentic GraphRAG framework via end-to-end reinforcement learning (RL). It introduces lightweight knowledge hypergraph construction and models retrieval as a multi-turn agent-environment interaction. Experiments on standard RAG datasets show that Graph-R1 outperforms traditional GraphRAG and RL-enhanced RAG methods in reasoning accuracy, retrieval efficiency, and generation quality.
arXiv Detail & Related papers (2025-07-29T15:01:26Z)
Learning Efficient and Generalizable Graph Retriever for Knowledge-Graph Question Answering [75.12322966980003]
Large Language Models (LLMs) have shown strong inductive reasoning ability across various domains. Most existing RAG pipelines rely on unstructured text, limiting interpretability and structured reasoning. Recent studies have explored integrating knowledge graphs with LLMs for knowledge graph question answering. We propose RAPL, a novel framework for efficient and effective graph retrieval in KGQA.
arXiv Detail & Related papers (2025-06-11T12:03:52Z)
Graph-based Diffusion Model for Collaborative Filtering [10.654721251152187]
We propose a Graph-based Diffusion Model for Collaborative Filtering (GDMCF). GDMCF consistently outperforms state-of-the-art methods, highlighting its effectiveness in capturing higher-order collaborative signals.
arXiv Detail & Related papers (2025-04-07T12:51:18Z)
RGL: A Graph-Centric, Modular Framework for Efficient Retrieval-Augmented Generation on Graphs [58.10503898336799]
We introduce the RAG-on-Graphs Library (RGL), a modular framework that seamlessly integrates the complete RAG pipeline. RGL addresses key challenges by supporting a variety of graph formats and integrating optimized implementations for essential components. Our evaluations demonstrate that RGL not only accelerates the prototyping process but also enhances the performance and applicability of graph-based RAG systems.
arXiv Detail & Related papers (2025-03-25T03:21:48Z)
Chain-of-Retrieval Augmented Generation [72.06205327186069]
This paper introduces an approach for training o1-like RAG models that retrieve and reason over relevant information step by step before generating the final answer. Our proposed method, CoRAG, allows the model to dynamically reformulate the query based on the evolving state.
arXiv Detail & Related papers (2025-01-24T09:12:52Z)
Graph Structure Refinement with Energy-based Contrastive Learning [56.957793274727514]
We introduce an unsupervised method that combines generative and discriminative training to learn graph structure and representation. We propose an Energy-based Contrastive Learning (ECL) guided Graph Structure Refinement (GSR) framework, denoted as ECL-GSR. ECL-GSR achieves faster training with fewer samples and less memory than the leading baseline, highlighting its simplicity and efficiency in downstream tasks.
arXiv Detail & Related papers (2024-12-20T04:05:09Z)
Explanation-Preserving Augmentation for Semi-Supervised Graph Representation Learning [13.494832603509897]
Graph representation learning (GRL) has emerged as an effective technique for improving performance in a wide range of tasks such as node classification and graph classification. We propose a novel method, Explanation-Preserving Augmentation (EPA), that leverages graph explanation techniques for generating augmented graphs. EPA first uses a small number of labels to train a graph explainer to infer the sub-structures (explanations) that are most relevant to a graph's semantics.
arXiv Detail & Related papers (2024-10-16T15:18:03Z)
Disentangled Generative Graph Representation Learning [51.59824683232925]
This paper introduces DiGGR (Disentangled Generative Graph Representation Learning), a self-supervised learning framework. It aims to learn latent disentangled factors and utilize them to guide graph mask modeling. Experiments on 11 public datasets for two different graph learning tasks demonstrate that DiGGR consistently outperforms many previous self-supervised methods.
arXiv Detail & Related papers (2024-08-24T05:13:02Z)
Amplify Graph Learning for Recommendation via Sparsity Completion [16.32861024767423]
Graph learning models have been widely deployed in collaborative filtering (CF) based recommendation systems. Due to the issue of data sparsity, the graph structure of the original input lacks potential positive preference edges. We propose an Amplify Graph Learning framework based on Sparsity Completion (called AGL-SC).
arXiv Detail & Related papers (2024-06-27T08:26:20Z)
PREM: A Simple Yet Effective Approach for Node-Level Graph Anomaly Detection [65.24854366973794]
Node-level graph anomaly detection (GAD) plays a critical role in identifying anomalous nodes from graph-structured data in domains such as medicine, social networks, and e-commerce. We introduce a simple method termed PREprocessing and Matching (PREM for short) to improve the efficiency of GAD. Our approach streamlines GAD, reducing time and memory consumption while maintaining powerful anomaly detection capabilities.
arXiv Detail & Related papers (2023-10-18T02:59:57Z)
Whether you can locate or not? Interactive Referring Expression Generation [12.148963878497243]
We propose an Interactive REG (IREG) model that can interact with a real REC model. IREG outperforms previous state-of-the-art methods on popular evaluation metrics.
arXiv Detail & Related papers (2023-08-19T10:53:32Z)
Localized Contrastive Learning on Graphs [110.54606263711385]
We introduce a simple yet effective contrastive model named Localized Graph Contrastive Learning (Local-GCL). In spite of its simplicity, Local-GCL achieves quite competitive performance in self-supervised node representation learning tasks on graphs with various scales and properties.
arXiv Detail & Related papers (2022-12-08T23:36:00Z)
GFlowCausal: Generative Flow Networks for Causal Discovery [27.51595081346858]
We propose a novel approach to learning a Directed Acyclic Graph (DAG) from observational data called GFlowCausal. GFlowCausal aims to learn the best policy to generate high-reward DAGs by sequential actions with probabilities proportional to predefined rewards. We conduct extensive experiments on both synthetic and real datasets, and results show that the proposed approach is superior and also performs well in a large-scale setting.
arXiv Detail & Related papers (2022-10-15T04:07:39Z)
Adaptive Fine-Grained Predicates Learning for Scene Graph Generation [122.4588401267544]
General Scene Graph Generation (SGG) models tend to predict head predicates, while re-balancing strategies prefer tail categories. We propose an Adaptive Fine-Grained Predicates Learning (FGPL-A) approach, which aims at differentiating hard-to-distinguish predicates for SGG. Our proposed model-agnostic strategy significantly boosts performance of benchmark models on VG-SGG and GQA-SGG datasets by up to 175% and 76% on Mean Recall@100, achieving new state-of-the-art performance.
arXiv Detail & Related papers (2022-07-11T03:37:57Z)
Dynamic Graph Representation Learning via Graph Transformer Networks [41.570839291138114]
We propose a Transformer-based dynamic graph learning method named Dynamic Graph Transformer (DGT). DGT has spatial-temporal encoding to effectively learn graph topology and capture implicit links. We show that DGT presents superior performance compared with several state-of-the-art baselines.
arXiv Detail & Related papers (2021-11-19T21:44:23Z)
Lightweight, Dynamic Graph Convolutional Networks for AMR-to-Text Generation [56.73834525802723]
Lightweight Dynamic Graph Convolutional Networks (LDGCNs) are proposed. LDGCNs capture richer non-local interactions by synthesizing higher order information from the input graphs. We develop two novel parameter saving strategies based on the group graph convolutions and weight tied convolutions to reduce memory usage and model complexity.
arXiv Detail & Related papers (2020-10-09T06:03:46Z)

This list is automatically generated from the titles and abstracts of the papers on this site.