Related papers: Improving Semantic Matching through Dependency-Enhanced Pre-trained Model with Adaptive Fusion

URL: http://arxiv.org/abs/2210.08471v5
Date: Thu, 24 Aug 2023 07:13:27 GMT
Title: Improving Semantic Matching through Dependency-Enhanced Pre-trained Model with Adaptive Fusion
Authors: Jian Song, Di Liang, Rumei Li, Yuntao Li, Sirui Wang, Minlong Peng, Wei Wu, Yongxin Yu
Abstract summary: We propose Dependency-Enhanced Adaptive Fusion Attention (DAFA). It explicitly introduces dependency structure into pre-trained models and adaptively fuses it with semantic information. By applying it on BERT, our method achieves state-of-the-art or competitive performance on 10 public datasets.
Score: 23.00381824485556
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Transformer-based pre-trained models like BERT have achieved great progress on Semantic Sentence Matching. Meanwhile, dependency prior knowledge has also shown general benefits in multiple NLP tasks. However, how to efficiently integrate dependency prior structure into pre-trained models to better model complex semantic matching relations is still unsettled. In this paper, we propose the Dependency-Enhanced Adaptive Fusion Attention (DAFA), which explicitly introduces dependency structure into pre-trained models and adaptively fuses it with semantic information. Specifically, (i) DAFA first proposes a structure-sensitive paradigm to construct a dependency matrix for calibrating attention weights. It adopts an adaptive fusion module to integrate the obtained dependency information and the original semantic signals. Moreover, DAFA reconstructs the attention calculation flow and provides better interpretability. By applying it on BERT, our method achieves state-of-the-art or competitive performance on 10 public datasets, demonstrating the benefits of adaptively fusing dependency structure in the semantic matching task.
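
Illustrative sketch (not the authors' implementation): the abstract describes building a dependency matrix that calibrates attention weights and then adaptively fusing the result with the original semantic attention. The Python below shows one way this general idea could look. Everything in it is an assumption made for illustration: the function names, the 0/1 symmetric dependency matrix with self-loops, the use of masking as the calibration step, and the single scalar gate (gate_logit), which stands in for whatever learned fusion module the paper actually uses.

```python
import math


def dependency_matrix(heads):
    """Build a symmetric 0/1 matrix from head indices (hypothetical encoding).

    heads[i] is the index of token i's syntactic head, or -1 for the root.
    Entry [i][j] is 1.0 when i == j or when i and j share a dependency arc.
    """
    n = len(heads)
    dep = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for child, head in enumerate(heads):
        if head >= 0:  # the root has no head, so it adds no arc
            dep[child][head] = dep[head][child] = 1.0
    return dep


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(q, k, mask=None):
    """Scaled dot-product attention weights, optionally restricted by a mask."""
    d = len(q[0])
    weights = []
    for i, qi in enumerate(q):
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        if mask is not None:
            # Assumed calibration: suppress pairs without a dependency arc.
            scores = [s if mask[i][j] > 0 else -1e9 for j, s in enumerate(scores)]
        weights.append(softmax(scores))
    return weights


def fused_attention(q, k, v, heads, gate_logit=0.0):
    """Blend semantic and dependency-restricted attention with a scalar gate."""
    dep = dependency_matrix(heads)
    sem_w = attention_weights(q, k)            # original semantic signal
    dep_w = attention_weights(q, k, mask=dep)  # dependency-calibrated signal
    g = 1.0 / (1.0 + math.exp(-gate_logit))    # sigmoid gate in (0, 1)
    fused = [
        [g * s + (1.0 - g) * d_ for s, d_ in zip(s_row, d_row)]
        for s_row, d_row in zip(sem_w, dep_w)
    ]
    # Weighted sum of value vectors using the fused weights.
    out = [
        [sum(w * vj[c] for w, vj in zip(row, v)) for c in range(len(v[0]))]
        for row in fused
    ]
    return fused, out
```

In this sketch, a strongly negative gate_logit makes the output follow the dependency-restricted weights almost entirely, and a strongly positive one recovers plain attention. Because each row of both weight matrices sums to 1, every row of the fused matrix does too.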

Related papers

Integrating Textual Embeddings from Contrastive Learning with Generative Recommender for Enhanced Personalization [8.466223794246261]
We propose a hybrid framework that augments the generative recommender with a contrastive text embedding model. We evaluate our method on two domains from the Amazon Reviews 2023 dataset.
arXiv Detail & Related papers (2025-04-13T15:23:00Z)
Knowledge Graph Completion with Relation-Aware Anchor Enhancement [50.50944396454757]
We propose a relation-aware anchor enhanced knowledge graph completion method (RAA-KGC). We first generate anchor entities within the relation-aware neighborhood of the head entity. Then, by pulling the query embedding towards the neighborhoods of the anchors, it is tuned to be more discriminative for target entity matching.
arXiv Detail & Related papers (2025-04-08T15:22:08Z)
Towards Transformer-Based Aligned Generation with Self-Coherence Guidance [51.42269790543461]
We introduce a training-free approach for enhancing alignment in Transformer-based Text-Guided Diffusion Models (TGDMs). Existing TGDMs often struggle to generate semantically aligned images, particularly when dealing with complex text prompts or multi-concept attribute binding challenges. Our method addresses these challenges by directly optimizing cross-attention maps during the generation process.
arXiv Detail & Related papers (2025-03-22T07:03:57Z)
Dependency Parsing with the Structuralized Prompt Template [14.547116901025506]
Dependency parsing is a fundamental task in natural language processing (NLP). We propose a novel dependency parsing method that relies solely on an encoder model with a text-to-text training approach. Our experimental results demonstrate that the proposed method achieves outstanding performance compared to traditional models.
arXiv Detail & Related papers (2025-02-24T07:25:10Z)
Structural Embedding Projection for Contextual Large Language Model Inference [0.0]
Structured embedding transformations offer a promising approach for enhancing the efficiency and coherence of language model inference. The mathematical formulation of Structural Embedding Projection (SEP) enables embedding spaces to capture structured contextual relationships. The impact of SEP on lexical diversity suggested that embedding modifications influenced the model's vocabulary usage.
arXiv Detail & Related papers (2025-01-31T00:46:21Z)
VaeDiff-DocRE: End-to-end Data Augmentation Framework for Document-level Relation Extraction [9.516897428263146]
Document-level Relation Extraction (DocRE) aims to identify relationships between entity pairs within a document. Most existing methods assume a uniform label distribution, resulting in suboptimal performance on real-world, imbalanced datasets. We propose a novel data augmentation approach using generative models to enhance data from the embedding space.
arXiv Detail & Related papers (2024-12-18T04:55:29Z)
Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization [75.1240295759264]
We propose an effective framework for Bridging and Modeling Correlations in pairwise data, named BMC. We increase the consistency and informativeness of the pairwise preference signals through targeted modifications. We identify that DPO alone is insufficient to model these correlations and capture nuanced variations.
arXiv Detail & Related papers (2024-08-14T11:29:47Z)
A Plug-and-Play Method for Rare Human-Object Interactions Detection by Bridging Domain Gap [50.079224604394]
We present a novel model-agnostic framework called Context-Enhanced Feature Alignment (CEFA). CEFA consists of a feature alignment module and a context enhancement module. Our method can serve as a plug-and-play module to improve the detection performance of HOI models on rare categories.
arXiv Detail & Related papers (2024-07-31T08:42:48Z)
FASTopic: Pretrained Transformer is a Fast, Adaptive, Stable, and Transferable Topic Model [76.509837704596]
We propose FASTopic, a fast, adaptive, stable, and transferable topic model. We use Dual Semantic-relation Reconstruction (DSR) to model latent topics. We also propose Embedding Transport Plan (ETP) to regularize semantic relations as optimal transport plans.
arXiv Detail & Related papers (2024-05-28T09:06:38Z)
Best of Both Worlds: A Pliable and Generalizable Neuro-Symbolic Approach for Relation Classification [17.398872494876365]
This paper introduces a novel neuro-symbolic architecture for relation classification (RC). It combines rule-based methods with contemporary deep learning techniques. We show that our proposed method outperforms previous state-of-the-art models in three out of four settings.
arXiv Detail & Related papers (2024-03-05T20:08:32Z)
Semi-automatic Data Enhancement for Document-Level Relation Extraction with Distant Supervision from Large Language Models [26.523153535336725]
Document-level Relation Extraction (DocRE) aims to extract relations from a long context. We propose a method integrating a large language model (LLM) and a natural language inference (NLI) module to generate relation triples. We demonstrate the effectiveness of our approach by introducing an enhanced dataset known as DocGNRE.
arXiv Detail & Related papers (2023-11-13T13:10:44Z)
Autoregressive Structured Prediction with Language Models [73.11519625765301]
We describe an approach to model structures as sequences of actions in an autoregressive manner with PLMs. Our approach achieves the new state-of-the-art on all the structured prediction tasks we looked at.
arXiv Detail & Related papers (2022-10-26T13:27:26Z)
A Novel Few-Shot Relation Extraction Pipeline Based on Adaptive Prototype Fusion [5.636675879040131]
Few-shot relation extraction (FSRE) aims at recognizing unseen relations by learning with merely a handful of annotated instances. This paper proposes a novel pipeline for the FSRE task based on adaptive prototype fusion. Experiments on the benchmark dataset FewRel 1.0 show a significant improvement of our method against state-of-the-art methods.
arXiv Detail & Related papers (2022-10-15T09:44:21Z)
Enhancing Pre-trained Models with Text Structure Knowledge for Question Generation [2.526624977753083]
We model text structure as answer position and syntactic dependency, and propose answer localness modeling and syntactic mask attention to address these limitations. Experiments on the SQuAD dataset show that our proposed two modules improve performance over the strong pre-trained model ProphetNet.
arXiv Detail & Related papers (2022-09-09T08:33:47Z)
Generative Relation Linking for Question Answering over Knowledge Bases [12.778133758613773]
We propose a novel approach for relation linking, framing it as a generative problem. We extend such sequence-to-sequence models with the idea of infusing structured data from the target knowledge base. We train the model with the aim to generate a structured output consisting of a list of argument-relation pairs, enabling a knowledge validation step.
arXiv Detail & Related papers (2021-08-16T20:33:43Z)
Syntax-Enhanced Pre-trained Model [49.1659635460369]
We study the problem of leveraging the syntactic structure of text to enhance pre-trained models such as BERT and RoBERTa. Existing methods utilize syntax of text either in the pre-training stage or in the fine-tuning stage, so that they suffer from discrepancy between the two stages. We present a model that utilizes the syntax of text in both pre-training and fine-tuning stages.
arXiv Detail & Related papers (2020-12-28T06:48:04Z)
Improve Variational Autoencoder for Text Generation with Discrete Latent Bottleneck [52.08901549360262]
Variational autoencoders (VAEs) are essential tools in end-to-end representation learning. VAEs tend to ignore latent variables with a strong auto-regressive decoder. We propose a principled approach to enforce an implicit latent feature matching in a more compact latent space.
arXiv Detail & Related papers (2020-04-22T14:41:37Z)

This list is automatically generated from the titles and abstracts of the papers on this site.