Improving Semantic Matching through Dependency-Enhanced Pre-trained
Model with Adaptive Fusion
- URL: http://arxiv.org/abs/2210.08471v5
- Date: Thu, 24 Aug 2023 07:13:27 GMT
- Title: Improving Semantic Matching through Dependency-Enhanced Pre-trained
Model with Adaptive Fusion
- Authors: Jian Song, Di Liang, Rumei Li, Yuntao Li, Sirui Wang, Minlong Peng,
Wei Wu, Yongxin Yu
- Abstract summary: We propose Dependency-Enhanced Adaptive Fusion Attention (DAFA).
It explicitly introduces dependency structure into pre-trained models and adaptively fuses it with semantic information.
Applied to BERT, our method achieves state-of-the-art or competitive performance on 10 public datasets.
- Score: 23.00381824485556
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transformer-based pre-trained models like BERT have achieved great progress
on Semantic Sentence Matching. Meanwhile, dependency prior knowledge has also
shown general benefits in multiple NLP tasks. However, how to efficiently
integrate dependency prior structure into pre-trained models to better model
complex semantic matching relations is still unsettled. In this paper, we
propose the \textbf{D}ependency-Enhanced \textbf{A}daptive \textbf{F}usion
\textbf{A}ttention (\textbf{DAFA}), which explicitly introduces dependency
structure into pre-trained models and adaptively fuses it with semantic
information. Specifically, \textbf{\emph{(i)}} DAFA first proposes a
structure-sensitive paradigm to construct a dependency matrix for calibrating
attention weights. \textbf{\emph{(ii)}} It adopts an adaptive fusion module to
integrate the obtained dependency information and the original semantic
signals. \textbf{\emph{(iii)}} Moreover, DAFA reconstructs the attention
calculation flow and provides better interpretability. By applying it to BERT,
our method achieves state-of-the-art or competitive performance on 10 public
datasets, demonstrating the benefits of adaptively fusing dependency structure
in the semantic matching task.
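
To make the mechanism described in the abstract concrete, below is a minimal, illustrative sketch of a dependency-calibrated attention layer with an adaptive fusion gate. It only follows the abstract's high-level description; the class, parameter names, and the binary dependency-arc matrix are assumptions for illustration, not the authors' released implementation.

```python
# Sketch only: dependency-calibrated attention with an adaptive fusion gate,
# loosely following the abstract's description. Names are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DependencyFusedAttention(nn.Module):
    def __init__(self, hidden_size: int):
        super().__init__()
        self.q = nn.Linear(hidden_size, hidden_size)
        self.k = nn.Linear(hidden_size, hidden_size)
        self.v = nn.Linear(hidden_size, hidden_size)
        # Gate that adaptively mixes semantic and dependency-calibrated outputs.
        self.gate = nn.Linear(2 * hidden_size, 1)

    def forward(self, x, dep_matrix):
        # x:          (batch, seq_len, hidden)   token representations
        # dep_matrix: (batch, seq_len, seq_len)  1 where a dependency arc links two tokens, else 0
        q, k, v = self.q(x), self.k(x), self.v(x)
        scores = q @ k.transpose(-1, -2) / (x.size(-1) ** 0.5)

        # Original semantic attention vs. attention calibrated by the dependency matrix
        # (scores restricted to syntactically related token pairs).
        sem_attn = F.softmax(scores, dim=-1)
        dep_scores = scores.masked_fill(dep_matrix == 0, float("-inf"))
        dep_attn = torch.nan_to_num(F.softmax(dep_scores, dim=-1))  # rows with no arcs become zeros

        sem_out = sem_attn @ v
        dep_out = dep_attn @ v

        # Adaptive fusion: a learned per-token gate decides how much dependency
        # signal to keep relative to the original semantic signal.
        g = torch.sigmoid(self.gate(torch.cat([sem_out, dep_out], dim=-1)))
        return g * dep_out + (1 - g) * sem_out
```

The gate lets each token decide how strongly the dependency-restricted attention output should override the plain semantic attention output, which is one straightforward reading of the adaptive fusion idea described above.
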
Related papers
- Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization [75.1240295759264]
We propose an effective framework for Bridging and Modeling Correlations in pairwise data, named BMC.
We increase the consistency and informativeness of the pairwise preference signals through targeted modifications.
We identify that DPO alone is insufficient to model these correlations and capture nuanced variations.
arXiv Detail & Related papers (2024-08-14T11:29:47Z) - A Plug-and-Play Method for Rare Human-Object Interactions Detection by Bridging Domain Gap [50.079224604394]
We present a novel model-agnostic framework called Context-Enhanced Feature Alignment (CEFA).
CEFA consists of a feature alignment module and a context enhancement module.
Our method can serve as a plug-and-play module to improve the detection performance of HOI models on rare categories.
arXiv Detail & Related papers (2024-07-31T08:42:48Z) - FASTopic: Pretrained Transformer is a Fast, Adaptive, Stable, and Transferable Topic Model [76.509837704596]
We propose FASTopic, a fast, adaptive, stable, and transferable topic model.
We use Dual Semantic-relation Reconstruction (DSR) to model latent topics.
We also propose Embedding Transport Plan (ETP) to regularize semantic relations as optimal transport plans.
arXiv Detail & Related papers (2024-05-28T09:06:38Z) - Best of Both Worlds: A Pliable and Generalizable Neuro-Symbolic Approach
for Relation Classification [17.398872494876365]
This paper introduces a novel neuro-symbolic architecture for relation classification (RC)
It combines rule-based methods with contemporary deep learning techniques.
We show that our proposed method outperforms previous state-of-the-art models in three out of four settings.
arXiv Detail & Related papers (2024-03-05T20:08:32Z) - Semi-automatic Data Enhancement for Document-Level Relation Extraction
with Distant Supervision from Large Language Models [26.523153535336725]
Document-level Relation Extraction (DocRE) aims to extract relations from a long context.
We propose a method integrating a large language model (LLM) and a natural language inference (NLI) module to generate relation triples.
We demonstrate the effectiveness of our approach by introducing an enhanced dataset known as DocGNRE.
arXiv Detail & Related papers (2023-11-13T13:10:44Z) - Autoregressive Structured Prediction with Language Models [73.11519625765301]
We describe an approach to model structures as sequences of actions in an autoregressive manner with PLMs.
Our approach achieves the new state-of-the-art on all the structured prediction tasks we looked at.
arXiv Detail & Related papers (2022-10-26T13:27:26Z) - A Novel Few-Shot Relation Extraction Pipeline Based on Adaptive
Prototype Fusion [5.636675879040131]
Few-shot relation extraction (FSRE) aims at recognizing unseen relations by learning with merely a handful of annotated instances.
This paper proposes a novel pipeline for the FSRE task based on adaptive prototype fusion.
Experiments on the benchmark dataset FewRel 1.0 show a significant improvement of our method against state-of-the-art methods.
arXiv Detail & Related papers (2022-10-15T09:44:21Z) - Enhancing Pre-trained Models with Text Structure Knowledge for Question
Generation [2.526624977753083]
We model text structure as answer position and syntactic dependency, and propose answer localness modeling and syntactic mask attention to incorporate this structure knowledge into pre-trained models.
Experiments on SQuAD dataset show that our proposed two modules improve performance over the strong pre-trained model ProphetNet.
arXiv Detail & Related papers (2022-09-09T08:33:47Z) - Generative Relation Linking for Question Answering over Knowledge Bases [12.778133758613773]
We propose a novel approach for relation linking, framing it as a generative problem.
We extend such sequence-to-sequence models with the idea of infusing structured data from the target knowledge base.
We train the model with the aim to generate a structured output consisting of a list of argument-relation pairs, enabling a knowledge validation step.
arXiv Detail & Related papers (2021-08-16T20:33:43Z) - Syntax-Enhanced Pre-trained Model [49.1659635460369]
We study the problem of leveraging the syntactic structure of text to enhance pre-trained models such as BERT and RoBERTa.
Existing methods utilize the syntax of text either in the pre-training stage or in the fine-tuning stage, and therefore suffer from a discrepancy between the two stages.
We present a model that utilizes the syntax of text in both pre-training and fine-tuning stages.
arXiv Detail & Related papers (2020-12-28T06:48:04Z) - Improve Variational Autoencoder for Text Generation with Discrete Latent
Bottleneck [52.08901549360262]
Variational autoencoders (VAEs) are essential tools in end-to-end representation learning.
With a strong auto-regressive decoder, VAEs tend to ignore the latent variables.
We propose a principled approach to enforce an implicit latent feature matching in a more compact latent space.
arXiv Detail & Related papers (2020-04-22T14:41:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.