LadRa-Net: Locally-Aware Dynamic Re-read Attention Net for Sentence Semantic Matching
- URL: http://arxiv.org/abs/2108.02915v1
- Date: Fri, 6 Aug 2021 02:07:04 GMT
- Title: LadRa-Net: Locally-Aware Dynamic Re-read Attention Net for Sentence Semantic Matching
- Authors: Kun Zhang, Guangyi Lv, Le Wu, Enhong Chen, Qi Liu, Meng Wang
- Abstract summary: We develop a novel Dynamic Re-read Network (DRr-Net) for sentence semantic matching.
We extend DRr-Net to the Locally-Aware Dynamic Re-read Attention Net (LadRa-Net).
Experiments on two popular sentence semantic matching tasks demonstrate that DRr-Net can significantly improve the performance of sentence semantic matching.
- Score: 66.65398852962177
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sentence semantic matching requires an agent to determine the semantic
relation between two sentences, which is widely used in various natural
language tasks, such as Natural Language Inference (NLI), Paraphrase
Identification (PI), and so on. Much recent progress has been made in this
area, especially attention-based methods and pre-trained language model based
methods. However, most of these methods attend to all the important parts of
sentences in a static way and only emphasize how important each word is to the
query, which limits the capability of the attention mechanism. To overcome this
problem and strengthen the attention mechanism, we propose a novel
dynamic re-read attention, which can pay close attention to one small region of
sentences at each step and re-read the important parts for better sentence
representations. Based on this attention variation, we develop a novel Dynamic
Re-read Network (DRr-Net) for sentence semantic matching. Moreover, selecting
one small region at each step may be insufficient to cover the full sentence
semantics, and employing pre-trained language models as input encoders can
introduce incomplete and fragile representations. To this end, we
extend DRr-Net to the Locally-Aware Dynamic Re-read Attention Net (LadRa-Net), in
which the local structure of sentences is employed to alleviate the shortcoming of
Byte-Pair Encoding (BPE) in pre-trained language models and boost the
performance of dynamic re-read attention. Extensive experiments on two popular
sentence semantic matching tasks demonstrate that DRr-Net can significantly
improve the performance of sentence semantic matching. Meanwhile, LadRa-Net is
able to achieve better performance by considering the local structures of
sentences. Interestingly, some discoveries in our experiments are consistent
with findings from psychological research.
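To make the core mechanism concrete, below is a minimal PyTorch sketch of how a dynamic re-read step might look: at each step, a sharpened attention distribution focuses on one small region of the sentence conditioned on what has been read so far, and the selected region is folded back into a reading state. This is an illustrative sketch under assumptions, not the authors' implementation; the GRU-based state update, the temperature-sharpened softmax, and all dimensions are hypothetical.

```python
# Illustrative sketch of dynamic re-read attention; NOT the authors' exact
# architecture. The GRU state update, temperature, and dimensions are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicReRead(nn.Module):
    """At each step, softly select one small region of the sentence
    (approximated here by a sharp softmax over words) conditioned on
    what has been read so far, then update the reading state."""

    def __init__(self, hidden_dim: int, steps: int = 5, temperature: float = 0.1):
        super().__init__()
        self.steps = steps
        self.temperature = temperature  # low temperature -> near one-hot focus
        self.score = nn.Linear(2 * hidden_dim, 1)   # scores each word vs. state
        self.cell = nn.GRUCell(hidden_dim, hidden_dim)  # accumulates re-read info

    def forward(self, words: torch.Tensor) -> torch.Tensor:
        # words: (batch, seq_len, hidden_dim) contextual word representations
        batch, seq_len, dim = words.shape
        state = words.mean(dim=1)  # initial reading state: average of all words
        for _ in range(self.steps):
            # Pair every word with the current state and score its importance.
            expanded = state.unsqueeze(1).expand(-1, seq_len, -1)
            logits = self.score(torch.cat([words, expanded], dim=-1)).squeeze(-1)
            # A sharp softmax approximates "attend to one small region".
            attn = F.softmax(logits / self.temperature, dim=-1)
            region = torch.bmm(attn.unsqueeze(1), words).squeeze(1)
            state = self.cell(region, state)  # re-read: fold region into state
        return state  # final sentence representation

# Usage on random features standing in for encoder output:
enc = torch.randn(2, 12, 128)
print(DynamicReRead(128)(enc).shape)  # torch.Size([2, 128])
```

Following the abstract, LadRa-Net would additionally consider the local structure around the focused position (for example, a window of neighboring BPE subwords) so that subword fragments produced by Byte-Pair Encoding are read together rather than in isolation; that windowing detail is not shown in the sketch above.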
Related papers
- Topic-DPR: Topic-based Prompts for Dense Passage Retrieval [6.265789210037749]
We present Topic-DPR, a dense passage retrieval model that uses topic-based prompts.
We introduce a novel positive and negative sampling strategy, leveraging semi-structured data to boost dense retrieval efficiency.
arXiv Detail & Related papers (2023-10-10T13:45:24Z)
- Sentence Representation Learning with Generative Objective rather than Contrastive Objective [86.01683892956144]
We propose a novel generative self-supervised learning objective based on phrase reconstruction.
Our generative objective achieves strong performance improvements and outperforms current state-of-the-art contrastive methods.
arXiv Detail & Related papers (2022-10-16T07:47:46Z)
- Contextualized Semantic Distance between Highly Overlapped Texts [85.1541170468617]
Overlap frequently occurs in paired texts in natural language processing tasks like text editing and semantic similarity evaluation.
This paper addresses the issue with a mask-and-predict strategy (a minimal sketch of this idea appears after this list).
We take the words in the longest common sequence as neighboring words and use masked language modeling (MLM) to predict the distributions at their positions.
Experiments on Semantic Textual Similarity show the proposed Neighboring Distribution Divergence (NDD) to be more sensitive to various semantic differences, especially on highly overlapped paired texts.
arXiv Detail & Related papers (2021-10-04T03:59:15Z)
- DGA-Net Dynamic Gaussian Attention Network for Sentence Semantic Matching [52.661387170698255]
We propose a novel Dynamic Gaussian Attention Network (DGA-Net) to improve the attention mechanism.
We first leverage a pre-trained language model to encode the input sentences and construct semantic representations from a global perspective.
Finally, we develop a Dynamic Gaussian Attention (DGA) mechanism to dynamically capture the important parts and corresponding local contexts from a detailed perspective.
arXiv Detail & Related papers (2021-06-09T08:43:04Z)
- Improving BERT with Syntax-aware Local Attention [14.70545694771721]
We propose a syntax-aware local attention, where the attention scopes are based on the distances in the syntactic structure.
We conduct experiments on various single-sentence benchmarks, including sentence classification and sequence labeling tasks.
Our model achieves better performance owing to more focused attention over syntactically relevant words.
arXiv Detail & Related papers (2020-12-30T13:29:58Z)
- Narrative Incoherence Detection [76.43894977558811]
We propose the task of narrative incoherence detection as a new arena for inter-sentential semantic understanding.
Given a multi-sentence narrative, the task is to decide whether there are any semantic discrepancies in the narrative flow.
arXiv Detail & Related papers (2020-12-21T07:18:08Z)
- Sequential Sentence Matching Network for Multi-turn Response Selection in Retrieval-based Chatbots [45.920841134523286]
We propose a sequential sentence matching network (S2M) that uses sentence-level semantic information to address the problem.
We find that using sentence-level semantic information yields a significant improvement in matching, resulting in state-of-the-art performance.
arXiv Detail & Related papers (2020-05-16T09:47:19Z)
- Salience Estimation with Multi-Attention Learning for Abstractive Text Summarization [86.45110800123216]
In the task of text summarization, salience estimation for words, phrases or sentences is a critical component.
We propose a Multi-Attention Learning framework which contains two new attention learning components for salience estimation.
arXiv Detail & Related papers (2020-04-07T02:38:56Z)
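As promised above, here is a hedged sketch of the mask-and-predict strategy described in "Contextualized Semantic Distance between Highly Overlapped Texts", using a standard HuggingFace masked language model. The model choice (`bert-base-uncased`), the example sentences, the fixed token position, and the KL-divergence comparison are all illustrative assumptions; the paper's actual NDD computation may differ.

```python
# Hedged sketch of the mask-and-predict idea: mask a shared word in two
# highly overlapped sentences, predict its distribution with an MLM, and
# compare the distributions. Details are assumptions, not the paper's NDD.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def masked_distribution(text: str, position: int) -> torch.Tensor:
    """Mask the token at `position` and return the MLM's predicted
    probability distribution over the vocabulary for that position."""
    inputs = tokenizer(text, return_tensors="pt")
    inputs["input_ids"][0, position] = tokenizer.mask_token_id
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits[0, position], dim=-1)

# Compare the predicted distributions at a shared (overlapping) word position
# ("cat", position 2 after [CLS]) in two highly overlapped sentences:
p = masked_distribution("the cat sat on the mat", position=2)
q = masked_distribution("the cat slept on the mat", position=2)
kl = torch.sum(p * (torch.log(p) - torch.log(q)))  # KL divergence as distance
print(float(kl))
```

The intuition this illustrates: if two texts overlap heavily but differ semantically, the MLM's predictions at the shared positions shift, and the divergence between the two predicted distributions registers that shift.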
This list is automatically generated from the titles and abstracts of the papers on this site.