Divide and Conquer: Text Semantic Matching with Disentangled Keywords
and Intents
- URL: http://arxiv.org/abs/2203.02898v1
- Date: Sun, 6 Mar 2022 07:48:24 GMT
- Title: Divide and Conquer: Text Semantic Matching with Disentangled Keywords
and Intents
- Authors: Yicheng Zou, Hongwei Liu, Tao Gui, Junzhe Wang, Qi Zhang, Meng Tang,
Haixiang Li, Daniel Wang
- Abstract summary: We propose a training strategy for text semantic matching by disentangling keywords from intents.
Our approach can be easily combined with pre-trained language models (PLM) without influencing their inference efficiency.
- Score: 19.035917264711664
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Text semantic matching is a fundamental task that has been widely used in
various scenarios, such as community question answering, information retrieval,
and recommendation. Most state-of-the-art matching models, e.g., BERT, directly
perform text comparison by processing each word uniformly. However, a query
sentence generally comprises content that calls for different levels of
matching granularity. Specifically, keywords represent factual information such
as action, entity, and event that should be strictly matched, while intents
convey abstract concepts and ideas that can be paraphrased into various
expressions. In this work, we propose a simple yet effective training strategy
for text semantic matching in a divide-and-conquer manner by disentangling
keywords from intents. Our approach can be easily combined with pre-trained
language models (PLM) without influencing their inference efficiency, achieving
stable performance improvements against a wide range of PLMs on three
benchmarks.
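The abstract does not spell out the training objectives, so the following is only a minimal sketch of the divide-and-conquer idea in PyTorch: encode three views of each sentence pair (the full text, a keywords-only view, and an intents-only view), train auxiliary classification heads on the two disentangled views, and regularize the full prediction toward their combination. The keyword tagger, the head layout, and the loss weights are all assumptions, not the authors' formulation.

```python
# A minimal sketch, NOT the authors' exact method: auxiliary heads on
# keyword-only and intent-only views, used during training only.
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import AutoModel

class DivideAndConquerMatcher(nn.Module):
    def __init__(self, plm_name="bert-base-uncased", num_labels=2):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(plm_name)
        h = self.encoder.config.hidden_size
        self.full_head = nn.Linear(h, num_labels)     # the only head used at inference
        self.keyword_head = nn.Linear(h, num_labels)  # training-only auxiliary head
        self.intent_head = nn.Linear(h, num_labels)   # training-only auxiliary head

    def encode(self, enc):
        return self.encoder(**enc).last_hidden_state[:, 0]  # [CLS] vector

    def forward(self, full_enc, keyword_enc, intent_enc, labels):
        # full_enc / keyword_enc / intent_enc are tokenizer outputs for the
        # original pair, the keywords-only pair, and the intents-only pair
        # (the keyword/intent split would come from an external tagger).
        logits_full = self.full_head(self.encode(full_enc))
        logits_kw = self.keyword_head(self.encode(keyword_enc))
        logits_it = self.intent_head(self.encode(intent_enc))
        loss = (F.cross_entropy(logits_full, labels)
                + F.cross_entropy(logits_kw, labels)
                + F.cross_entropy(logits_it, labels))
        # Assumed regularizer: pull the full prediction toward the combined
        # keyword/intent prediction so the auxiliary heads can be dropped
        # at inference without changing the model that is actually served.
        combined = F.softmax(
            (logits_kw.log_softmax(-1) + logits_it.log_softmax(-1)) / 2, dim=-1)
        loss = loss + F.kl_div(logits_full.log_softmax(-1), combined,
                               reduction="batchmean")
        return loss, logits_full
```

Because only full_head and the original input are used at test time, the auxiliary machinery adds training signal without touching inference efficiency, consistent with the abstract's claim.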
Related papers
- Text-Video Retrieval with Global-Local Semantic Consistent Learning [122.15339128463715]
We propose a simple yet effective method, Global-Local Semantic Consistent Learning (GLSCL).
GLSCL capitalizes on latent shared semantics across modalities for text-video retrieval.
Our method achieves performance comparable to the state of the art while being nearly 220 times faster in computational cost.
arXiv Detail & Related papers (2024-05-21T11:59:36Z)
- A General and Flexible Multi-concept Parsing Framework for Multilingual Semantic Matching [60.51839859852572]
We propose to resolve the text into multiple concepts for multilingual semantic matching, liberating the model from reliance on NER models.
We conduct comprehensive experiments on English datasets QQP and MRPC, and Chinese dataset Medical-SM.
arXiv Detail & Related papers (2024-03-05T13:55:16Z)
- Unsupervised extraction of local and global keywords from a single text [0.0]
We propose an unsupervised, corpus-independent method to extract keywords from a single text.
It is based on the spatial distribution of words and the response of this distribution to a random permutation of words.
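The exact statistic is not given in this summary, so here is a rough, self-contained illustration of the idea: score each word by how much more clustered its occurrence positions are than in randomly permuted copies of the same text (the gap statistic and the count threshold are assumptions).

```python
# A toy sketch: keywordness as excess clustering of word positions relative
# to a shuffled baseline (assumed statistic, not the paper's exact one).
import random
from collections import defaultdict

def keyword_scores(words, n_shuffles=20, min_count=5):
    def spread(positions):
        # Normalized standard deviation of inter-occurrence gaps;
        # larger values mean burstier, more clustered occurrences.
        gaps = [b - a for a, b in zip(positions, positions[1:])]
        mean = sum(gaps) / len(gaps)
        var = sum((g - mean) ** 2 for g in gaps) / len(gaps)
        return (var ** 0.5) / mean

    def sigmas(seq):
        pos = defaultdict(list)
        for i, w in enumerate(seq):
            pos[w].append(i)
        return {w: spread(p) for w, p in pos.items() if len(p) >= min_count}

    observed = sigmas(words)
    # "Response to a random permutation": average the statistic over shuffles.
    baseline = defaultdict(float)
    for _ in range(n_shuffles):
        shuffled = words[:]
        random.shuffle(shuffled)
        for w, s in sigmas(shuffled).items():
            baseline[w] += s / n_shuffles
    # Words whose clustering far exceeds the shuffled baseline score high.
    return {w: observed[w] - baseline[w] for w in observed}

text = "the cat sat on the mat the cat ate the cat food near the mat".split()
print(sorted(keyword_scores(text, min_count=3).items(), key=lambda x: -x[1]))
```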
arXiv Detail & Related papers (2023-07-26T07:36:25Z)
- Textual Entailment Recognition with Semantic Features from Empirical Text Representation [60.31047947815282]
A text entails a hypothesis if and only if the truth of the hypothesis follows from the text.
In this paper, we propose a novel approach to identifying the textual entailment relationship between text and hypothesis.
We employ an element-wise Manhattan distance vector as a feature that captures the semantic entailment relationship between the text-hypothesis pair.
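Concretely, that feature is just the element-wise absolute difference between the two sentence embeddings. A minimal sketch, assuming the embeddings come from any off-the-shelf sentence encoder:

```python
# Element-wise Manhattan distance vector |u - v| as the classifier feature;
# the encoder producing u and v is assumed, not specified here.
import torch
import torch.nn as nn

class EntailmentClassifier(nn.Module):
    def __init__(self, dim=768, num_labels=2):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(dim, 256), nn.ReLU(), nn.Linear(256, num_labels))

    def forward(self, text_emb, hyp_emb):
        # Each coordinate of the feature is one term of the Manhattan distance.
        feature = torch.abs(text_emb - hyp_emb)
        return self.mlp(feature)

clf = EntailmentClassifier()
u, v = torch.randn(4, 768), torch.randn(4, 768)
print(clf(u, v).shape)  # torch.Size([4, 2])
```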
arXiv Detail & Related papers (2022-10-18T10:03:51Z)
- Improving Multi-task Generalization Ability for Neural Text Matching via Prompt Learning [54.66399120084227]
Recent state-of-the-art neural text matching models (PLMs) struggle to generalize to different tasks.
We adopt a specialization-generalization training strategy and refer to it as Match-Prompt.
In the specialization stage, descriptions of different matching tasks are mapped to only a few prompt tokens.
In the generalization stage, the text matching model learns essential matching signals by training on diverse matching tasks.
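Implementation details are not given in this summary; one rough way to realize task-specific prompt tokens over a shared encoder is sketched below (the encoder interface, prompt length, and pooling are assumptions):

```python
# A rough sketch, not the paper's architecture: per-task learnable prompt
# embeddings prepended to the token embeddings of a shared encoder.
import torch
import torch.nn as nn

class PromptedMatcher(nn.Module):
    def __init__(self, encoder, hidden_size, num_tasks, prompt_len=4, num_labels=2):
        super().__init__()
        self.encoder = encoder  # e.g. AutoModel.from_pretrained("bert-base-uncased")
        self.prompts = nn.Parameter(torch.randn(num_tasks, prompt_len, hidden_size) * 0.02)
        self.head = nn.Linear(hidden_size, num_labels)

    def forward(self, inputs_embeds, task_id):
        # Specialization: a few prompt tokens stand in for the task description.
        batch = inputs_embeds.size(0)
        prompt = self.prompts[task_id].unsqueeze(0).expand(batch, -1, -1)
        x = torch.cat([prompt, inputs_embeds], dim=1)
        # Generalization: the same encoder is trained on all matching tasks.
        h = self.encoder(inputs_embeds=x).last_hidden_state[:, 0]
        return self.head(h)
```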
arXiv Detail & Related papers (2022-04-06T11:01:08Z)
- EDS-MEMBED: Multi-sense embeddings based on enhanced distributional semantic structures via a graph walk over word senses [0.0]
We leverage the rich semantic structures in WordNet to enhance the quality of multi-sense embeddings.
We derive new distributional semantic similarity measures for M-SE from prior ones.
We report evaluation results on 11 benchmark datasets involving WSD and Word Similarity tasks.
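The summary only names a graph walk over word senses; a toy version over WordNet via NLTK might look like the following (the walk policy and the choice of neighbor relations are assumptions), producing sense sequences that a standard embedding trainer could consume:

```python
# A toy random walk over WordNet's sense graph (assumed walk policy).
# Requires: import nltk; nltk.download("wordnet")
import random
from nltk.corpus import wordnet as wn

def sense_walk(start_lemma, length=8):
    synsets = wn.synsets(start_lemma)
    if not synsets:
        return []
    node = random.choice(synsets)
    walk = [node.name()]  # sense identifiers such as "bank.n.01"
    for _ in range(length - 1):
        # Step to a random hypernym or hyponym of the current sense.
        neighbors = node.hypernyms() + node.hyponyms()
        if not neighbors:
            break
        node = random.choice(neighbors)
        walk.append(node.name())
    return walk

print(sense_walk("bank"))  # e.g. ['bank.n.01', 'slope.n.01', ...]
```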
arXiv Detail & Related papers (2021-02-27T14:36:55Z)
- Match-Ignition: Plugging PageRank into Transformer for Long-form Text Matching [66.71886789848472]
We propose a novel hierarchical noise filtering model, namely Match-Ignition, to tackle the effectiveness and efficiency problems of long-form text matching.
The basic idea is to plug the well-known PageRank algorithm into the Transformer, to identify and filter both sentence and word level noisy information.
Noisy sentences are usually easy to detect because the sentence is the basic unit of a long-form text, so we directly use PageRank to filter such information.
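As a rough stand-in for the model's learned similarities, the sketch below builds a TF-IDF similarity graph over sentences and keeps only the most PageRank-central ones; the actual Match-Ignition wiring inside the Transformer is more involved:

```python
# Sentence-level noise filtering via PageRank over a similarity graph
# (TF-IDF cosine similarity is an assumed proxy for the model's own).
import networkx as nx
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def filter_sentences(sentences, keep=3):
    sim = cosine_similarity(TfidfVectorizer().fit_transform(sentences))
    graph = nx.from_numpy_array(sim)        # weighted sentence graph
    scores = nx.pagerank(graph)             # central sentences score high
    top = sorted(scores, key=scores.get, reverse=True)[:keep]
    return [sentences[i] for i in sorted(top)]  # preserve original order

doc = ["Deep models match long texts.", "PageRank finds central sentences.",
       "Noise hurts matching quality.", "My cat likes tuna.",
       "Filtering noise helps efficiency."]
print(filter_sentences(doc))
```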
arXiv Detail & Related papers (2021-01-16T10:34:03Z)
- Extractive Summarization as Text Matching [123.09816729675838]
This paper creates a paradigm shift with regard to the way we build neural extractive summarization systems.
We formulate the extractive summarization task as a semantic text matching problem.
We have driven the state-of-the-art extractive result on CNN/DailyMail to a new level (44.41 in ROUGE-1).
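The core reformulation can be illustrated independently of the trained model: score candidate extracts against the whole document in a shared semantic space and pick the best-matching one. The sketch below uses TF-IDF vectors as a stand-in for the learned semantic space and brute-force enumeration as a stand-in for the paper's candidate generation:

```python
# Extractive summarization as document-summary matching (assumed proxy:
# TF-IDF cosine similarity instead of a learned matching model).
from itertools import combinations
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def best_extract(sentences, k=2):
    vec = TfidfVectorizer().fit(sentences)
    doc_v = vec.transform([" ".join(sentences)])
    best, best_score = None, -1.0
    for cand in combinations(range(len(sentences)), k):  # candidate extracts
        cand_v = vec.transform([" ".join(sentences[i] for i in cand)])
        score = cosine_similarity(cand_v, doc_v)[0, 0]   # summary-document match
        if score > best_score:
            best, best_score = cand, score
    return [sentences[i] for i in best]

sents = ["The summit produced a climate deal.", "Delegates argued for days.",
         "The deal sets emission targets.", "Lunch was served at noon."]
print(best_extract(sents))
```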
arXiv Detail & Related papers (2020-04-19T08:27:57Z)
- Keyword-Attentive Deep Semantic Matching [1.8416014644193064]
We propose a keyword-attentive approach to improve deep semantic matching.
We first leverage domain tags from a large corpus to generate a domain-enhanced keyword dictionary.
During model training, we propose a new negative sampling approach based on keyword coverage between the input pair.
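A rough sketch of such coverage-based negative sampling, assuming each sentence already has a keyword set (obtaining those sets is the dictionary step above): for each query, pick the non-matching candidate with the highest keyword overlap, yielding hard negatives.

```python
# Hard-negative selection by keyword coverage (assumed scoring function;
# the paper's exact sampling scheme may differ).
def keyword_coverage(query_keywords, cand_keywords):
    if not query_keywords:
        return 0.0
    return len(query_keywords & cand_keywords) / len(query_keywords)

def hard_negative(query_kw, candidates_kw, positive_idx):
    scores = [(keyword_coverage(query_kw, kw), i)
              for i, kw in enumerate(candidates_kw) if i != positive_idx]
    return max(scores)[1]  # highest-coverage non-match = hardest negative

query = {"flu", "symptoms", "fever"}
pool = [{"flu", "vaccine"}, {"flu", "symptoms", "cough"}, {"car", "engine"}]
print(hard_negative(query, pool, positive_idx=0))  # -> 1
```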
arXiv Detail & Related papers (2020-03-11T10:18:32Z)