Match$^2$: A Matching over Matching Model for Similar Question
Identification
- URL: http://arxiv.org/abs/2006.11719v1
- Date: Sun, 21 Jun 2020 05:59:34 GMT
- Title: Match$^2$: A Matching over Matching Model for Similar Question
Identification
- Authors: Zizhen Wang, Yixing Fan, Jiafeng Guo, Liu Yang, Ruqing Zhang, Yanyan
Lan, Xueqi Cheng, Hui Jiang, Xiaozhao Wang
- Abstract summary: Community Question Answering (CQA) has become a primary means for people to acquire knowledge, where people are free to ask questions or submit answers.
Similar question identification becomes a core task in CQA which aims to find a similar question from the archived repository whenever a new question is asked.
It has long been a challenge to properly measure the similarity between two questions due to the inherent variation of natural language, i.e., there could be different ways to ask the same question or different questions sharing similar expressions.
Traditional methods typically take a one-side usage, which leverages the answer as some expanded representation of the corresponding question.
- Score: 74.7142127303489
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Community Question Answering (CQA) has become a primary means for people to
acquire knowledge, where people are free to ask questions or submit answers. To
enhance the efficiency of the service, similar question identification becomes
a core task in CQA which aims to find a similar question from the archived
repository whenever a new question is asked. However, it has long been a
challenge to properly measure the similarity between two questions due to the
inherent variation of natural language, i.e., there could be different ways to
ask the same question or different questions sharing similar expressions. To
alleviate this problem, it is natural to involve the existing answers for the
enrichment of the archived questions. Traditional methods typically take a
one-side usage, which leverages the answer as some expanded representation of
the corresponding question. Unfortunately, this may introduce unexpected noise
into the similarity computation since answers are often long and diverse,
leading to inferior performance. In this work, we propose a two-side usage,
which leverages the answer as a bridge of the two questions. The key idea is
based on our observation that similar questions could be addressed by similar
parts of the answer while different questions may not. In other words, we can
compare the matching patterns of the two questions over the same answer to
measure their similarity. In this way, we propose a novel matching over
matching model, namely Match$^2$, which compares the matching patterns between
two question-answer pairs for similar question identification. Empirical
experiments on two benchmark datasets demonstrate that our model can
significantly outperform previous state-of-the-art methods on the similar
question identification task.
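The two-side idea in the abstract, comparing how two questions each match the same answer, can be illustrated with a toy sketch. This is not the Match$^2$ model: the abstract does not specify the architecture, so the sketch swaps in exact token overlap for the matching step and cosine similarity for the comparison step. The function names (`tokenize`, `matching_pattern`, `pattern_similarity`) and the example texts are hypothetical and chosen only for illustration.

```python
import math


def tokenize(text):
    """Toy tokenizer (assumption): lowercase, split on whitespace, strip basic punctuation."""
    tokens = (t.strip(".,?!").lower() for t in text.split())
    return [t for t in tokens if t]


def matching_pattern(question, answer):
    """Vector over answer tokens: 1.0 where the answer token also occurs in the question."""
    question_tokens = set(tokenize(question))
    return [1.0 if tok in question_tokens else 0.0 for tok in tokenize(answer)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def pattern_similarity(question_a, question_b, answer):
    """Compare the matching patterns of two questions over the same answer."""
    return cosine(matching_pattern(question_a, answer),
                  matching_pattern(question_b, answer))


# Hypothetical example: one archived answer that covers two different topics.
answer = ("Reset the router by holding the power button, "
          "then update the firmware from the admin page.")
q1 = "How do I reset my router?"
q2 = "What is a good way to reset a router?"
q3 = "How do I update firmware?"

# q1 and q2 touch the same part of the answer; q1 and q3 touch different parts.
print(pattern_similarity(q1, q2, answer))
print(pattern_similarity(q1, q3, answer))
```

In this toy setting, the two questions about resetting the router share a matching pattern over the answer, while the firmware question matches a different part of it, which is the intuition the abstract describes. A learned model would presumably replace both the overlap test and the cosine comparison with trained components.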
Related papers
- QUDSELECT: Selective Decoding for Questions Under Discussion Parsing [90.92351108691014]
Question Under Discussion (QUD) is a discourse framework that uses implicit questions to reveal discourse relationships between sentences.
We introduce QUDSELECT, a joint-training framework that selectively decodes the QUD dependency structures considering the QUD criteria.
Our method outperforms the state-of-the-art baseline models by 9% in human evaluation and 4% in automatic evaluation.
(2024-08-02T06:46:08Z)
- Answering Ambiguous Questions with a Database of Questions, Answers, and Revisions [95.92276099234344]
We present a new state-of-the-art for answering ambiguous questions that exploits a database of unambiguous questions generated from Wikipedia.
Our method improves performance by 15% on recall measures and 10% on measures which evaluate disambiguating questions from predicted outputs.
(2023-08-16T20:23:16Z)
- Answering Ambiguous Questions via Iterative Prompting [84.3426020642704]
In open-domain question answering, due to the ambiguity of questions, multiple plausible answers may exist.
One approach is to directly predict all valid answers, but this can struggle with balancing relevance and diversity.
We present AmbigPrompt to address the imperfections of existing approaches to answering ambiguous questions.
(2023-07-08T04:32:17Z)
- Selectively Answering Ambiguous Questions [38.83930394700588]
We find that the most reliable approach to decide when to abstain involves quantifying repetition within sampled model outputs.
Our results suggest that sampling-based confidence scores help calibrate answers to relatively unambiguous questions.
(2023-05-24T01:25:38Z)
- GTM: A Generative Triple-Wise Model for Conversational Question Generation [36.33685095934868]
We propose a generative triple-wise model with hierarchical variations for open-domain conversational question generation (CQG).
Our method significantly improves the quality of questions in terms of fluency, coherence and diversity over competitive baselines.
(2021-06-07T14:07:07Z)
- Learning with Instance Bundles for Reading Comprehension [61.823444215188296]
We introduce new supervision techniques that compare question-answer scores across multiple related instances.
Specifically, we normalize these scores across various neighborhoods of closely contrasting questions and/or answers.
We empirically demonstrate the effectiveness of training with instance bundles on two datasets.
(2021-04-18T06:17:54Z)
- Applying Transfer Learning for Improving Domain-Specific Search Experience Using Query to Question Similarity [0.0]
We discuss a framework for calculating similarities between a given input query and a set of predefined questions to retrieve the question that best matches it.
We have used it for the financial domain, but the framework is generalized for any domain-specific search engine and can be used in other domains as well.
(2021-01-07T03:27:32Z)
- Effective FAQ Retrieval and Question Matching With Unsupervised Knowledge Injection [10.82418428209551]
We propose a contextual language model for retrieving appropriate answers to frequently asked questions.
We also explore capitalizing on domain-specific, topically relevant relations between words in an unsupervised manner.
We evaluate variants of our approach on a publicly-available Chinese FAQ dataset, and further apply and contextualize it to a large-scale question-matching task.
(2020-10-27T05:03:34Z)
- A Wrong Answer or a Wrong Question? An Intricate Relationship between Question Reformulation and Answer Selection in Conversational Question Answering [15.355557454305776]
We show that question rewriting (QR) of the conversational context allows us to shed more light on this phenomenon.
We present the results of this analysis on the TREC CAsT and QuAC (CANARD) datasets.
(2020-10-13T06:29:51Z)
- Crossing Variational Autoencoders for Answer Retrieval [50.17311961755684]
Question-answer alignment and question/answer semantics are two important signals for learning the representations.
We propose to cross variational auto-encoders by generating questions with aligned answers and generating answers with aligned questions.
(2020-05-06T01:59:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.