Feature Engineering in Learning-to-Rank for Community Question Answering Task
- URL: http://arxiv.org/abs/2309.07610v1
- Date: Thu, 14 Sep 2023 11:18:26 GMT
- Title: Feature Engineering in Learning-to-Rank for Community Question Answering Task
- Authors: Nafis Sajid, Md Rashidul Hasan, Muhammad Ibrahim
- Abstract summary: Community question answering (CQA) forums are Internet-based platforms where users ask questions about a topic and other expert users try to provide solutions.
Many CQA forums such as Quora, Stack Overflow, Yahoo! Answers, and Stack Exchange exist with a lot of user-generated data.
These data are leveraged in automated CQA ranking systems where similar questions (and answers) are presented in response to the query of the user.
- Score: 2.5091819952713057
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Community question answering (CQA) forums are Internet-based platforms where
users ask questions about a topic and other expert users try to provide
solutions. Many CQA forums such as Quora, Stack Overflow, Yahoo! Answers, and
Stack Exchange exist with a lot of user-generated data. These data are leveraged
in automated CQA ranking systems where similar questions (and answers) are
presented in response to the query of the user. In this work, we empirically
investigate a few aspects of this domain. Firstly, in addition to traditional
features like TF-IDF, BM25 etc., we introduce a BERT-based feature that
captures the semantic similarity between the question and answer. Secondly,
most of the existing research works have focused on features extracted only
from the question part; features extracted from answers have not been explored
extensively. We combine both types of features in a linear fashion. Thirdly,
using our proposed concepts, we conduct an empirical investigation with
different rank-learning algorithms, some of which have not been used so far in
the CQA domain. On three standard CQA datasets, our proposed framework achieves
state-of-the-art performance. We also analyze the importance of the features we
use in our investigation. This work is expected to guide practitioners in
selecting a better set of features for the CQA retrieval task.
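The abstract mentions lexical features such as BM25 and a linear combination of question-side and answer-side features. The sketch below illustrates that general idea only; it is not the authors' implementation. The function names, the whitespace tokenizer, the BM25 parameters `k1` and `b`, and the mixing weight `alpha` are all assumptions made for illustration, and the BERT-based similarity feature is not shown.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer; a stand-in for real preprocessing."""
    return text.lower().split()


def bm25_score(query, doc, corpus, k1=1.2, b=0.75):
    """Okapi BM25 score of `doc` for `query`, using statistics from `corpus`.

    Assumes `corpus` is a non-empty list of non-empty strings.
    """
    docs = [tokenize(d) for d in corpus]
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    doc_tokens = tokenize(doc)
    tf = Counter(doc_tokens)
    score = 0.0
    for term in set(tokenize(query)):
        df = sum(1 for d in docs if term in d)
        # IDF variant with a +1 inside the log, so it never goes negative.
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        denom = tf[term] + k1 * (1 - b + b * len(doc_tokens) / avg_len)
        score += idf * tf[term] * (k1 + 1) / denom
    return score


def combined_feature(query, question, answer, questions, answers, alpha=0.5):
    """Linear mix of a question-side and an answer-side BM25 feature.

    `alpha` is a hypothetical weight; a rank learner would normally
    learn how to weigh the individual features.
    """
    q_feat = bm25_score(query, question, questions)
    a_feat = bm25_score(query, answer, answers)
    return alpha * q_feat + (1 - alpha) * a_feat


# Toy archive of question-answer pairs (made up for this example).
questions = [
    "how to reverse a list in python",
    "what is the capital of france",
]
answers = [
    "use the reversed function or list slicing",
    "the capital of france is paris",
]
query = "reverse python list"

scores = [
    combined_feature(query, q, a, questions, answers)
    for q, a in zip(questions, answers)
]
# Rank archived pairs by the combined feature, best first.
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

In a learning-to-rank setting, each such value would usually be one entry in a feature vector for a query and candidate pair, with the final ranking produced by the trained model rather than by a fixed weight.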
Related papers
- Best-Answer Prediction in Q&A Sites Using User Information [2.982218441172364]
Community Question Answering (CQA) sites have spread and multiplied significantly in recent years.
One practical way of finding such answers is automatically predicting the best candidate given existing answers and comments.
We address this limitation using a novel method for predicting the best answers using the questioner's background information and other features.
arXiv Detail & Related papers (2022-12-15T02:28:52Z)
- RealTime QA: What's the Answer Right Now? [137.04039209995932]
We introduce REALTIME QA, a dynamic question answering (QA) platform that announces questions and evaluates systems on a regular basis.
We build strong baseline models upon large pretrained language models, including GPT-3 and T5.
GPT-3 tends to return outdated answers when retrieved documents do not provide sufficient information to find an answer.
arXiv Detail & Related papers (2022-07-27T07:26:01Z)
- Multifaceted Improvements for Conversational Open-Domain Question Answering [54.913313912927045]
We propose a framework with Multifaceted Improvements for Conversational open-domain Question Answering (MICQA).
First, the proposed KL-divergence-based regularization leads to better question understanding for retrieval and answer reading.
Second, the added post-ranker module can push more relevant passages to the top positions, where they are selected for the reader under two-aspect constraints.
Third, the well-designed curriculum learning strategy effectively narrows the gap between the golden passage settings of training and inference, and encourages the reader to find the true answer without golden passage assistance.
arXiv Detail & Related papers (2022-04-01T07:54:27Z)
- HeteroQA: Learning towards Question-and-Answering through Multiple Information Sources via Heterogeneous Graph Modeling [50.39787601462344]
Community Question Answering (CQA) is a well-defined task that can be used in many scenarios, such as E-Commerce and online user community for special interests.
Most of the CQA methods only incorporate articles or Wikipedia to extract knowledge and answer the user's question.
We propose a question-aware heterogeneous graph transformer to incorporate the multiple information sources (MIS) in the user community to automatically generate the answer.
arXiv Detail & Related papers (2021-12-27T10:16:43Z)
- PerCQA: Persian Community Question Answering Dataset [2.503043323723241]
Community Question Answering (CQA) forums provide answers for many real-life questions.
We present PerCQA, the first Persian dataset for CQA.
This dataset contains the questions and answers crawled from the most well-known Persian forum.
arXiv Detail & Related papers (2021-12-25T14:06:41Z)
- QAConv: Question Answering on Informative Conversations [85.2923607672282]
We focus on informative conversations including business emails, panel discussions, and work channels.
In total, we collect 34,204 QA pairs, including span-based, free-form, and unanswerable questions.
arXiv Detail & Related papers (2021-05-14T15:53:05Z)
- Attention-based model for predicting question relatedness on Stack Overflow [0.0]
We propose an Attention-based Sentence pair Interaction Model (ASIM) to predict the relatedness between questions on Stack Overflow automatically.
ASIM achieves a significant improvement over the baseline approaches on the Precision, Recall, and Micro-F1 evaluation metrics.
Our model also performs well in the duplicate question detection task of Ask Ubuntu.
arXiv Detail & Related papers (2021-03-19T12:18:03Z)
- Diverse and Non-redundant Answer Set Extraction on Community QA based on DPPs [18.013010857062643]
In community-based question answering platforms, it takes time for a user to get useful information from among many answers.
This paper proposes a new task of selecting a diverse and non-redundant answer set rather than ranking the answers.
arXiv Detail & Related papers (2020-11-18T07:33:03Z)
- Few-Shot Complex Knowledge Base Question Answering via Meta Reinforcement Learning [55.08037694027792]
Complex question-answering (CQA) involves answering complex natural-language questions on a knowledge base (KB).
The conventional neural program induction (NPI) approach exhibits uneven performance when the questions have different types.
This paper proposes a meta-reinforcement learning approach to program induction in CQA to tackle the potential distributional bias in questions.
arXiv Detail & Related papers (2020-10-29T18:34:55Z)
- DoQA -- Accessing Domain-Specific FAQs via Conversational QA [25.37327993590628]
We present DoQA, a dataset with 2,437 dialogues and 10,917 QA pairs.
The dialogues are collected from three Stack Exchange sites using the Wizard of Oz method with crowdsourcing.
arXiv Detail & Related papers (2020-05-04T08:58:54Z)
- Unsupervised Question Decomposition for Question Answering [102.56966847404287]
We propose an algorithm for One-to-N Unsupervised Sequence transduction (ONUS) that learns to map one hard, multi-hop question to many simpler, single-hop sub-questions.
We show large QA improvements on HotpotQA over a strong baseline on the original, out-of-domain, and multi-hop dev sets.
arXiv Detail & Related papers (2020-02-22T19:40:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.