Answer Generation for Questions With Multiple Information Sources in
E-Commerce
- URL: http://arxiv.org/abs/2111.14003v1
- Date: Sat, 27 Nov 2021 23:19:49 GMT
- Authors: Anand A. Rajasekar, Nikesh Garera
- Abstract summary: We propose a novel pipeline (MSQAP) that utilizes the rich information present in the aforementioned sources by separately performing relevancy and ambiguity prediction.
This is the first work in the e-commerce domain that automatically generates natural language answers combining the information present in diverse sources such as specifications, similar questions, and reviews data.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automatic question answering is an important yet challenging task in
E-commerce given the millions of questions posted by users about the product
that they are interested in purchasing. Hence, there is a great demand for
automatic answer generation systems that provide quick responses using related
information about the product. There are three sources of knowledge available
for answering a user-posted query: reviews, duplicate or similar questions,
and specifications. Effectively utilizing these information sources greatly
aids in answering complex questions. However, two main challenges arise in
exploiting these sources: (i) the presence of irrelevant information and (ii)
ambiguity of sentiment in reviews and similar questions. In this work we
propose a novel pipeline (MSQAP) that utilizes the rich information present in
the aforementioned sources by separately performing relevancy and ambiguity
prediction before generating a response.
Experimental results show that our relevancy prediction model (BERT-QA)
outperforms all other variants and has an improvement of 12.36% in F1 score
compared to the BERT-base baseline. Our generation model (T5-QA) outperforms
the baselines in all content preservation metrics such as BLEU, ROUGE and has
an average improvement of 35.02% in ROUGE and 198.75% in BLEU compared to the
highest performing baseline (HSSC-q). Human evaluation of our pipeline shows
that our method yields an overall improvement in accuracy of 30.7% over the
generation model (T5-QA), with our full pipeline-based approach (MSQAP)
providing more accurate answers. To the best of our knowledge, this is the
first work in the e-commerce domain that automatically generates natural
language answers combining the information present in diverse sources such as
specifications, similar questions, and reviews data.
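The pipeline the abstract describes can be sketched as three stages: filter candidate sentences from specifications, similar questions, and reviews by relevancy; drop candidates with ambiguous sentiment; then generate an answer from what remains. The sketch below is a minimal, hypothetical illustration of that control flow only: the paper's actual relevancy model (BERT-QA), ambiguity classifier, and generator (T5-QA) are replaced with simple stand-in functions, and all names are illustrative, not taken from the authors' code.

```python
# Hypothetical sketch of an MSQAP-style pipeline. The scoring functions
# are stand-ins for the paper's trained models (BERT-QA, ambiguity
# classifier, T5-QA); only the three-stage control flow mirrors the paper.

def _tokens(text: str) -> set:
    """Lowercase, punctuation-stripped word set."""
    return {w.strip(".,!?") for w in text.lower().split()}

def relevancy_score(question: str, candidate: str) -> float:
    """Stand-in for BERT-QA: fraction of question words found in candidate."""
    q, c = _tokens(question), _tokens(candidate)
    return len(q & c) / max(len(q), 1)

def is_ambiguous(candidate: str) -> bool:
    """Stand-in ambiguity check: flags candidates mixing opposing sentiments."""
    text = candidate.lower()
    return ("good" in text or "great" in text) and ("bad" in text or "poor" in text)

def msqap_answer(question, specs, similar_qas, reviews, threshold=0.2):
    candidates = list(specs) + list(similar_qas) + list(reviews)
    # Stage 1: relevancy prediction -- drop candidates unrelated to the question.
    relevant = [c for c in candidates if relevancy_score(question, c) >= threshold]
    # Stage 2: ambiguity prediction -- drop candidates with conflicting sentiment.
    unambiguous = [c for c in relevant if not is_ambiguous(c)]
    # Stage 3: generation -- a real system would condition T5-QA on these
    # candidates; here we simply return the highest-scoring one.
    if not unambiguous:
        return "No confident answer found."
    return max(unambiguous, key=lambda c: relevancy_score(question, c))
```

For example, given the question "does this phone have a good battery", a review such as "screen is good but battery is bad" would be filtered out at the ambiguity stage even though it is relevant, which is the behavior the pipeline's second stage is designed to provide.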
Related papers
- The Structure of Financial Equity Research Reports -- Identification of the Most Frequently Asked Questions in Financial Analyst Reports to Automate Equity Research Using Llama 3 and GPT-4 [6.085131799375494]
The study analyzes 72 ERRs sentence-by-sentence, classifying 48.7% of their sentences into 169 unique question archetypes.
We did not predefine the questions but derived them solely from the statements in the ERRs.
The research confirms that the current writing process of ERRs can likely benefit from additional automation, improving quality and efficiency.
arXiv Detail & Related papers (2024-07-04T15:58:02Z) - Prompt-Engineering and Transformer-based Question Generation and
Evaluation [0.0]
This paper aims to find the best method to generate questions from textual data through a transformer model and prompt engineering.
The generated questions were compared against the baseline questions in the SQuAD dataset to evaluate the effectiveness of four different prompts.
arXiv Detail & Related papers (2023-10-29T01:45:30Z) - UNK-VQA: A Dataset and a Probe into the Abstention Ability of Multi-modal Large Models [55.22048505787125]
This paper contributes a comprehensive dataset, called UNK-VQA.
We first augment the existing data via deliberate perturbations on either the image or question.
We then extensively evaluate the zero- and few-shot performance of several emerging multi-modal large models.
arXiv Detail & Related papers (2023-10-17T02:38:09Z) - ExpertQA: Expert-Curated Questions and Attributed Answers [51.68314045809179]
We conduct human evaluation of responses from a few representative systems along various axes of attribution and factuality.
We collect expert-curated questions from 484 participants across 32 fields of study, and then ask the same experts to evaluate generated responses to their own questions.
The output of our analysis is ExpertQA, a high-quality long-form QA dataset with 2177 questions spanning 32 fields, along with verified answers and attributions for claims in the answers.
arXiv Detail & Related papers (2023-09-14T16:54:34Z) - An Empirical Comparison of LM-based Question and Answer Generation
Methods [79.31199020420827]
Question and answer generation (QAG) consists of generating a set of question-answer pairs given a context.
In this paper, we establish baselines with three different QAG methodologies that leverage sequence-to-sequence language model (LM) fine-tuning.
Experiments show that an end-to-end QAG model, which is computationally light at both training and inference times, is generally robust and outperforms other more convoluted approaches.
arXiv Detail & Related papers (2023-05-26T14:59:53Z) - KEPR: Knowledge Enhancement and Plausibility Ranking for Generative
Commonsense Question Answering [11.537283115693432]
We propose a Knowledge Enhancement and Plausibility Ranking approach grounded on the Generate-Then-Rank pipeline architecture.
Specifically, we expand questions in terms of Wiktionary commonsense knowledge of keywords, and reformulate them with normalized patterns.
We develop an ELECTRA-based answer ranking model, where logistic regression is conducted during training with the aim of approximating different levels of plausibility.
arXiv Detail & Related papers (2023-05-15T04:58:37Z) - HeteroQA: Learning towards Question-and-Answering through Multiple
Information Sources via Heterogeneous Graph Modeling [50.39787601462344]
Community Question Answering (CQA) is a well-defined task that can be used in many scenarios, such as E-Commerce and online user community for special interests.
Most of the CQA methods only incorporate articles or Wikipedia to extract knowledge and answer the user's question.
We propose a question-aware heterogeneous graph transformer to incorporate the multiple information sources (MIS) in the user community to automatically generate the answer.
arXiv Detail & Related papers (2021-12-27T10:16:43Z) - Question Answering Survey: Directions, Challenges, Datasets, Evaluation
Matrices [0.0]
The research directions of the QA field are analyzed based on the type of question, answer type, source of evidence-answer, and modeling approach.
This detailed analysis is followed by open challenges of the field, such as automatic question generation, similarity detection, and low resource availability for some languages.
arXiv Detail & Related papers (2021-12-07T08:53:40Z) - Relation-Guided Pre-Training for Open-Domain Question Answering [67.86958978322188]
We propose a Relation-Guided Pre-Training (RGPT-QA) framework to solve complex open-domain questions.
We show that RGPT-QA achieves 2.2%, 2.4%, and 6.3% absolute improvement in Exact Match accuracy on Natural Questions, TriviaQA, and WebQuestions, respectively.
arXiv Detail & Related papers (2021-09-21T17:59:31Z) - Will this Question be Answered? Question Filtering via Answer Model
Distillation for Efficient Question Answering [99.66470885217623]
We propose a novel approach towards improving the efficiency of Question Answering (QA) systems by filtering out questions that will not be answered by them.
This is based on an interesting new finding: the answer confidence scores of state-of-the-art QA systems can be approximated well by models solely using the input question text.
arXiv Detail & Related papers (2021-09-14T23:07:49Z) - Summary-Oriented Question Generation for Informational Queries [23.72999724312676]
We aim to produce self-explanatory questions that focus on main document topics and are answerable with variable length passages as appropriate.
Our model shows SOTA performance of SQ generation on the NQ dataset (20.1 BLEU-4).
We further apply our model to out-of-domain news articles, evaluating with a QA system due to the lack of gold questions, and demonstrate that our model produces better SQs for news articles, with further confirmation via a human evaluation.
arXiv Detail & Related papers (2020-10-19T17:30:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.