Solving Price Per Unit Problem Around the World: Formulating Fact Extraction as Question Answering
- URL: http://arxiv.org/abs/2204.05555v1
- Date: Tue, 12 Apr 2022 06:43:48 GMT
- Title: Solving Price Per Unit Problem Around the World: Formulating Fact Extraction as Question Answering
- Authors: Tarik Arici, Kushal Kumar, Hayreddin Çeker, Anoop S V K K Saladi, Ismail Tutar
- Abstract summary: Price Per Unit (PPU) is essential information for consumers shopping on e-commerce websites when comparing products.
We formulate this problem as a question-answering (QA) task rather than a named entity recognition (NER) task for fact extraction.
Our QA approach outperforms rule-based methods by 34.4% in precision and also outperforms a BERT-based fact extraction approach in all stores globally.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Price Per Unit (PPU) is essential information for consumers shopping
on e-commerce websites when comparing products. Computing PPU requires the
total quantity in a product, which sellers do not always provide. To predict
total quantity, all relevant quantities given in product attributes such as the
title, description, and image need to be inferred correctly. We formulate this
problem as a question-answering (QA) task rather than a named entity
recognition (NER) task for fact extraction. In our QA approach, we first
predict the unit of measure (UoM) type (e.g., volume, weight, or count), which
formulates the desired question (e.g., "What is the total volume?"), and then
use this question to find all the relevant answers. Our model architecture
consists of two subnetworks for the two subtasks: a classifier to predict the
UoM type (or the question) and an extractor to extract the relevant quantities.
We use a deep character-level CNN architecture for both subtasks, which enables
(1) easy expansion to new stores with similar alphabets, (2) multi-span
answering due to its span-image architecture, and (3) easy deployment by
keeping model-inference latency low. Our QA approach outperforms rule-based
methods by 34.4% in precision and also outperforms a BERT-based fact extraction
approach in all stores globally, with the largest precision lift of 10.6% in
the US store.
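The abstract fixes the shape of the pipeline (a UoM-type classifier that selects the question, a multi-span extractor that finds the quantities, and PPU computed as price divided by total quantity) but gives no implementation detail. The following minimal Python sketch only illustrates how those pieces compose: simple keyword and regex rules stand in for the paper's character-level CNN subnetworks, and every name, the keyword table, and the pack-multiplier heuristic are hypothetical illustrations, not the authors' code.

```python
# Sketch of the two-subtask pipeline from the abstract: (1) classify the
# unit-of-measure (UoM) type, which fixes the question, (2) extract all
# relevant quantity spans, (3) compute PPU = price / total quantity.
# Keyword/regex rules stand in for the paper's character-level CNNs.
import re

UOM_KEYWORDS = {
    "volume": ["ml", "l", "fl oz", "liter"],
    "weight": ["mg", "kg", "g", "oz", "lb"],
    "count": ["count", "ct", "pcs", "pack"],
}

def classify_uom(title: str) -> str:
    """Stand-in for the UoM-type classifier subnetwork."""
    text = title.lower()
    for uom, keys in UOM_KEYWORDS.items():
        if any(re.search(rf"\b{re.escape(k)}\b", text) for k in keys):
            return uom  # e.g., "volume" -> "What is the total volume?"
    return "count"

def extract_quantities(title: str, uom: str) -> list[float]:
    """Stand-in for the multi-span extractor: numbers tied to the UoM's units."""
    units = "|".join(re.escape(k) for k in UOM_KEYWORDS[uom])
    pattern = rf"(\d+(?:\.\d+)?)\s*(?:{units})\b"
    return [float(m.group(1)) for m in re.finditer(pattern, title.lower())]

def price_per_unit(price: float, title: str) -> float:
    uom = classify_uom(title)
    total = sum(extract_quantities(title, uom))  # sum quantities across spans
    pack = re.search(r"(\d+)\s*(?:pack|x)\b", title.lower())
    if pack and uom != "count":
        total *= int(pack.group(1))              # "6 pack x 330 ml" -> 1980 ml
    if total == 0:
        raise ValueError("no quantity found in product title")
    return price / total

# Example: a $9.90 six-pack of 330 ml cans -> 9.90 / 1980 = $0.005 per ml
print(round(price_per_unit(9.90, "Sparkling Water, 6 pack x 330 ml"), 4))
```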
Related papers
- Language Models Benefit from Preparation with Elicited Knowledge
The zero-shot chain-of-thought (CoT) approach is often used in question answering (QA) by language models (LMs).
We introduce a simple, general prompting technique, called PREP, that involves using two instances of LMs.
PREP is designed to be general and independent of the user's domain knowledge, making it applicable across various QA tasks without the need for specialized prompt engineering (a minimal sketch of this two-stage flow appears after this list).
arXiv Detail & Related papers (2024-09-02T15:58:27Z)
- Answering Subjective Induction Questions on Products by Summarizing Multi-sources Multi-viewpoints Knowledge
This paper proposes a new task: Answering Subjective Induction Questions on Products.
The answer to this kind of question is non-unique and can be interpreted from many perspectives.
A satisfactory answer should summarize these subjective opinions from multiple sources and provide objective knowledge.
arXiv Detail & Related papers (2023-09-12T03:27:08Z)
- Improving Text Matching in E-Commerce Search with A Rationalizable, Intervenable and Fast Entity-Based Relevance Model
We propose a novel model called the Entity-Based Relevance Model (EBRM).
The decomposition allows us to use a Cross-encoder QE relevance module for high accuracy.
We also show that pretraining the QE module with auto-generated QE data from user logs can further improve the overall performance.
arXiv Detail & Related papers (2023-07-01T15:44:53Z)
- Long-Tailed Question Answering in an Open World
We define Open Long-Tailed QA (OLTQA) as learning from long-tailed distributed data.
We propose an OLTQA model that encourages knowledge sharing between head, tail and unseen tasks.
On a large-scale OLTQA dataset, our model consistently outperforms the state-of-the-art.
arXiv Detail & Related papers (2023-05-11T04:28:58Z)
- UniKGQA: Unified Retrieval and Reasoning for Solving Multi-hop Question Answering Over Knowledge Graph
Multi-hop Question Answering over Knowledge Graph (KGQA) aims to find the answer entities that are multiple hops away from the topic entities mentioned in a natural language question.
We propose UniKGQA, a novel approach for multi-hop KGQA task, by unifying retrieval and reasoning in both model architecture and parameter learning.
arXiv Detail & Related papers (2022-12-02T04:08:09Z)
- Summarizing Community-based Question-Answer Pairs
We propose the novel CQA summarization task that aims to create a concise summary from CQA pairs.
Our data and code are publicly available.
arXiv Detail & Related papers (2022-11-17T21:09:41Z)
- PACIFIC: Towards Proactive Conversational Question Answering over Tabular and Textual Data in Finance
We present a new dataset, named PACIFIC. Compared with existing CQA datasets, PACIFIC exhibits three key features: (i) proactivity, (ii) numerical reasoning, and (iii) hybrid context of tables and text.
A new task is defined accordingly to study Proactive Conversational Question Answering (PCQA), which combines clarification question generation and CQA.
UniPCQA performs multi-task learning over all sub-tasks in PCQA and incorporates a simple ensemble strategy to alleviate the error propagation issue in the multi-task learning by cross-validating top-$k$ sampled Seq2Seq outputs.
arXiv Detail & Related papers (2022-10-17T08:06:56Z)
- Answer Generation for Questions With Multiple Information Sources in E-Commerce
We propose a novel pipeline (MSQAP) that utilizes the rich information present in the aforementioned sources by separately performing relevancy and ambiguity prediction.
This is the first work in the e-commerce domain that automatically generates natural language answers combining the information present in diverse sources such as specifications, similar questions, and reviews data.
arXiv Detail & Related papers (2021-11-27T23:19:49Z)
- Open Question Answering over Tables and Text
In open question answering (QA), the answer to a question is produced by retrieving and then analyzing documents that might contain answers to the question.
Most open QA systems have considered only retrieving information from unstructured text.
We present a new large-scale dataset Open Table-and-Text Question Answering (OTT-QA) to evaluate performance on this task.
arXiv Detail & Related papers (2020-10-20T16:48:14Z)
- Tradeoffs in Sentence Selection Techniques for Open-Domain Question Answering
We describe two groups of models for sentence selection: QA-based approaches, which run a full-fledged QA system to identify answer candidates, and retrieval-based models, which find parts of each passage specifically related to each question.
We show that very lightweight QA models can do well at this task, but retrieval-based models are faster still.
arXiv Detail & Related papers (2020-09-18T23:39:15Z)
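As referenced in the PREP entry above, here is a loose sketch of how a two-stage "elicit knowledge, then answer" flow with two LM instances could look. Nothing below comes from the paper itself: `ask` is a placeholder for any chat-completion call, and both prompts are illustrative assumptions.

```python
# Hypothetical two-LM "prepare, then answer" flow matching the PREP summary.
# `ask` is a stand-in for a real LM API call; prompts are illustrative only.
def ask(prompt: str) -> str:
    # Substitute a real chat-completion call here; this stub echoes for demo.
    return f"[model output for: {prompt[:48]}...]"

def prep_answer(question: str) -> str:
    # Stage 1: a first LM instance elicits background knowledge.
    knowledge = ask(f"List facts useful for answering: {question}")
    # Stage 2: a second LM instance answers, conditioned on the elicited facts.
    return ask(f"Facts:\n{knowledge}\n\nUsing the facts above, answer: {question}")

print(prep_answer("What is the total volume of a 6 pack of 330 ml cans?"))
```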