Simple and Effective Knowledge-Driven Query Expansion for QA-Based
Product Attribute Extraction
- URL: http://arxiv.org/abs/2206.14264v1
- Date: Tue, 28 Jun 2022 19:43:57 GMT
- Authors: Keiji Shinzato, Naoki Yoshinaga, Yandi Xia, Wei-Te Chen
- Abstract summary: A key challenge in attribute value extraction (AVE) from e-commerce sites is how to handle a large number of attributes for diverse products.
We propose knowledge-driven query expansion based on the possible answers (values) of a query (attribute) for QA-based AVE.
- Score: 6.752749933406399
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A key challenge in attribute value extraction (AVE) from e-commerce sites is
how to handle a large number of attributes for diverse products. Although this
challenge is partially addressed by a question answering (QA) approach which
finds a value in product data for a given query (attribute), it does not work
effectively for rare and ambiguous queries. We thus propose simple
knowledge-driven query expansion based on possible answers (values) of a query
(attribute) for QA-based AVE. We retrieve values of a query (attribute) from
the training data to expand the query. We train a model with two tricks,
knowledge dropout and knowledge token mixing, which mimic the imperfection of
the value knowledge in testing. Experimental results on our cleaned version of
the AliExpress dataset show that our method improves the performance of AVE (+6.08
macro F1), especially for rare and ambiguous attributes (+7.82 and +6.86 macro
F1, respectively).
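The query expansion and knowledge dropout described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `[SEP]` separator, the `max_values` cutoff, and the choice to drop each value independently are all assumptions made for illustration. Knowledge token mixing is not shown because the abstract does not describe how it works.

```python
import random
from collections import Counter, defaultdict

SEP = "[SEP]"  # assumed separator between the attribute and its known values


def build_value_knowledge(training_examples):
    """Collect, for each attribute, the values observed in the training data."""
    knowledge = defaultdict(Counter)
    for attribute, value in training_examples:
        knowledge[attribute][value] += 1
    return knowledge


def expand_query(attribute, knowledge, max_values=3, dropout=0.0, rng=None):
    """Append the most frequent known values of an attribute to the query.

    During training, `dropout` > 0 randomly removes values (a hypothetical
    per-value form of knowledge dropout), so the model also sees queries with
    incomplete value knowledge, as it would at test time.
    """
    rng = rng or random
    values = [v for v, _ in knowledge.get(attribute, Counter()).most_common(max_values)]
    kept = [v for v in values if rng.random() >= dropout]
    return f" {SEP} ".join([attribute] + kept)


# Toy (attribute, value) pairs standing in for annotated training data.
examples = [("color", "red"), ("color", "blue"), ("color", "red"), ("material", "cotton")]
knowledge = build_value_knowledge(examples)
print(expand_query("color", knowledge))               # expanded query for a seen attribute
print(expand_query("brand", knowledge))               # unseen attribute: query is unchanged
print(expand_query("color", knowledge, dropout=0.5))  # training-time variant with dropout
```

An attribute with no values in the training data falls back to the bare attribute name, which is the situation knowledge dropout is meant to prepare the model for.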
Related papers
- Large Language Models for Relevance Judgment in Product Search [48.56992980315751]
High relevance of retrieved and re-ranked items to the search query is the cornerstone of successful product search.
We present an array of techniques for leveraging Large Language Models (LLMs) for automating the relevance judgment of query-item pairs (QIPs) at scale.
Our findings have immediate implications for the growing field of relevance judgment automation in product search.
arXiv Detail & Related papers (2024-06-01T00:52:41Z)
- QueryNER: Segmentation of E-commerce Queries [12.563241705572409]
We present a manually-annotated dataset and accompanying model for e-commerce query segmentation.
Our work instead focuses on the goal of dividing a query into meaningful chunks with broadly applicable types.
arXiv Detail & Related papers (2024-05-15T16:58:35Z)
- EIVEN: Efficient Implicit Attribute Value Extraction using Multimodal LLM [52.016009472409166]
EIVEN is a data- and parameter-efficient generative framework for implicit attribute value extraction.
We introduce a novel Learning-by-Comparison technique to reduce model confusion.
Our experiments reveal that EIVEN significantly outperforms existing methods in extracting implicit attribute values.
arXiv Detail & Related papers (2024-04-13T03:15:56Z)
- Enhanced E-Commerce Attribute Extraction: Innovating with Decorative Relation Correction and LLAMA 2.0-Based Annotation [4.81846973621209]
We propose a pioneering framework that integrates BERT for classification, a Conditional Random Fields (CRFs) layer for attribute value extraction, and Large Language Models (LLMs) for data annotation.
Our approach capitalizes on the robust representation learning of BERT, synergized with the sequence decoding prowess of CRFs, to adeptly identify and extract attribute values.
Our methodology is rigorously validated on various datasets, including Walmart, BestBuy's e-commerce NER dataset, and the CoNLL dataset.
arXiv Detail & Related papers (2023-12-09T08:26:30Z)
- Knowledge-Enhanced Multi-Label Few-Shot Product Attribute-Value Extraction [4.511923587827302]
Existing attribute-value extraction models require large quantities of labeled data for training.
New products with new attribute-value pairs enter the market every day in real-world e-Commerce.
We propose a Knowledge-Enhanced Attentive Framework (KEAF) based on networks to learn more discriminative prototypes.
arXiv Detail & Related papers (2023-08-16T14:58:12Z)
- MAVE: A Product Dataset for Multi-source Attribute Value Extraction [10.429320377835241]
We introduce MAVE, a new dataset to better facilitate research on product attribute value extraction.
MAVE is composed of a curated set of 2.2 million products from Amazon pages, with 3 million attribute-value annotations across 1257 unique categories.
We propose a novel approach that effectively extracts the attribute value from the multi-source product information.
arXiv Detail & Related papers (2021-12-16T06:48:31Z)
- QUEACO: Borrowing Treasures from Weakly-labeled Behavior Data for Query Attribute Value Extraction [57.56700153507383]
This paper proposes a unified query attribute value extraction system in e-commerce search named QUEACO.
For the NER phase, QUEACO adopts a novel teacher-student network, where a teacher network that is trained on the strongly-labeled data generates pseudo-labels.
For the AVN phase, we also leverage the weakly-labeled query-to-attribute behavior data to normalize surface form attribute values from queries into canonical forms from products.
arXiv Detail & Related papers (2021-08-19T03:24:23Z)
- AdaTag: Multi-Attribute Value Extraction from Product Profiles with Adaptive Decoding [55.89773725577615]
We present AdaTag, which uses adaptive decoding to handle attribute extraction.
Our experiments on a real-world e-Commerce dataset show marked improvements over previous methods.
arXiv Detail & Related papers (2021-06-04T07:54:11Z)
- Learning Compositional Representation for Few-shot Visual Question Answering [93.4061107793983]
Current Visual Question Answering methods perform well on answers with ample training data but have limited accuracy on novel answers with few examples.
We propose to extract the attributes from the answers with enough data, which are later composed to constrain the learning of the few-shot ones.
Experimental results on the VQA v2.0 validation dataset demonstrate the effectiveness of our proposed attribute network.
arXiv Detail & Related papers (2021-02-21T10:16:24Z)
- Open Question Answering over Tables and Text [55.8412170633547]
In open question answering (QA), the answer to a question is produced by retrieving and then analyzing documents that might contain answers to the question.
Most open QA systems have considered only retrieving information from unstructured text.
We present a new large-scale dataset Open Table-and-Text Question Answering (OTT-QA) to evaluate performance on this task.
arXiv Detail & Related papers (2020-10-20T16:48:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.