Product Attribute Value Extraction using Large Language Models
- URL: http://arxiv.org/abs/2310.12537v2
- Date: Fri, 26 Jan 2024 09:07:59 GMT
- Title: Product Attribute Value Extraction using Large Language Models
- Authors: Alexander Brinkmann, Roee Shraga, Christian Bizer
- Abstract summary: State-of-the-art attribute/value extraction methods based on pre-trained language models (PLMs) face two drawbacks.
We explore the potential of using large language models (LLMs) as a more training data-efficient and more robust alternative to existing attribute/value extraction methods.
- Score: 56.96665345570965
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: E-commerce platforms rely on structured product descriptions in the form of
attribute/value pairs to enable features such as faceted product search and
product comparison. However, vendors on these platforms often provide
unstructured product descriptions consisting of a title and a textual
description. To process such offers, e-commerce platforms must extract
attribute/value pairs from the unstructured descriptions. State-of-the-art
attribute/value extraction methods based on pre-trained language models (PLMs),
such as BERT, face two drawbacks: (i) the methods require significant amounts of
task-specific training data and (ii) the fine-tuned models have problems
generalizing to attribute values that were not part of the training data. We
explore the potential of using large language models (LLMs) as a more training
data-efficient and more robust alternative to existing attribute/value
extraction methods. We propose different prompt templates for instructing LLMs
about the target schema of the extraction, covering both zero-shot and few-shot
scenarios. In the zero-shot scenario, textual and JSON-based approaches for
representing information about the target attributes are compared. In the
scenario with training data, we investigate (i) the provision of example
attribute values, (ii) the selection of in-context demonstrations, (iii)
shuffled ensembling to prevent position bias, and (iv) fine-tuning the LLM. The
prompt templates are evaluated in combination with hosted LLMs, such as GPT-3.5
and GPT-4, and open-source LLMs based on Llama2, which can be run locally. The
best average F1-score of 86% was reached by GPT-4 using an ensemble of shuffled
prompts that combine attribute names, attribute descriptions, example values,
and demonstrations. Given the same amount of training data, this prompt/model
combination outperforms the best PLM baseline by an average of 6% F1.
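The prompt design described in the abstract (a JSON representation of the target attributes, in-context demonstrations, and shuffled ensembling) can be illustrated with a minimal sketch. This is not the authors' template or code: the schema contents, the prompt wording, the choice to shuffle both attribute order and demonstration order, and the majority vote used to merge runs are all assumptions made for illustration. The `llm` argument is a hypothetical callable that takes a prompt string and returns a JSON string; `fake_llm` is a stand-in so the sketch runs without a hosted model.

```python
import json
import random
from collections import Counter
from typing import Callable, Dict, List, Tuple

# Hypothetical target schema: attribute name -> description and example values.
SCHEMA: Dict[str, dict] = {
    "Brand": {"description": "Manufacturer or brand name", "examples": ["Dell", "HP"]},
    "Capacity": {"description": "Storage capacity with unit", "examples": ["500GB", "1TB"]},
}

# Hypothetical in-context demonstrations: (product text, expected extraction).
DEMOS: List[Tuple[str, dict]] = [
    ("HP 500GB portable hard drive", {"Brand": "HP", "Capacity": "500GB"}),
    ("Generic USB cable 2m", {"Brand": None, "Capacity": None}),
]


def build_prompt(schema: Dict[str, dict],
                 demonstrations: List[Tuple[str, dict]],
                 product_text: str) -> str:
    """Assemble one prompt: instruction, JSON schema, demonstrations, then the input."""
    lines = [
        "Extract the attribute values from the product description.",
        "Respond with a JSON object; use null for attributes that are not present.",
        "Target schema: " + json.dumps(schema),
    ]
    for demo_text, demo_values in demonstrations:
        lines.append("Input: " + demo_text)
        lines.append("Output: " + json.dumps(demo_values))
    lines.append("Input: " + product_text)
    lines.append("Output:")
    return "\n".join(lines)


def shuffled_ensemble(llm: Callable[[str], str],
                      schema: Dict[str, dict],
                      demonstrations: List[Tuple[str, dict]],
                      product_text: str,
                      n_runs: int = 3,
                      seed: int = 0) -> dict:
    """Query the model with several shuffled prompts and majority-vote per attribute."""
    rng = random.Random(seed)
    votes = {attr: Counter() for attr in schema}
    for _ in range(n_runs):
        # Assumption: shuffle attribute order and demonstration order in each run.
        keys = list(schema)
        rng.shuffle(keys)
        shuffled_schema = {key: schema[key] for key in keys}
        shuffled_demos = list(demonstrations)
        rng.shuffle(shuffled_demos)
        response = llm(build_prompt(shuffled_schema, shuffled_demos, product_text))
        prediction = json.loads(response)
        for attr in schema:
            votes[attr][prediction.get(attr)] += 1
    # Assumption: the most frequent value per attribute wins.
    return {attr: counter.most_common(1)[0][0] for attr, counter in votes.items()}


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; always returns the same extraction."""
    return json.dumps({"Brand": "Dell", "Capacity": "1TB"})
```

Calling `shuffled_ensemble(fake_llm, SCHEMA, DEMOS, "Dell 1TB external drive")` runs three differently ordered prompts through the stub and merges the answers. With a real model, replacing `fake_llm` with a function that sends the prompt to the model and returns its text response would be the only change needed, provided the model answers with valid JSON.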
Related papers
- Using LLMs for the Extraction and Normalization of Product Attribute Values [47.098255866050835]
This paper explores the potential of using large language models (LLMs) to extract and normalize attribute values from product titles and descriptions.
We introduce the Web Data Commons - Product Attribute Value Extraction (WDC-PAVE) benchmark dataset for our experiments.
arXiv Detail & Related papers (2024-03-04T15:39:59Z)
- JPAVE: A Generation and Classification-based Model for Joint Product Attribute Prediction and Value Extraction [59.94977231327573]
We propose a multi-task learning model with value generation/classification and attribute prediction called JPAVE.
Two variants of our model are designed for open-world and closed-world scenarios.
Experimental results on a public dataset demonstrate the superiority of our model compared with strong baselines.
arXiv Detail & Related papers (2023-11-07T18:36:16Z)
- Entity Matching using Large Language Models [4.189643331553922]
This paper investigates using generative large language models (LLMs) as an alternative to PLM-based matchers that depends less on task-specific training data.
We show that GPT4 can generate structured explanations for matching decisions.
arXiv Detail & Related papers (2023-10-17T13:12:32Z)
- Product Information Extraction using ChatGPT [69.12244027050454]
This paper explores the potential of ChatGPT for extracting attribute/value pairs from product descriptions.
Our results show that ChatGPT achieves a performance similar to a pre-trained language model but requires much smaller amounts of training data and computation for fine-tuning.
arXiv Detail & Related papers (2023-06-23T09:30:01Z)
- A Unified Generative Approach to Product Attribute-Value Identification [6.752749933406399]
We explore a generative approach to the product attribute-value identification (PAVI) task.
We finetune a pre-trained generative model, T5, to decode a set of attribute-value pairs as a target sequence from the given product text.
Experimental results confirm that our generation-based approach outperforms the existing extraction and classification-based methods.
arXiv Detail & Related papers (2023-06-09T00:33:30Z)
- AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotators [98.11286353828525]
GPT-3.5 series models have demonstrated remarkable few-shot and zero-shot ability across various NLP tasks.
We propose AnnoLLM, which adopts a two-step approach, explain-then-annotate.
We build the first conversation-based information retrieval dataset employing AnnoLLM.
arXiv Detail & Related papers (2023-03-29T17:03:21Z)
- Automatic Validation of Textual Attribute Values in E-commerce Catalog by Learning with Limited Labeled Data [61.789797281676606]
We propose a novel meta-learning latent variable approach, called MetaBridge.
It can learn transferable knowledge from a subset of categories with limited labeled data.
It can capture the uncertainty of never-seen categories with unlabeled data.
arXiv Detail & Related papers (2020-06-15T21:31:05Z)