Product Attribute Value Extraction using Large Language Models
- URL: http://arxiv.org/abs/2310.12537v2
- Date: Fri, 26 Jan 2024 09:07:59 GMT
- Title: Product Attribute Value Extraction using Large Language Models
- Authors: Alexander Brinkmann, Roee Shraga, Christian Bizer
- Abstract summary: State-of-the-art attribute/value extraction methods based on pre-trained language models (PLMs) face two drawbacks.
We explore the potential of using large language models (LLMs) as a more training data-efficient and more robust alternative to existing attribute/value extraction methods.
- Score: 56.96665345570965
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: E-commerce platforms rely on structured product descriptions, in the
form of attribute/value pairs, to enable features such as faceted product search and
product comparison. However, vendors on these platforms often provide
unstructured product descriptions consisting of a title and a textual
description. To process such offers, e-commerce platforms must extract
attribute/value pairs from the unstructured descriptions. State-of-the-art
attribute/value extraction methods based on pre-trained language models (PLMs),
such as BERT, face two drawbacks: (i) the methods require significant amounts of
task-specific training data, and (ii) the fine-tuned models struggle to
generalize to attribute values that were not part of the training data. We
explore the potential of using large language models (LLMs) as a more training
data-efficient and more robust alternative to existing attribute/value
extraction methods. We propose different prompt templates for instructing LLMs
about the target schema of the extraction, covering both zero-shot and few-shot
scenarios. In the zero-shot scenario, textual and JSON-based approaches for
representing information about the target attributes are compared. In the
scenario with training data, we investigate (i) the provision of example
attribute values, (ii) the selection of in-context demonstrations, (iii)
shuffled ensembling to prevent position bias, and (iv) fine-tuning the LLM. The
prompt templates are evaluated in combination with hosted LLMs, such as GPT-3.5
and GPT-4, and open-source LLMs based on Llama2, which can be run locally. The
best average F1-score of 86% was reached by GPT-4 using an ensemble of shuffled
prompts that combine attribute names, attribute descriptions, example values,
and demonstrations. Given the same amount of training data, this prompt/model
combination outperforms the best PLM baseline by an average of 6% F1.
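As a concrete illustration of the best-performing setup above, the following is a minimal sketch, assuming a JSON layout for the target schema, invented attribute metadata, and a `call_llm` stub standing in for GPT-4, GPT-3.5, or a local Llama2 endpoint; it is not the authors' exact prompt template. It builds a prompt from attribute names, descriptions, and example values, then ensembles several prompts with shuffled attribute order and takes a majority vote per attribute to counter position bias.

```python
import json
import random
from collections import Counter

# Hypothetical target schema; these attributes, descriptions, and example
# values are invented for illustration.
ATTRIBUTES = {
    "Brand": {"description": "Manufacturer of the product", "examples": ["Nike", "Adidas"]},
    "Color": {"description": "Main color of the product", "examples": ["black", "navy blue"]},
    "Capacity": {"description": "Storage or volume capacity", "examples": ["64 GB", "1.5 l"]},
}

def build_prompt(offer_text: str, attributes: dict) -> str:
    """Represent the target schema as JSON and ask the model to fill it."""
    return (
        "Extract the attribute values from the product offer below. "
        "Return a JSON object with exactly these attributes; use null for absent values.\n"
        f"Target schema: {json.dumps(attributes)}\n"
        f"Offer: {offer_text}"
    )

def call_llm(prompt: str) -> str:
    """Stub standing in for a hosted (GPT-3.5/GPT-4) or local (Llama2) model call."""
    return json.dumps({"Brand": "Nike", "Color": "black", "Capacity": None})

def shuffled_ensemble(offer_text: str, n_prompts: int = 3) -> dict:
    """Query with shuffled attribute orders and majority-vote each attribute."""
    votes = {name: Counter() for name in ATTRIBUTES}
    for _ in range(n_prompts):
        items = list(ATTRIBUTES.items())
        random.shuffle(items)  # vary attribute positions across ensemble members
        answer = json.loads(call_llm(build_prompt(offer_text, dict(items))))
        for name in ATTRIBUTES:
            votes[name][answer.get(name)] += 1
    return {name: c.most_common(1)[0][0] for name, c in votes.items()}

print(shuffled_ensemble("Nike Air Zoom running shoes, black, size 42"))
```

In-context demonstrations would be appended to the same prompt; the zero-shot textual variant compared in the paper replaces the JSON schema with a plain-text description of the target attributes.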
Related papers
- TACLR: A Scalable and Efficient Retrieval-based Method for Industrial Product Attribute Value Identification [19.911923049421137]
We introduce TACLR, the first retrieval-based method for Product Attribute Value Identification (PAVI).
It formulates PAVI as an information retrieval task by encoding product profiles and candidate values into embeddings and retrieving values based on their similarity to the item embedding.
It offers three key advantages: (1) it effectively handles implicit and OOD values while producing normalized outputs; (2) it scales to thousands of categories, tens of thousands of attributes, and millions of values; and (3) it supports efficient inference for high-load industrial scenarios.
arXiv Detail & Related papers (2025-01-07T14:45:30Z) - Self-Refinement Strategies for LLM-based Product Attribute Value Extraction [51.45146101802871]
- Self-Refinement Strategies for LLM-based Product Attribute Value Extraction [51.45146101802871]
This paper investigates applying two self-refinement techniques to the product attribute value extraction task.
The experiments show that both self-refinement techniques fail to significantly improve the extraction performance while substantially increasing processing costs.
For scenarios with development data, fine-tuning yields the highest performance, while the ramp-up costs of fine-tuning are balanced out as the amount of product descriptions increases.
arXiv Detail & Related papers (2025-01-02T12:55:27Z) - Exploring Large Language Models for Product Attribute Value Identification [25.890927969633196]
Product attribute value identification (PAVI) involves automatically identifying attributes and their values from product information.
Existing methods rely on fine-tuning pre-trained language models, such as BART and T5.
This paper explores large language models (LLMs), such as LLaMA and Mistral, as data-efficient and robust alternatives for PAVI.
arXiv Detail & Related papers (2024-09-19T12:09:33Z) - Using LLMs for the Extraction and Normalization of Product Attribute Values [47.098255866050835]
This paper explores the potential of using large language models (LLMs) to extract and normalize attribute values from product titles and descriptions.
We introduce the Web Data Commons - Product Attribute Value Extraction (WDC-PAVE) benchmark dataset for our experiments.
arXiv Detail & Related papers (2024-03-04T15:39:59Z) - JPAVE: A Generation and Classification-based Model for Joint Product
Attribute Prediction and Value Extraction [59.94977231327573]
We propose a multi-task learning model with value generation/classification and attribute prediction called JPAVE.
Two variants of our model are designed for open-world and closed-world scenarios.
Experimental results on a public dataset demonstrate the superiority of our model compared with strong baselines.
arXiv Detail & Related papers (2023-11-07T18:36:16Z) - AE-smnsMLC: Multi-Label Classification with Semantic Matching and
Negative Label Sampling for Product Attribute Value Extraction [42.79022954630978]
Product attribute value extraction plays an important role for many real-world applications in e-Commerce such as product search and recommendation.
Previous methods treat it as a sequence labeling task that requires additional annotations for the positions of values in the product text.
We propose a classification model with semantic matching and negative label sampling for attribute value extraction.
arXiv Detail & Related papers (2023-10-11T02:22:28Z) - Product Information Extraction using ChatGPT [69.12244027050454]
This paper explores the potential of ChatGPT for extracting attribute/value pairs from product descriptions.
Our results show that ChatGPT achieves a performance similar to a pre-trained language model but requires much smaller amounts of training data and computation for fine-tuning.
arXiv Detail & Related papers (2023-06-23T09:30:01Z) - A Unified Generative Approach to Product Attribute-Value Identification [6.752749933406399]
We explore a generative approach to the product attribute-value identification (PAVI) task.
We fine-tune a pre-trained generative model, T5, to decode a set of attribute-value pairs as a target sequence from the given product text.
Experimental results confirm that our generation-based approach outperforms the existing extraction and classification-based methods.
arXiv Detail & Related papers (2023-06-09T00:33:30Z) - AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotators [98.11286353828525]
- AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotators [98.11286353828525]
GPT-3.5 series models have demonstrated remarkable few-shot and zero-shot ability across various NLP tasks.
We propose AnnoLLM, which adopts a two-step, explain-then-annotate approach.
We build the first conversation-based information retrieval dataset employing AnnoLLM.
arXiv Detail & Related papers (2023-03-29T17:03:21Z) - What Makes Good In-Context Examples for GPT-$3$? [101.99751777056314]
- What Makes Good In-Context Examples for GPT-3? [101.99751777056314]
GPT-3 has attracted considerable attention due to its superior performance across a wide range of NLP tasks.
Despite its success, we found that the empirical results of GPT-3 depend heavily on the choice of in-context examples.
In this work, we investigate whether there are more effective strategies for judiciously selecting in-context examples.
arXiv Detail & Related papers (2021-01-17T23:38:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.