Product Attribute Value Extraction using Large Language Models
- URL: http://arxiv.org/abs/2310.12537v2
- Date: Fri, 26 Jan 2024 09:07:59 GMT
- Title: Product Attribute Value Extraction using Large Language Models
- Authors: Alexander Brinkmann, Roee Shraga, Christian Bizer
- Abstract summary: State-of-the-art attribute/value extraction methods based on pre-trained language models (PLMs) face two drawbacks.
We explore the potential of using large language models (LLMs) as a more training data-efficient and more robust alternative to existing attribute/value extraction methods.
- Score: 56.96665345570965
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: E-commerce platforms rely on structured product descriptions, in the
form of attribute/value pairs, to enable features such as faceted product search and
product comparison. However, vendors on these platforms often provide
unstructured product descriptions consisting of a title and a textual
description. To process such offers, e-commerce platforms must extract
attribute/value pairs from the unstructured descriptions. State-of-the-art
attribute/value extraction methods based on pre-trained language models (PLMs),
such as BERT, face two drawbacks: (i) the methods require significant amounts of
task-specific training data, and (ii) the fine-tuned models struggle to
generalize to attribute values that were not part of the training data. We
explore the potential of using large language models (LLMs) as a more training
data-efficient and more robust alternative to existing attribute/value
extraction methods. We propose different prompt templates for instructing LLMs
about the target schema of the extraction, covering both zero-shot and few-shot
scenarios. In the zero-shot scenario, textual and JSON-based approaches for
representing information about the target attributes are compared. In the
scenario with training data, we investigate (i) the provision of example
attribute values, (ii) the selection of in-context demonstrations, (iii)
shuffled ensembling to prevent position bias, and (iv) fine-tuning the LLM. The
prompt templates are evaluated in combination with hosted LLMs, such as GPT-3.5
and GPT-4, and open-source LLMs based on Llama2, which can be run locally. The
best average F1-score of 86% was reached by GPT-4 using an ensemble of shuffled
prompts that combine attribute names, attribute descriptions, example values,
and demonstrations. Given the same amount of training data, this prompt/model
combination outperforms the best PLM baseline by an average of 6% F1.
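As a concrete illustration of the best-performing setup above, the following is a minimal sketch, assuming a JSON layout for the target schema, invented attribute metadata, and a `call_llm` stub standing in for GPT-4, GPT-3.5, or a local Llama2 endpoint; it is not the authors' exact prompt template. It builds a prompt from attribute names, descriptions, and example values, then ensembles several prompts with shuffled attribute order and takes a majority vote per attribute to counter position bias.

```python
import json
import random
from collections import Counter

# Hypothetical target schema; these attributes, descriptions, and example
# values are invented for illustration.
ATTRIBUTES = {
    "Brand": {"description": "Manufacturer of the product", "examples": ["Nike", "Adidas"]},
    "Color": {"description": "Main color of the product", "examples": ["black", "navy blue"]},
    "Capacity": {"description": "Storage or volume capacity", "examples": ["64 GB", "1.5 l"]},
}

def build_prompt(offer_text: str, attributes: dict) -> str:
    """Represent the target schema as JSON and ask the model to fill it."""
    return (
        "Extract the attribute values from the product offer below. "
        "Return a JSON object with exactly these attributes; use null for absent values.\n"
        f"Target schema: {json.dumps(attributes)}\n"
        f"Offer: {offer_text}"
    )

def call_llm(prompt: str) -> str:
    """Stub standing in for a hosted (GPT-3.5/GPT-4) or local (Llama2) model call."""
    return json.dumps({"Brand": "Nike", "Color": "black", "Capacity": None})

def shuffled_ensemble(offer_text: str, n_prompts: int = 3) -> dict:
    """Query with shuffled attribute orders and majority-vote each attribute."""
    votes = {name: Counter() for name in ATTRIBUTES}
    for _ in range(n_prompts):
        items = list(ATTRIBUTES.items())
        random.shuffle(items)  # vary attribute positions across ensemble members
        answer = json.loads(call_llm(build_prompt(offer_text, dict(items))))
        for name in ATTRIBUTES:
            votes[name][answer.get(name)] += 1
    return {name: c.most_common(1)[0][0] for name, c in votes.items()}

print(shuffled_ensemble("Nike Air Zoom running shoes, black, size 42"))
```

In-context demonstrations would be appended to the same prompt; the zero-shot textual variant compared in the paper replaces the JSON schema with a plain-text description of the target attributes.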
Related papers
- TACLR: A Scalable and Efficient Retrieval-based Method for Industrial Product Attribute Value Identification [19.911923049421137]
We introduce TACLR, the first retrieval-based method for Product Attribute Value Identification (PAVI).
It formulates PAVI as an information retrieval task by encoding product profiles and candidate values into embeddings and retrieving values based on their similarity to the item embedding.
It offers three key advantages: (1) it effectively handles implicit and OOD values while producing normalized outputs; (2) it scales to thousands of categories, tens of thousands of attributes, and millions of values; and (3) it supports efficient inference for high-load industrial scenarios.
arXiv Detail & Related papers (2025-01-07T14:45:30Z) - Self-Refinement Strategies for LLM-based Product Attribute Value Extraction [51.45146101802871]
- Self-Refinement Strategies for LLM-based Product Attribute Value Extraction [51.45146101802871]
This paper investigates applying two self-refinement techniques to the product attribute value extraction task.
The experiments show that both self-refinement techniques fail to significantly improve the extraction performance while substantially increasing processing costs.
For scenarios with development data, fine-tuning yields the highest performance, while the ramp-up costs of fine-tuning are balanced out as the amount of product descriptions increases.
arXiv Detail & Related papers (2025-01-02T12:55:27Z) - Exploring Large Language Models for Product Attribute Value Identification [25.890927969633196]
Product attribute value identification (PAVI) involves automatically identifying attributes and their values from product information.
Existing methods rely on fine-tuning pre-trained language models, such as BART and T5.
This paper explores large language models (LLMs), such as LLaMA and Mistral, as data-efficient and robust alternatives for PAVI.
arXiv Detail & Related papers (2024-09-19T12:09:33Z) - Using LLMs for the Extraction and Normalization of Product Attribute Values [47.098255866050835]
This paper explores the potential of using large language models (LLMs) to extract and normalize attribute values from product titles and descriptions.
We introduce the Web Data Commons - Product Attribute Value Extraction (WDC-PAVE) benchmark dataset for our experiments.
arXiv Detail & Related papers (2024-03-04T15:39:59Z) - JPAVE: A Generation and Classification-based Model for Joint Product
Attribute Prediction and Value Extraction [59.94977231327573]
We propose a multi-task learning model with value generation/classification and attribute prediction called JPAVE.
Two variants of our model are designed for open-world and closed-world scenarios.
Experimental results on a public dataset demonstrate the superiority of our model compared with strong baselines.
arXiv Detail & Related papers (2023-11-07T18:36:16Z) - AE-smnsMLC: Multi-Label Classification with Semantic Matching and
Negative Label Sampling for Product Attribute Value Extraction [42.79022954630978]
Product attribute value extraction plays an important role for many real-world applications in e-Commerce such as product search and recommendation.
Previous methods treat it as a sequence labeling task that requires additional annotations for the positions of values in the product text.
We propose a classification model with semantic matching and negative label sampling for attribute value extraction.
arXiv Detail & Related papers (2023-10-11T02:22:28Z) - Product Information Extraction using ChatGPT [69.12244027050454]
This paper explores the potential of ChatGPT for extracting attribute/value pairs from product descriptions.
Our results show that ChatGPT achieves a performance similar to a pre-trained language model but requires much smaller amounts of training data and computation for fine-tuning.
arXiv Detail & Related papers (2023-06-23T09:30:01Z) - A Unified Generative Approach to Product Attribute-Value Identification [6.752749933406399]
We explore a generative approach to the product attribute-value identification (PAVI) task.
We fine-tune a pre-trained generative model, T5, to decode a set of attribute-value pairs as a target sequence from the given product text.
Experimental results confirm that our generation-based approach outperforms the existing extraction and classification-based methods.
arXiv Detail & Related papers (2023-06-09T00:33:30Z) - AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotators [98.11286353828525]
- AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotators [98.11286353828525]
GPT-3.5 series models have demonstrated remarkable few-shot and zero-shot ability across various NLP tasks.
We propose AnnoLLM, which adopts a two-step, explain-then-annotate approach.
We build the first conversation-based information retrieval dataset employing AnnoLLM.
arXiv Detail & Related papers (2023-03-29T17:03:21Z) - What Makes Good In-Context Examples for GPT-$3$? [101.99751777056314]
- What Makes Good In-Context Examples for GPT-3? [101.99751777056314]
GPT-3 has attracted considerable attention due to its superior performance across a wide range of NLP tasks.
Despite its success, we found that the empirical results of GPT-3 depend heavily on the choice of in-context examples.
In this work, we investigate whether there are more effective strategies for judiciously selecting in-context examples.
arXiv Detail & Related papers (2021-01-17T23:38:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.