Related papers: PAE: LLM-based Product Attribute Extraction for E-Commerce Fashion Trends

PAE: LLM-based Product Attribute Extraction for E-Commerce Fashion Trends

URL: http://arxiv.org/abs/2405.17533v1
Date: Mon, 27 May 2024 17:50:25 GMT
Title: PAE: LLM-based Product Attribute Extraction for E-Commerce Fashion Trends
Authors: Apurva Sinha, Ekta Gujral,
Abstract summary: This paper presents PAE, a product attribute extraction algorithm for future trend reports consisting text and images in PDF format. Our contributions are three-fold: (a) We develop PAE, an efficient framework to extract attributes from unstructured data (text and images); (b) We provide catalog matching methodology based on BERT representations to discover the existing attributes using upcoming attribute values; (c) We conduct extensive experiments with several baselines and show that PAE is an effective, flexible and on par or superior (avg 92.5% F1-Score) framework to existing state-of-the-art for attribute value extraction
Score: 0.6445605125467574
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Product attribute extraction is an growing field in e-commerce business, with several applications including product ranking, product recommendation, future assortment planning and improving online shopping customer experiences. Understanding the customer needs is critical part of online business, specifically fashion products. Retailers uses assortment planning to determine the mix of products to offer in each store and channel, stay responsive to market dynamics and to manage inventory and catalogs. The goal is to offer the right styles, in the right sizes and colors, through the right channels. When shoppers find products that meet their needs and desires, they are more likely to return for future purchases, fostering customer loyalty. Product attributes are a key factor in assortment planning. In this paper we present PAE, a product attribute extraction algorithm for future trend reports consisting text and images in PDF format. Most existing methods focus on attribute extraction from titles or product descriptions or utilize visual information from existing product images. Compared to the prior works, our work focuses on attribute extraction from PDF files where upcoming fashion trends are explained. This work proposes a more comprehensive framework that fully utilizes the different modalities for attribute extraction and help retailers to plan the assortment in advance. Our contributions are three-fold: (a) We develop PAE, an efficient framework to extract attributes from unstructured data (text and images); (b) We provide catalog matching methodology based on BERT representations to discover the existing attributes using upcoming attribute values; (c) We conduct extensive experiments with several baselines and show that PAE is an effective, flexible and on par or superior (avg 92.5% F1-Score) framework to existing state-of-the-art for attribute value extraction task.

Related papers

Self-Refinement Strategies for LLM-based Product Attribute Value Extraction [51.45146101802871]
This paper investigates applying two self-refinement techniques to the product attribute value extraction task. The experiments show that both self-refinement techniques fail to significantly improve the extraction performance while substantially increasing processing costs. For scenarios with development data, fine-tuning yields the highest performance, while the ramp-up costs of fine-tuning are balanced out as the amount of product descriptions increases.
arXiv Detail & Related papers (2025-01-02T12:55:27Z)
Enhanced E-Commerce Attribute Extraction: Innovating with Decorative Relation Correction and LLAMA 2.0-Based Annotation [4.81846973621209]
We propose a pioneering framework that integrates BERT for classification, a Conditional Random Fields (CRFs) layer for attribute value extraction, and Large Language Models (LLMs) for data annotation. Our approach capitalizes on the robust representation learning of BERT, synergized with the sequence decoding prowess of CRFs, to adeptly identify and extract attribute values. Our methodology is rigorously validated on various datasets, including Walmart, BestBuy's e-commerce NER dataset, and the CoNLL dataset.
arXiv Detail & Related papers (2023-12-09T08:26:30Z)
ExtractGPT: Exploring the Potential of Large Language Models for Product Attribute Value Extraction [52.14681890859275]
E-commerce platforms require structured product data in the form of attribute-value pairs. BERT-based extraction methods require large amounts of task-specific training data. This paper explores using large language models (LLMs) as a more training-data efficient and robust alternative.
arXiv Detail & Related papers (2023-10-19T07:39:00Z)
MMAPS: End-to-End Multi-Grained Multi-Modal Attribute-Aware Product Summarization [93.5217515566437]
Multi-modal Product Summarization (MPS) aims to increase customers' desire to purchase by highlighting product characteristics. Existing MPS methods can produce promising results, but they still lack end-to-end product summarization. We propose an end-to-end multi-modal attribute-aware product summarization method (MMAPS) for generating high-quality product summaries in e-commerce.
arXiv Detail & Related papers (2023-08-22T11:00:09Z)
Multi-task Item-attribute Graph Pre-training for Strict Cold-start Item Recommendation [71.5871100348448]
ColdGPT models item-attribute correlations into an item-attribute graph by extracting fine-grained attributes from item contents. ColdGPT transfers knowledge into the item-attribute graph from various available data sources, i.e., item contents, historical purchase sequences, and review texts of the existing items. Extensive experiments show that ColdGPT consistently outperforms the existing SCS recommenders by large margins.
arXiv Detail & Related papers (2023-06-26T07:04:47Z)
Product Information Extraction using ChatGPT [69.12244027050454]
This paper explores the potential of ChatGPT for extracting attribute/value pairs from product descriptions. Our results show that ChatGPT achieves a performance similar to a pre-trained language model but requires much smaller amounts of training data and computation for fine-tuning.
arXiv Detail & Related papers (2023-06-23T09:30:01Z)
Visually Similar Products Retrieval for Shopsy [0.0]
We design a visual search system for reseller commerce using a multi-task learning approach. Our model consists of three different tasks: attribute classification, triplet ranking and variational autoencoder (VAE)
arXiv Detail & Related papers (2022-10-10T10:59:18Z)
OA-Mine: Open-World Attribute Mining for E-Commerce Products with Weak Supervision [93.26737878221073]
We study the attribute mining problem in an open-world setting to extract novel attributes and their values. We propose a principled framework that first generates attribute value candidates and then groups them into clusters of attributes. Our model significantly outperforms strong baselines and can generalize to unseen attributes and product types.
arXiv Detail & Related papers (2022-04-29T04:16:04Z)
PAM: Understanding Product Images in Cross Product Category Attribute Extraction [40.332066960433245]
This work proposes a more inclusive framework that fully utilizes different modalities for attribute extraction. Inspired by recent works in visual question answering, we use a transformer based sequence to sequence model to fuse representations of product text, Optical Character Recognition (OCR) tokens and visual objects detected in the product image. The framework is further extended with the capability to extract attribute value across multiple product categories with a single model.
arXiv Detail & Related papers (2021-06-08T18:30:17Z)
Multimodal Joint Attribute Prediction and Value Extraction for E-commerce Product [40.46223408546036]
Product attribute values are essential in many e-commerce scenarios, such as customer service robots, product recommendations, and product retrieval. While in the real world, the attribute values of a product are usually incomplete and vary over time, which greatly hinders the practical applications. We propose a multimodal method to jointly predict product attributes and extract values from textual product descriptions with the help of the product images.
arXiv Detail & Related papers (2020-09-15T15:10:51Z)
Automatic Validation of Textual Attribute Values in E-commerce Catalog by Learning with Limited Labeled Data [61.789797281676606]
We propose a novel meta-learning latent variable approach, called MetaBridge. It can learn transferable knowledge from a subset of categories with limited labeled data. It can capture the uncertainty of never-seen categories with unlabeled data.
arXiv Detail & Related papers (2020-06-15T21:31:05Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.