PAM: Understanding Product Images in Cross Product Category Attribute Extraction
- URL: http://arxiv.org/abs/2106.04630v1
- Date: Tue, 8 Jun 2021 18:30:17 GMT
- Title: PAM: Understanding Product Images in Cross Product Category Attribute
Extraction
- Authors: Rongmei Lin, Xiang He, Jie Feng, Nasser Zalmout, Yan Liang, Li Xiong,
Xin Luna Dong
- Abstract summary: This work proposes a more inclusive framework that fully utilizes different modalities for attribute extraction.
Inspired by recent works in visual question answering, we use a transformer-based sequence-to-sequence model to fuse representations of product text, Optical Character Recognition (OCR) tokens and visual objects detected in the product image.
The framework is further extended with the capability to extract attribute values across multiple product categories with a single model.
- Score: 40.332066960433245
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding product attributes plays an important role in improving the online
shopping experience for customers and is an integral part of
constructing a product knowledge graph. Most existing methods focus on
attribute extraction from text description or utilize visual information from
product images such as shape and color. Compared to the inputs considered in
prior works, a product image in fact contains more information, represented by
a rich mixture of words and visual clues with a layout carefully designed to
impress customers. This work proposes a more inclusive framework that fully
utilizes these different modalities for attribute extraction. Inspired by
recent works in visual question answering, we use a transformer-based
sequence-to-sequence model to fuse representations of product text, Optical Character
Recognition (OCR) tokens and visual objects detected in the product image. The
framework is further extended with the capability to extract attribute values
across multiple product categories with a single model, by training the decoder
to predict both product category and attribute value and conditioning its
output on product category. The model provides a unified attribute extraction
solution desirable at an e-commerce platform that offers numerous product
categories with a diverse body of product attributes. We evaluated the model on
two product attributes, one with many possible values and one with a small set
of possible values, over 14 product categories and found that the model could
achieve a 15% gain in recall and a 10% gain in F1 score compared to
existing methods using text-only features.
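The abstract describes two ideas that a small example may make more concrete: feeding product text, OCR tokens and detected visual objects into one model as a single sequence, and having the decoder predict the product category first so that the attribute value is conditioned on it. The following is only a toy sketch under stated assumptions, not the authors' implementation. The names `Token`, `build_input_sequence`, `decode` and `VALUES_BY_CATEGORY`, the example categories and values, and all scores are hypothetical, and the learned transformer is replaced by hand-written scores with a simple vocabulary filter standing in for category conditioning.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    text: str
    modality: str  # "text", "ocr", or "object"


def build_input_sequence(text_words, ocr_words, object_labels):
    """Tag each token with its modality and concatenate into one sequence."""
    return (
        [Token(w, "text") for w in text_words]
        + [Token(w, "ocr") for w in ocr_words]
        + [Token(w, "object") for w in object_labels]
    )


def decode(category_scores, value_scores, values_by_category):
    """Pick a category first, then the best value allowed for that category."""
    category = max(category_scores, key=category_scores.get)
    allowed = values_by_category.get(category, set())
    candidates = {v: s for v, s in value_scores.items() if v in allowed}
    value = max(candidates, key=candidates.get) if candidates else None
    return category, value


# Hypothetical per-category value vocabularies (illustrative only).
VALUES_BY_CATEGORY = {
    "shampoo": {"liquid", "gel"},
    "vitamin": {"tablet", "capsule"},
}

# Hypothetical inputs: title words, OCR words read off the image, object labels.
sequence = build_input_sequence(
    ["gentle", "shampoo"], ["250ml", "liquid"], ["bottle"]
)

# Hypothetical scores that a trained decoder might assign.
category_scores = {"shampoo": 0.7, "vitamin": 0.3}
value_scores = {"tablet": 0.5, "liquid": 0.4, "gel": 0.1}

category, value = decode(category_scores, value_scores, VALUES_BY_CATEGORY)
print(len(sequence), category, value)
```

In this toy run the highest-scoring value overall is "tablet", but conditioning on the predicted category "shampoo" selects "liquid" instead. In the paper this conditioning is learned by the decoder; the hard filter here is a simplification chosen only to make the effect visible.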
Related papers
- PAE: LLM-based Product Attribute Extraction for E-Commerce Fashion Trends [0.6445605125467574]
This paper presents PAE, a product attribute extraction algorithm for future trend reports consisting of text and images in PDF format.
Our contributions are three-fold: (a) We develop PAE, an efficient framework to extract attributes from unstructured data (text and images); (b) We provide a catalog matching methodology based on BERT representations to discover the existing attributes using upcoming attribute values; (c) We conduct extensive experiments with several baselines and show that PAE is an effective, flexible framework that is on par with or superior to the existing state of the art for attribute value extraction (average F1 score of 92.5%).
arXiv Detail & Related papers (2024-05-27T17:50:25Z)
- MMAPS: End-to-End Multi-Grained Multi-Modal Attribute-Aware Product Summarization [93.5217515566437]
Multi-modal Product Summarization (MPS) aims to increase customers' desire to purchase by highlighting product characteristics.
Existing MPS methods can produce promising results, but they still lack end-to-end product summarization.
We propose an end-to-end multi-modal attribute-aware product summarization method (MMAPS) for generating high-quality product summaries in e-commerce.
arXiv Detail & Related papers (2023-08-22T11:00:09Z)
- Product Information Extraction using ChatGPT [69.12244027050454]
This paper explores the potential of ChatGPT for extracting attribute/value pairs from product descriptions.
Our results show that ChatGPT achieves a performance similar to a pre-trained language model but requires much smaller amounts of training data and computation for fine-tuning.
arXiv Detail & Related papers (2023-06-23T09:30:01Z)
- Large Scale Generative Multimodal Attribute Extraction for E-commerce Attributes [23.105116746332506]
E-commerce websites (e.g. Amazon) have a plethora of structured and unstructured information (text and images) present on the product pages.
Sellers often either don't label or mislabel values of the attributes (e.g. color, size etc.) for their products.
We present a scalable solution for this problem using MXT, consisting of three key components.
arXiv Detail & Related papers (2023-06-01T06:21:45Z)
- Unified Vision-Language Representation Modeling for E-Commerce Same-Style Products Retrieval [12.588713044749177]
Same-style products retrieval plays an important role in e-commerce platforms.
We propose a unified vision-language modeling method for e-commerce same-style products retrieval.
It is capable of cross-modal product-to-product retrieval, as well as style transfer and user-interactive search.
arXiv Detail & Related papers (2023-02-10T07:24:23Z)
- OA-Mine: Open-World Attribute Mining for E-Commerce Products with Weak Supervision [93.26737878221073]
We study the attribute mining problem in an open-world setting to extract novel attributes and their values.
We propose a principled framework that first generates attribute value candidates and then groups them into clusters of attributes.
Our model significantly outperforms strong baselines and can generalize to unseen attributes and product types.
arXiv Detail & Related papers (2022-04-29T04:16:04Z)
- AdaTag: Multi-Attribute Value Extraction from Product Profiles with Adaptive Decoding [55.89773725577615]
We present AdaTag, which uses adaptive decoding to handle attribute extraction.
Our experiments on a real-world e-Commerce dataset show marked improvements over previous methods.
arXiv Detail & Related papers (2021-06-04T07:54:11Z)
- Multimodal Joint Attribute Prediction and Value Extraction for E-commerce Product [40.46223408546036]
Product attribute values are essential in many e-commerce scenarios, such as customer service robots, product recommendations, and product retrieval.
In the real world, however, the attribute values of a product are usually incomplete and vary over time, which greatly hinders practical applications.
We propose a multimodal method to jointly predict product attributes and extract values from textual product descriptions with the help of the product images.
arXiv Detail & Related papers (2020-09-15T15:10:51Z)
- Automatic Validation of Textual Attribute Values in E-commerce Catalog by Learning with Limited Labeled Data [61.789797281676606]
We propose a novel meta-learning latent variable approach, called MetaBridge.
It can learn transferable knowledge from a subset of categories with limited labeled data.
It can capture the uncertainty of never-seen categories with unlabeled data.
arXiv Detail & Related papers (2020-06-15T21:31:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.