IPL: Leveraging Multimodal Large Language Models for Intelligent Product Listing
- URL: http://arxiv.org/abs/2410.16977v1
- Date: Tue, 22 Oct 2024 12:56:04 GMT
- Title: IPL: Leveraging Multimodal Large Language Models for Intelligent Product Listing
- Authors: Kang Chen, Qingheng Zhang, Chengbao Lian, Yixin Ji, Xuwei Liu, Shuguang Han, Guoqiang Wu, Fei Huang, Jufeng Chen
- Abstract summary: We develop IPL, an Intelligent Product Listing tool tailored to generate descriptions using various product attributes.
IPL has been successfully deployed in our production system, where 72% of users publish their product listings based on the generated content.
Those product listings are shown to have a quality score 5.6% higher than those without AI assistance.
- Score: 26.930588075458008
- License:
- Abstract: Unlike professional Business-to-Consumer (B2C) e-commerce platforms (e.g., Amazon), Consumer-to-Consumer (C2C) platforms (e.g., Facebook marketplace) are mainly targeting individual sellers who usually lack sufficient experience in e-commerce. Individual sellers often struggle to compose proper descriptions for selling products. With the recent advancement of Multimodal Large Language Models (MLLMs), we attempt to integrate such state-of-the-art generative AI technologies into the product listing process. To this end, we develop IPL, an Intelligent Product Listing tool tailored to generate descriptions using various product attributes such as category, brand, color, condition, etc. IPL enables users to compose product descriptions by merely uploading photos of the selling product. More importantly, it can imitate the content style of our C2C platform Xianyu. This is achieved by employing domain-specific instruction tuning on MLLMs and adopting the multi-modal Retrieval-Augmented Generation (RAG) process. A comprehensive empirical evaluation demonstrates that the underlying model of IPL significantly outperforms the base model in domain-specific tasks while producing less hallucination. IPL has been successfully deployed in our production system, where 72% of users have their published product listings based on the generated content, and those product listings are shown to have a quality score 5.6% higher than those without AI assistance.
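The abstract describes a multi-modal RAG step: the uploaded product photo is embedded, stylistically similar listings are retrieved from the platform's corpus, and these examples are combined with the product attributes into a prompt for the instruction-tuned MLLM. A minimal sketch of that retrieve-then-prompt flow is below; the toy embedding, corpus, and function names are illustrative assumptions, since IPL's actual encoders and prompt templates are not public.

```python
import math

def toy_embed(text):
    """Stand-in for an image/text encoder: maps text to a small unit vector."""
    vec = [0.0] * 8
    for i, byte in enumerate(text.encode("utf-8")):
        vec[i % 8] += byte
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a, b):
    """Cosine similarity of two unit vectors is just their dot product."""
    return sum(x * y for x, y in zip(a, b))

def retrieve(query, corpus, k=2):
    """Return the k corpus descriptions most similar to the query."""
    q = toy_embed(query)
    ranked = sorted(corpus, key=lambda d: cosine(q, toy_embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(attributes, photo_caption, corpus):
    """Compose an MLLM prompt from product attributes plus retrieved style examples."""
    examples = retrieve(photo_caption, corpus)
    lines = ["Write a C2C product description in the platform's style."]
    lines.append("Attributes: " + ", ".join(f"{k}={v}" for k, v in attributes.items()))
    lines.append("Style examples:")
    lines += [f"- {e}" for e in examples]
    return "\n".join(lines)

# Tiny in-memory stand-in for the platform's listing corpus.
corpus = [
    "Barely used iPhone 12, no scratches, ships fast!",
    "Vintage denim jacket, cozy and stylish, pet-free home.",
    "Gaming laptop, runs everything smoothly, charger included.",
]
prompt = build_prompt(
    {"category": "Phones", "brand": "Apple", "condition": "used"},
    "photo of a used Apple iPhone",
    corpus,
)
print(prompt)
```

In the deployed system the query would be a photo embedding from a vision encoder and the prompt would be sent to the instruction-tuned MLLM; only the retrieve-then-prompt structure is sketched here.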
Related papers
- Auditing the Grid-Based Placement of Private Label Products on E-commerce Search Result Pages [7.351845767369621]
We quantify the extent of private label (PL) product promotion on e-commerce search results for the two largest e-commerce platforms operating in India -- Amazon.in and Flipkart.
Both platforms use different strategies to promote their PL products, such as placing more PLs on the advertised positions.
We find that these product placement strategies of both platforms conform with existing user attention strategies proposed in the literature.
arXiv Detail & Related papers (2024-07-19T20:01:30Z) - MIND: Multimodal Shopping Intention Distillation from Large Vision-language Models for E-commerce Purchase Understanding [67.26334044239161]
MIND is a framework that infers purchase intentions from multimodal product metadata and prioritizes human-centric ones.
Using Amazon Review data, we create a multimodal intention knowledge base, which contains 1,264,441 intentions.
Our obtained intentions significantly enhance large language models in two intention comprehension tasks.
arXiv Detail & Related papers (2024-06-15T17:56:09Z) - A Multimodal In-Context Tuning Approach for E-Commerce Product Description Generation [47.70824723223262]
We propose a new setting for generating product descriptions from images, augmented by marketing keywords.
We present a simple and effective Multimodal In-Context Tuning approach, named ModICT, which introduces a similar product sample as the reference.
Experiments demonstrate that ModICT significantly improves the accuracy (by up to 3.3% on Rouge-L) and diversity (by up to 9.4% on D-5) of generated results compared to conventional methods.
arXiv Detail & Related papers (2024-02-21T07:38:29Z) - Leveraging Large Language Models for Enhanced Product Descriptions in eCommerce [6.318353155416729]
This paper introduces a novel methodology for automating product description generation using the LLAMA 2.0 7B language model.
We train the model on a dataset of authentic product descriptions from Walmart, one of the largest eCommerce platforms.
Our findings reveal that the system is not only scalable but also significantly reduces the human workload involved in creating product descriptions.
arXiv Detail & Related papers (2023-10-24T00:55:14Z) - MMAPS: End-to-End Multi-Grained Multi-Modal Attribute-Aware Product Summarization [93.5217515566437]
Multi-modal Product Summarization (MPS) aims to increase customers' desire to purchase by highlighting product characteristics.
Existing MPS methods can produce promising results, but they still lack end-to-end product summarization.
We propose an end-to-end multi-modal attribute-aware product summarization method (MMAPS) for generating high-quality product summaries in e-commerce.
arXiv Detail & Related papers (2023-08-22T11:00:09Z) - Automatic Controllable Product Copywriting for E-Commerce [58.97059802658354]
We deploy an E-commerce Prefix-based Controllable Copywriting Generation (EPCCG) system into the JD.com e-commerce recommendation platform.
We conduct experiments to validate the effectiveness of the proposed EPCCG.
We also describe the deployed architecture that integrates EPCCG into the real-time JD.com e-commerce recommendation platform.
arXiv Detail & Related papers (2022-06-21T04:18:52Z) - Scenario-based Multi-product Advertising Copywriting Generation for E-Commerce [46.29638014067242]
We propose an automatic Scenario-based Multi-product Advertising Copywriting Generation system (SMPACG) for E-Commerce.
The SMPACG has been developed to directly serve our e-commerce recommendation system, and is also used as a real-time writing assistant tool for merchants.
arXiv Detail & Related papers (2022-05-21T07:45:53Z) - Product1M: Towards Weakly Supervised Instance-Level Product Retrieval via Cross-modal Pretraining [108.86502855439774]
We investigate a more realistic setting that aims to perform weakly-supervised multi-modal instance-level product retrieval.
We contribute Product1M, one of the largest multi-modal cosmetic datasets for real-world instance-level retrieval.
We propose a novel model named Cross-modal contrAstive Product Transformer for instance-level prodUct REtrieval (CAPTURE).
arXiv Detail & Related papers (2021-07-30T12:11:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.