IPL: Leveraging Multimodal Large Language Models for Intelligent Product Listing
- URL: http://arxiv.org/abs/2410.16977v1
- Date: Tue, 22 Oct 2024 12:56:04 GMT
- Title: IPL: Leveraging Multimodal Large Language Models for Intelligent Product Listing
- Authors: Kang Chen, Qingheng Zhang, Chengbao Lian, Yixin Ji, Xuwei Liu, Shuguang Han, Guoqiang Wu, Fei Huang, Jufeng Chen
- Abstract summary: We develop IPL, an Intelligent Product Listing tool tailored to generate descriptions using various product attributes.
IPL has been successfully deployed in our production system, where 72% of users publish their product listings based on the generated content.
Those product listings are shown to have a quality score 5.6% higher than those without AI assistance.
- Score: 26.930588075458008
- License:
- Abstract: Unlike professional Business-to-Consumer (B2C) e-commerce platforms (e.g., Amazon), Consumer-to-Consumer (C2C) platforms (e.g., Facebook marketplace) are mainly targeting individual sellers who usually lack sufficient experience in e-commerce. Individual sellers often struggle to compose proper descriptions for selling products. With the recent advancement of Multimodal Large Language Models (MLLMs), we attempt to integrate such state-of-the-art generative AI technologies into the product listing process. To this end, we develop IPL, an Intelligent Product Listing tool tailored to generate descriptions using various product attributes such as category, brand, color, condition, etc. IPL enables users to compose product descriptions by merely uploading photos of the selling product. More importantly, it can imitate the content style of our C2C platform Xianyu. This is achieved by employing domain-specific instruction tuning on MLLMs and adopting the multi-modal Retrieval-Augmented Generation (RAG) process. A comprehensive empirical evaluation demonstrates that the underlying model of IPL significantly outperforms the base model in domain-specific tasks while producing less hallucination. IPL has been successfully deployed in our production system, where 72% of users have their published product listings based on the generated content, and those product listings are shown to have a quality score 5.6% higher than those without AI assistance.
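The abstract describes a multi-modal RAG step: the uploaded product photo is embedded, stylistically similar listings are retrieved from the platform's corpus, and these examples are combined with the product attributes into a prompt for the instruction-tuned MLLM. A minimal sketch of that retrieve-then-prompt flow is below; the toy embedding, corpus, and function names are illustrative assumptions, since IPL's actual encoders and prompt templates are not public.

```python
import math

def toy_embed(text):
    """Stand-in for an image/text encoder: maps text to a small unit vector."""
    vec = [0.0] * 8
    for i, byte in enumerate(text.encode("utf-8")):
        vec[i % 8] += byte
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a, b):
    """Cosine similarity of two unit vectors is just their dot product."""
    return sum(x * y for x, y in zip(a, b))

def retrieve(query, corpus, k=2):
    """Return the k corpus descriptions most similar to the query."""
    q = toy_embed(query)
    ranked = sorted(corpus, key=lambda d: cosine(q, toy_embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(attributes, photo_caption, corpus):
    """Compose an MLLM prompt from product attributes plus retrieved style examples."""
    examples = retrieve(photo_caption, corpus)
    lines = ["Write a C2C product description in the platform's style."]
    lines.append("Attributes: " + ", ".join(f"{k}={v}" for k, v in attributes.items()))
    lines.append("Style examples:")
    lines += [f"- {e}" for e in examples]
    return "\n".join(lines)

# Tiny in-memory stand-in for the platform's listing corpus.
corpus = [
    "Barely used iPhone 12, no scratches, ships fast!",
    "Vintage denim jacket, cozy and stylish, pet-free home.",
    "Gaming laptop, runs everything smoothly, charger included.",
]
prompt = build_prompt(
    {"category": "Phones", "brand": "Apple", "condition": "used"},
    "photo of a used Apple iPhone",
    corpus,
)
print(prompt)
```

In the deployed system the query would be a photo embedding from a vision encoder and the prompt would be sent to the instruction-tuned MLLM; only the retrieve-then-prompt structure is sketched here.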
Related papers
- Auditing the Grid-Based Placement of Private Label Products on E-commerce Search Result Pages [7.351845767369621]
We quantify the extent of private label (PL) product promotion on e-commerce search results for the two largest e-commerce platforms operating in India -- Amazon.in and Flipkart.
Both platforms use different strategies to promote their PL products, such as placing more PLs on the advertised positions.
We find that these product placement strategies of both platforms conform with existing user attention strategies proposed in the literature.
arXiv Detail & Related papers (2024-07-19T20:01:30Z) - MIND: Multimodal Shopping Intention Distillation from Large Vision-language Models for E-commerce Purchase Understanding [67.26334044239161]
MIND is a framework that infers purchase intentions from multimodal product metadata and prioritizes human-centric ones.
Using Amazon Review data, we create a multimodal intention knowledge base, which contains 1,264,441 intentions.
Our obtained intentions significantly enhance large language models in two intention comprehension tasks.
arXiv Detail & Related papers (2024-06-15T17:56:09Z) - A Multimodal In-Context Tuning Approach for E-Commerce Product Description Generation [47.70824723223262]
We propose a new setting for generating product descriptions from images, augmented by marketing keywords.
We present a simple and effective Multimodal In-Context Tuning approach, named ModICT, which introduces a similar product sample as the reference.
Experiments demonstrate that ModICT significantly improves the accuracy (by up to 3.3% on Rouge-L) and diversity (by up to 9.4% on D-5) of generated results compared to conventional methods.
arXiv Detail & Related papers (2024-02-21T07:38:29Z) - Leveraging Large Language Models for Enhanced Product Descriptions in eCommerce [6.318353155416729]
This paper introduces a novel methodology for automating product description generation using the LLAMA 2.0 7B language model.
We train the model on a dataset of authentic product descriptions from Walmart, one of the largest eCommerce platforms.
Our findings reveal that the system is not only scalable but also significantly reduces the human workload involved in creating product descriptions.
arXiv Detail & Related papers (2023-10-24T00:55:14Z) - MMAPS: End-to-End Multi-Grained Multi-Modal Attribute-Aware Product Summarization [93.5217515566437]
Multi-modal Product Summarization (MPS) aims to increase customers' desire to purchase by highlighting product characteristics.
Existing MPS methods can produce promising results, but they still lack end-to-end product summarization.
We propose an end-to-end multi-modal attribute-aware product summarization method (MMAPS) for generating high-quality product summaries in e-commerce.
arXiv Detail & Related papers (2023-08-22T11:00:09Z) - Automatic Controllable Product Copywriting for E-Commerce [58.97059802658354]
We deploy an E-commerce Prefix-based Controllable Copywriting Generation (EPCCG) system into the JD.com e-commerce recommendation platform.
We conduct experiments to validate the effectiveness of the proposed EPCCG.
We also describe the deployed architecture that integrates EPCCG into the real-time JD.com e-commerce recommendation platform.
arXiv Detail & Related papers (2022-06-21T04:18:52Z) - Scenario-based Multi-product Advertising Copywriting Generation for E-Commerce [46.29638014067242]
We propose an automatic Scenario-based Multi-product Advertising Copywriting Generation system (SMPACG) for E-Commerce.
The SMPACG has been developed to directly serve our e-commerce recommendation system, and is also used as a real-time writing assistant tool for merchants.
arXiv Detail & Related papers (2022-05-21T07:45:53Z) - Product1M: Towards Weakly Supervised Instance-Level Product Retrieval via Cross-modal Pretraining [108.86502855439774]
We investigate a more realistic setting that aims to perform weakly-supervised multi-modal instance-level product retrieval.
We contribute Product1M, one of the largest multi-modal cosmetic datasets for real-world instance-level retrieval.
We propose a novel model named Cross-modal contrAstive Product Transformer for instance-level prodUct REtrieval (CAPTURE).
arXiv Detail & Related papers (2021-07-30T12:11:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.