Intent-based Product Collections for E-commerce using Pretrained
Language Models
- URL: http://arxiv.org/abs/2110.08241v1
- Date: Fri, 15 Oct 2021 17:52:42 GMT
- Title: Intent-based Product Collections for E-commerce using Pretrained
Language Models
- Authors: Hiun Kim, Jisu Jeong, Kyung-Min Kim, Dongjun Lee, Hyun Dong Lee,
Dongpil Seo, Jeeseung Han, Dong Wook Park, Ji Ae Heo, Rak Yeong Kim
- Abstract summary: We use a pretrained language model (PLM) that leverages textual attributes of web-scale products to make intent-based product collections.
Our model significantly outperforms the search-based baseline model for intent-based product matching in offline evaluations.
Online experimental results on our e-commerce platform show that the PLM-based method can construct collections of products with increased CTR, CVR, and order-diversity compared to expert-crafted collections.
- Score: 8.847005669899703
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Building a shopping product collection has been primarily a human job. With
the manual efforts of craftsmanship, experts collect related but diverse
products with common shopping intent that are effective when displayed
together, e.g., backpacks, laptop bags, and messenger bags for freshman bag
gifts. Automatically constructing a collection requires an ML system to learn a
complex relationship between the customer's intent and the product's
attributes. However, there have been challenging points, such as 1) long and
complicated intent sentences, 2) rich and diverse product attributes, and 3) a
huge semantic gap between them, making the problem difficult. In this paper, we
use a pretrained language model (PLM) that leverages textual attributes of
web-scale products to make intent-based product collections. Specifically, we
train a BERT model with a triplet loss, using the intent sentence as the anchor and
the corresponding products as positive examples. We further improve the performance
of the model by search-based negative sampling and category-wise positive pair
augmentation. Our model significantly outperforms the search-based baseline
model for intent-based product matching in offline evaluations. Furthermore,
online experimental results on our e-commerce platform show that the PLM-based
method can construct collections of products with increased CTR, CVR, and
order-diversity compared to expert-crafted collections.
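The training recipe described in the abstract (intent sentence as the anchor, matched products as positives, search-retrieved products as negatives) maps onto a standard triplet-loss objective over BERT embeddings. The sketch below is an illustrative reconstruction, not the authors' code: the checkpoint name, mean pooling, margin, and learning rate are assumptions.
```python
# Minimal sketch of triplet-loss training over BERT embeddings, following the
# setup in the abstract: intent sentence = anchor, matched product = positive,
# search-sampled non-matching product = negative. Model name, pooling, margin,
# and hyperparameters are assumptions, not the paper's settings.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-multilingual-cased"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME)

def embed(texts):
    """Encode texts and mean-pool token embeddings into one vector per text."""
    batch = tokenizer(texts, padding=True, truncation=True,
                      max_length=128, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state              # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1).float()     # (B, T, 1)
    return (hidden * mask).sum(1) / mask.sum(1).clamp(min=1e-9)

triplet_loss = torch.nn.TripletMarginLoss(margin=1.0)        # margin is assumed
optimizer = torch.optim.AdamW(encoder.parameters(), lr=2e-5)

def train_step(intents, positive_products, negative_products):
    """One step: pull matching products toward the intent embedding,
    push search-sampled negatives away."""
    anchor = embed(intents)                # intent sentences
    positive = embed(positive_products)    # products belonging to the collection
    negative = embed(negative_products)    # hard negatives from search results
    loss = triplet_loss(anchor, positive, negative)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Hypothetical mini-batch:
loss = train_step(
    ["backpacks and laptop bags for a freshman gift"],
    ["15-inch water-resistant laptop backpack"],
    ["2-person hiking tent"],
)
```
At serving time, products whose embeddings fall within a distance threshold of an intent embedding would form the candidate collection; the paper's category-wise positive pair augmentation and search-based negative sampling are ways of building the (anchor, positive, negative) triplets fed to a step like the one above.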
Related papers
- IntentionQA: A Benchmark for Evaluating Purchase Intention Comprehension Abilities of Language Models in E-commerce [71.37481473399559]
In this paper, we present IntentionQA, a benchmark to evaluate LMs' comprehension of purchase intentions in E-commerce.
IntentionQA consists of 4,360 carefully curated problems across three difficulty levels, constructed using an automated pipeline.
Human evaluations demonstrate the high quality and low false-negative rate of our benchmark.
arXiv Detail & Related papers (2024-06-14T16:51:21Z)
- Text-Based Product Matching -- Semi-Supervised Clustering Approach [9.748519919202986]
This paper presents a new approach to product matching based on semi-supervised clustering.
We study the properties of this method by experimenting with the IDEC algorithm on a real-world dataset.
arXiv Detail & Related papers (2024-02-01T18:52:26Z)
- Product Information Extraction using ChatGPT [69.12244027050454]
This paper explores the potential of ChatGPT for extracting attribute/value pairs from product descriptions.
Our results show that ChatGPT achieves a performance similar to a pre-trained language model but requires much smaller amounts of training data and computation for fine-tuning.
arXiv Detail & Related papers (2023-06-23T09:30:01Z)
- A Multi-Granularity Matching Attention Network for Query Intent Classification in E-commerce Retrieval [9.034096715927731]
This paper proposes a Multi-granularity Matching Attention Network (MMAN) for query intent classification.
MMAN contains three modules: a self-matching module, a char-level matching module, and a semantic-level matching module.
We conduct extensive offline and online A/B experiments, and the results show that the MMAN significantly outperforms the strong baselines.
arXiv Detail & Related papers (2023-03-28T10:25:17Z)
- Automated Extraction of Fine-Grained Standardized Product Information from Unstructured Multilingual Web Data [66.21317300595483]
We show how recent advances in machine learning, combined with a recently published multilingual data set, enable robust product attribute extraction.
Our models can reliably predict product attributes across online shops, languages, or both.
arXiv Detail & Related papers (2023-02-23T16:26:11Z)
- Two Is Better Than One: Dual Embeddings for Complementary Product Recommendations [2.294014185517203]
We apply a novel approach to finding complementary items by leveraging dual embedding representations for products.
Our model is effective yet simple to implement, making it a great candidate for generating complementary item recommendations at any e-commerce website.
arXiv Detail & Related papers (2022-11-28T00:58:21Z)
- e-CLIP: Large-Scale Vision-Language Representation Learning in E-commerce [9.46186546774799]
We propose a contrastive learning framework that aligns language and visual models using unlabeled raw product text and images.
We present techniques we used to train large-scale representation learning models and share solutions that address domain-specific challenges.
arXiv Detail & Related papers (2022-07-01T05:16:47Z)
- Entity-Graph Enhanced Cross-Modal Pretraining for Instance-level Product Retrieval [152.3504607706575]
This research aims to conduct weakly-supervised multi-modal instance-level product retrieval for fine-grained product categories.
We first contribute the Product1M dataset and define two practical instance-level retrieval tasks.
We then train a more effective cross-modal model that adaptively incorporates key concept information from the multi-modal data.
arXiv Detail & Related papers (2022-06-17T15:40:45Z)
- Automatic Validation of Textual Attribute Values in E-commerce Catalog by Learning with Limited Labeled Data [61.789797281676606]
We propose a novel meta-learning latent variable approach, called MetaBridge.
It can learn transferable knowledge from a subset of categories with limited labeled data.
It can capture the uncertainty of never-seen categories with unlabeled data.
arXiv Detail & Related papers (2020-06-15T21:31:05Z)
- Cross-Lingual Low-Resource Set-to-Description Retrieval for Global E-Commerce [83.72476966339103]
Cross-lingual information retrieval is a new task in cross-border e-commerce.
We propose a novel cross-lingual matching network (CLMN) with the enhancement of context-dependent cross-lingual mapping.
Experimental results indicate that our proposed CLMN yields impressive results on the challenging task.
arXiv Detail & Related papers (2020-05-17T08:10:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.