Interpretable Methods for Identifying Product Variants
- URL: http://arxiv.org/abs/2104.05504v1
- Date: Mon, 12 Apr 2021 14:37:16 GMT
- Title: Interpretable Methods for Identifying Product Variants
- Authors: Rebecca West, Khalifeh Al Jadda, Unaiza Ahsan, Huiming Qu, Xiquan Cui
- Abstract summary: We introduce a novel approach to identifying product variants.
It combines both constrained clustering and tailored NLP techniques.
We design the algorithm to meet certain business criteria, including meeting high accuracy requirements.
- Score: 0.2589904091148018
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: For e-commerce companies with large product selections, the organization and
grouping of products in meaningful ways is important for creating great
customer shopping experiences and cultivating an authoritative brand image. One
important way of grouping products is to identify a family of product variants,
where the variants are mostly the same with slight and yet distinct differences
(e.g. color or pack size). In this paper, we introduce a novel approach to
identifying product variants. It combines both constrained clustering and
tailored NLP techniques (e.g. extraction of product family name from
unstructured product title and identification of products with similar model
numbers) to achieve superior performance compared with an existing baseline
using a vanilla classification approach. In addition, we design the algorithm
to meet certain business criteria, including meeting high accuracy requirements
on a wide range of categories (e.g. appliances, decor, tools, and building
materials) as well as prioritizing the interpretability of the model to
make it accessible and understandable to all business partners.
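The abstract names two ingredients, grouping products with similar model numbers and respecting clustering constraints, without giving details. The following is a minimal sketch of how such a grouping could look, not the authors' algorithm. The helper names (`model_stem`, `group_variants`), the record fields (`id`, `brand`, `model`), the suffix-stripping heuristic, and the greedy assignment are all assumptions made for illustration.

```python
import re


def model_stem(model_number: str) -> str:
    """Hypothetical heuristic: normalise a model number and drop a trailing
    run of letters, which is assumed here to encode the variant (e.g. color)."""
    cleaned = re.sub(r"[^A-Z0-9]", "", model_number.upper())
    # Fall back to the cleaned string if stripping would leave nothing.
    return re.sub(r"[A-Z]+$", "", cleaned) or cleaned


def group_variants(products, cannot_link=()):
    """Greedily group products that share a brand and model stem, never
    placing two products from a cannot-link pair in the same group."""
    blocked = {frozenset(pair) for pair in cannot_link}
    clusters = []  # each cluster: {"key": (brand, stem), "ids": [...]}
    for product in products:
        key = (product["brand"].lower(), model_stem(product["model"]))
        for cluster in clusters:
            conflict = any(
                frozenset((product["id"], other)) in blocked
                for other in cluster["ids"]
            )
            if cluster["key"] == key and not conflict:
                cluster["ids"].append(product["id"])
                break
        else:
            # No compatible cluster found: start a new product family.
            clusters.append({"key": key, "ids": [product["id"]]})
    return [cluster["ids"] for cluster in clusters]
```

Because every group is defined by a readable key (brand plus model stem) and an explicit list of forbidden pairs, a reviewer can see why two products were or were not grouped, which is in the spirit of the interpretability goal stated in the abstract.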
Related papers
- Exploring Fine-grained Retail Product Discrimination with Zero-shot Object Classification Using Vision-Language Models [50.370043676415875]
In smart retail applications, the large number of products and their frequent turnover necessitate reliable zero-shot object classification methods.
We introduce the MIMEX dataset, comprising 28 distinct product categories.
We benchmark the zero-shot object classification performance of state-of-the-art vision-language models (VLMs) on the proposed MIMEX dataset.
arXiv Detail & Related papers (2024-09-23T12:28:40Z)
- Learning variant product relationship and variation attributes from e-commerce website structures [5.273938705774915]
We introduce VARM, variant relationship matcher strategy, to identify pairs of variant products in e-commerce catalogs.
We use RAG prompted generative LLMs to extract variation and common attributes amongst groups of variant products.
arXiv Detail & Related papers (2024-09-17T18:24:27Z)
- Text-Based Product Matching -- Semi-Supervised Clustering Approach [9.748519919202986]
This paper aims to present a new philosophy to product matching utilizing a semi-supervised clustering approach.
We study the properties of this method by experimenting with the IDEC algorithm on the real-world dataset.
arXiv Detail & Related papers (2024-02-01T18:52:26Z)
- Multimodal Prompt Learning for Product Title Generation with Extremely Limited Labels [66.54691023795097]
We propose a prompt-based approach, i.e., the Multimodal Prompt Learning framework, to generate titles for novel products with limited labels.
We build a set of multimodal prompts from different modalities to preserve the corresponding characteristics and writing styles of novel products.
With the full labelled data for training, our method achieves state-of-the-art results.
arXiv Detail & Related papers (2023-07-05T00:40:40Z)
- Entity-Graph Enhanced Cross-Modal Pretraining for Instance-level Product Retrieval [152.3504607706575]
This research aims to conduct weakly-supervised multi-modal instance-level product retrieval for fine-grained product categories.
We first contribute the Product1M datasets, and define two real practical instance-level retrieval tasks.
We train a more effective cross-modal model that can adaptively incorporate key concept information from the multi-modal data.
arXiv Detail & Related papers (2022-06-17T15:40:45Z)
- ItemSage: Learning Product Embeddings for Shopping Recommendations at Pinterest [60.841761065439414]
At Pinterest, we build a single set of product embeddings called ItemSage to provide relevant recommendations in all shopping use cases.
This approach has led to significant improvements in engagement and conversion metrics, while reducing both infrastructure and maintenance cost.
arXiv Detail & Related papers (2022-05-24T02:28:58Z)
- Machine Learning approaches to do size based reasoning on Retail Shelf objects to classify product variants [3.3767251810292955]
Deep learning based computer vision methods can be used to detect products on retail shelves and then classify them.
Some products come in different-sized variants that look identical, and the only way to tell them apart is to compare their sizes with those of other products on the shelf.
This makes distinguishing the size-based variants with computer vision algorithms alone impractical.
arXiv Detail & Related papers (2021-10-07T20:29:07Z)
- Product1M: Towards Weakly Supervised Instance-Level Product Retrieval via Cross-modal Pretraining [108.86502855439774]
We investigate a more realistic setting that aims to perform weakly-supervised multi-modal instance-level product retrieval.
We contribute Product1M, one of the largest multi-modal cosmetic datasets for real-world instance-level retrieval.
We propose a novel model named Cross-modal contrAstive Product Transformer for instance-level prodUct REtrieval (CAPTURE).
arXiv Detail & Related papers (2021-07-30T12:11:24Z)
- Exploiting Knowledge Graphs for Facilitating Product/Service Discovery [1.2691047660244332]
This work presents a cost-effective solution for e-commerce on the Data Web by employing an unsupervised approach for data classification.
The proposed architecture describes available products in web language OWL and stores them in a triple store.
User input specifications for certain products are matched against the available product categories to generate a knowledge graph.
arXiv Detail & Related papers (2020-10-11T10:22:10Z)
- Automatic Validation of Textual Attribute Values in E-commerce Catalog by Learning with Limited Labeled Data [61.789797281676606]
We propose a novel meta-learning latent variable approach, called MetaBridge.
It can learn transferable knowledge from a subset of categories with limited labeled data.
It can capture the uncertainty of never-seen categories with unlabeled data.
arXiv Detail & Related papers (2020-06-15T21:31:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.