Unposed: Unsupervised Pose Estimation based Product Image
Recommendations
- URL: http://arxiv.org/abs/2301.07879v1
- Date: Thu, 19 Jan 2023 05:02:55 GMT
- Title: Unposed: Unsupervised Pose Estimation based Product Image
Recommendations
- Authors: Saurabh Sharma, Faizan Ahemad
- Abstract summary: We propose a Human Pose Detection based unsupervised method that scans the image set of a product for missing images.
Because the approach is unsupervised, its suggestions to sellers are based on product and category, irrespective of any biases.
We surveyed 200 products manually, a large fraction of which had at least 1 repeated image or missing variant, and sampled 3K products (20K images), of which a significant proportion had scope for adding many image variants.
- Score: 4.467248776406006
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Product images are the most impressive medium of customer interaction on the
product detail pages of e-commerce websites. Millions of products are onboarded
onto webstore catalogues daily, and maintaining a high quality bar for a
product's set of images is a problem at scale. Among product categories,
clothing is a very high-volume, high-velocity category and thus deserves its
own attention. At this scale it is challenging to monitor whether an image set
is complete, that is, whether it adequately details the product for consumers;
incomplete image sets often lead to a poor customer experience and thus
customer drop-off.
To supervise the quality and completeness of the images on the product pages
for these product types and suggest improvements, we propose a Human Pose
Detection based unsupervised method that scans the image set of a product for
missing images. Because the approach is unsupervised, its suggestions to
sellers are based on product and category, irrespective of any biases. We first
create a reference image set from popular products with complete image sets.
Then we cluster these images to label the most desirable poses, which form the
classes for the reference set. Further, for all test products we
scan the images for all desired pose classes w.r.t. the reference set poses,
determine the missing ones and sort them in order of potential impact.
Sellers can then use these missing poses to enrich their product listing
images. We gathered data from a popular online webstore and surveyed ~200
products manually, a large fraction of which had at least 1 repeated image or
missing variant, and sampled 3K products (~20K images), of which a significant
proportion had scope for adding many image variants as compared to highly rated
products, which had more than double the image variants, indicating that our
model can potentially be used on a large scale.
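The scanning step described in the abstract (match each product image to a reference pose class, then list the classes that are absent, ordered by potential impact) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the pose descriptor, the distance measure, the matching threshold, or how impact is scored. The names `nearest_class`, `missing_poses`, `max_dist`, the 2-D descriptors, and the impact values are all hypothetical.

```python
import math


def nearest_class(pose, centroids):
    """Return (class_name, distance) for the reference centroid closest to `pose`.

    Assumption: poses are fixed-length numeric vectors compared with
    Euclidean distance; the paper may use a different representation.
    """
    return min(
        ((name, math.dist(pose, centre)) for name, centre in centroids.items()),
        key=lambda pair: pair[1],
    )


def missing_poses(product_poses, centroids, impact, max_dist=0.5):
    """List reference pose classes absent from a product, highest impact first.

    `max_dist` is a hypothetical threshold: an image only counts as covering
    a pose class if it lies within this distance of the class centroid.
    """
    covered = set()
    for pose in product_poses:
        name, dist = nearest_class(pose, centroids)
        if dist <= max_dist:
            covered.add(name)
    missing = [name for name in centroids if name not in covered]
    return sorted(missing, key=lambda name: impact[name], reverse=True)


# Hypothetical reference set: one centroid per pose cluster, plus an
# illustrative impact score per class (made-up values, not from the paper).
centroids = {
    "front": (0.0, 0.0),
    "back": (1.0, 0.0),
    "side": (0.0, 1.0),
    "closeup": (1.0, 1.0),
}
impact = {"front": 0.95, "back": 0.80, "side": 0.60, "closeup": 0.40}

# A test product with two front-like images and one close-up.
product = [(0.05, 0.02), (0.1, -0.1), (0.9, 1.05)]

print(missing_poses(product, centroids, impact))  # ['back', 'side']
```

In this toy example the two front-like images collapse into a single covered class, so the repeated pose adds no coverage, and the seller would be prompted for a back view before a side view.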
Related papers
- Exploring Fine-grained Retail Product Discrimination with Zero-shot Object Classification Using Vision-Language Models [50.370043676415875]
In smart retail applications, the large number of products and their frequent turnover necessitate reliable zero-shot object classification methods.
We introduce the MIMEX dataset, comprising 28 distinct product categories.
We benchmark the zero-shot object classification performance of state-of-the-art vision-language models (VLMs) on the proposed MIMEX dataset.
arXiv Detail & Related papers (2024-09-23T12:28:40Z)
- Transformer-empowered Multi-modal Item Embedding for Enhanced Image Search in E-Commerce [20.921870288665627]
Multi-modal Item Embedding Model (MIEM) is capable of utilizing both textual information and multiple images about a product to construct meaningful product features.
MIEM has become an integral part of the Shopee image search platform.
arXiv Detail & Related papers (2023-11-29T08:09:50Z)
- Behavior Optimized Image Generation [69.9906601767728]
We propose BoigLLM, which understands both image content and user behavior.
We show that BoigLLM outperforms 13x larger models such as GPT-3.5 and GPT-4 in this task.
We release BoigBench, a benchmark dataset containing 168 million enterprise tweets with their media, brand names, time of post, and total likes.
arXiv Detail & Related papers (2023-11-18T07:07:38Z)
- Product Review Image Ranking for Fashion E-commerce [0.0]
We train our network to rank bad-quality images lower than high-quality ones.
Our proposed method outperforms the baseline models on two metrics, namely correlation coefficient, and accuracy, by substantial margins.
arXiv Detail & Related papers (2023-08-10T07:09:13Z)
- Automatic Generation of Product-Image Sequence in E-commerce [46.06263129000091]
The Multi-modality Unified Image-sequence Classifier (MUIsC) simultaneously detects all categories of rule violations through learning.
By Dec 2021, our AGPIS framework had generated high-standard images for about 1.5 million products and achieved a 13.6% reject rate.
arXiv Detail & Related papers (2022-06-26T23:38:42Z)
- Weakly Supervised High-Fidelity Clothing Model Generation [67.32235668920192]
We propose a cheap yet scalable weakly-supervised method called Deep Generative Projection (DGP) to address this specific scenario.
We show that projecting the rough alignment of clothing and body onto the StyleGAN space can yield photo-realistic wearing results.
arXiv Detail & Related papers (2021-12-14T07:15:15Z)
- An Automatic Image Content Retrieval Method for better Mobile Device Display User Experiences [91.3755431537592]
A new mobile application for image content retrieval and classification for mobile device display is proposed.
The application was run on thousands of pictures and showed encouraging results towards a better user visual experience with mobile displays.
arXiv Detail & Related papers (2021-08-26T23:44:34Z)
- eProduct: A Million-Scale Visual Search Benchmark to Address Product Recognition Challenges [8.204924070199866]
eProduct is a benchmark dataset for training and evaluation on various visual search solutions in a real-world setting.
We present eProduct as a training set and an evaluation set, where the training set contains 1.3M+ listing images with titles and hierarchical category labels, for model development.
We will present eProduct's construction steps, provide analysis about its diversity and cover the performance of baseline models trained on it.
arXiv Detail & Related papers (2021-07-13T05:28:34Z)
- Vision-based Price Suggestion for Online Second-hand Items [40.42940050851797]
We present a vision-based price suggestion system for the online second-hand item shopping platform.
The goal of vision-based price suggestion is to help sellers set effective prices for their second-hand listings with the images uploaded to the online platforms.
arXiv Detail & Related papers (2020-12-10T22:56:29Z)
- Generating Person Images with Appearance-aware Pose Stylizer [66.44220388377596]
We present a novel end-to-end framework to generate realistic person images based on given person poses and appearances.
The core of our framework is a novel generator called Appearance-aware Pose Stylizer (APS) which generates human images by coupling the target pose with the conditioned person appearance progressively.
arXiv Detail & Related papers (2020-07-17T15:58:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.