Image Score: Learning and Evaluating Human Preferences for Mercari Search
- URL: http://arxiv.org/abs/2408.11349v1
- Date: Wed, 21 Aug 2024 05:30:06 GMT
- Title: Image Score: Learning and Evaluating Human Preferences for Mercari Search
- Authors: Chingis Oinar, Miao Cao, Shanshan Fu
- Abstract summary: Large Language Models (LLMs) are being actively studied and used for data labelling tasks.
We propose a cost-efficient LLM-driven approach for assessing and predicting image quality in e-commerce settings.
We show that our LLM-produced labels correlate with user behavior on Mercari.
- Score: 2.1555050262085027
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Mercari is the largest C2C e-commerce marketplace in Japan, with more than 20 million monthly active users. As search is the fundamental way to discover desired items, we have always had a substantial amount of implicit-feedback data. Although we actively take advantage of that data to provide the best service for our users, correlating implicit feedback with tasks such as image quality assessment is not trivial. Many traditional lines of research in Machine Learning (ML) are similarly motivated by the insatiable appetite of Deep Learning (DL) models for well-labelled training data. Weak supervision leverages higher-level and/or noisier supervision over unlabeled data. Large Language Models (LLMs) are being actively studied and used for data labelling tasks. We present how we leverage Chain-of-Thought (CoT) prompting to enable an LLM to produce image aesthetics labels that correlate well with human behavior in e-commerce settings. Leveraging LLMs is more cost-effective than explicit human judgment, while significantly improving the explainability of deep image quality evaluation, which is highly important for customer journey optimization at Mercari. We propose a cost-efficient LLM-driven approach for assessing and predicting image quality in e-commerce settings, which is very convenient for proof-of-concept testing. We show that our LLM-produced labels correlate with user behavior on Mercari. Finally, we show our results from an online experiment, where we achieved significant growth in sales on the web platform.
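The CoT labeling step described in the abstract can be sketched as follows. The prompt wording, the JSON schema, and the stubbed model response are illustrative assumptions, not Mercari's actual pipeline; in practice the stub would be replaced by a real multimodal LLM call on the listing photo:

```python
import json

def build_cot_prompt(item_title: str) -> str:
    # Hypothetical CoT prompt: ask the model to reason about aesthetic
    # criteria step by step before committing to a label.
    return (
        f"You are rating the listing photo for '{item_title}'.\n"
        "Think step by step about lighting, framing, and background clutter, "
        'then answer with JSON: {"reasoning": ..., "label": "good"|"bad"}.'
    )

def parse_label(model_response: str) -> str:
    # The chain-of-thought stays in 'reasoning' for explainability;
    # only 'label' is kept as the weak-supervision training target.
    return json.loads(model_response)["label"]

# Stubbed response, standing in for a real LLM call:
response = '{"reasoning": "Well lit, uncluttered background.", "label": "good"}'
label = parse_label(response)
```

Keeping the reasoning field alongside the label is what makes the resulting quality scores auditable, which the abstract highlights as important for customer journey optimization.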
Related papers
- Web-Scale Visual Entity Recognition: An LLM-Driven Data Approach [56.55633052479446]
Web-scale visual entity recognition presents significant challenges due to the lack of clean, large-scale training data.
We propose a novel methodology to curate such a dataset, leveraging a multimodal large language model (LLM) for label verification, metadata generation, and rationale explanation.
Experiments demonstrate that models trained on this automatically curated data achieve state-of-the-art performance on web-scale visual entity recognition tasks.
arXiv Detail & Related papers (2024-10-31T06:55:24Z)
- Learning to Predict Usage Options of Product Reviews with LLM-Generated Labels [14.006486214852444]
We propose a method of using LLMs as few-shot learners for annotating data in a complex natural language task.
Learning a custom model offers individual control over energy efficiency and privacy measures.
We find that the quality of the resulting data exceeds the level attained by third-party vendor services.
arXiv Detail & Related papers (2024-10-16T11:34:33Z)
- IntentionQA: A Benchmark for Evaluating Purchase Intention Comprehension Abilities of Language Models in E-commerce [71.37481473399559]
In this paper, we present IntentionQA, a benchmark to evaluate LMs' comprehension of purchase intentions in E-commerce.
IntentionQA consists of 4,360 carefully curated problems across three difficulty levels, constructed using an automated pipeline.
Human evaluations demonstrate the high quality and low false-negative rate of our benchmark.
arXiv Detail & Related papers (2024-06-14T16:51:21Z)
- Show, Don't Tell: Aligning Language Models with Demonstrated Feedback [54.10302745921713]
Demonstration ITerated Task Optimization (DITTO) directly aligns language model outputs to a user's demonstrated behaviors.
We evaluate DITTO's ability to learn fine-grained style and task alignment across domains such as news articles, emails, and blog posts.
arXiv Detail & Related papers (2024-06-02T23:13:56Z)
- Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning [55.96599486604344]
We introduce an approach aimed at enhancing the reasoning capabilities of Large Language Models (LLMs) through an iterative preference learning process.
We use Monte Carlo Tree Search (MCTS) to iteratively collect preference data, utilizing its look-ahead ability to break down instance-level rewards into more granular step-level signals.
The proposed algorithm employs Direct Preference Optimization (DPO) to update the LLM policy using this newly generated step-level preference data.
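The DPO update mentioned above fits a logistic loss over implicit reward margins between the preferred and rejected step. A minimal sketch of the per-pair loss on precomputed log-probabilities (variable names are ours):

```python
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    # Direct Preference Optimization: the implicit reward of a response is
    # the log-probability ratio between the policy and a frozen reference
    # model; the loss pushes the preferred (w) margin above the rejected (l).
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log(sigmoid(margin))
```

At initialization (policy equals reference) the margin is zero and the loss is log 2; as the policy assigns more probability to the MCTS-preferred step, the loss decreases.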
arXiv Detail & Related papers (2024-05-01T11:10:24Z)
- Automating Customer Needs Analysis: A Comparative Study of Large Language Models in the Travel Industry [2.4244694855867275]
Large Language Models (LLMs) have emerged as powerful tools for extracting valuable insights from vast amounts of textual data.
In this study, we conduct a comparative analysis of LLMs for the extraction of travel customer needs from TripAdvisor posts.
Our findings highlight the efficacy of open-source LLMs, particularly Mistral 7B, in achieving performance comparable to larger closed models.
arXiv Detail & Related papers (2024-04-27T18:28:10Z)
- From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning [52.257422715393574]
We introduce a self-guided methodology for Large Language Models (LLMs) to autonomously discern and select cherry samples from open-source datasets.
Our key innovation, the Instruction-Following Difficulty (IFD) metric, emerges as a pivotal metric to identify discrepancies between a model's expected responses and its intrinsic generation capability.
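As summarized, IFD compares how well the model predicts a response with and without its instruction. A minimal sketch under that reading, on precomputed per-token losses (the paper's exact normalization may differ):

```python
def ifd_score(loss_with_instruction: float, loss_answer_only: float) -> float:
    # Instruction-Following Difficulty: ratio of the model's loss on the
    # answer conditioned on the instruction to its loss on the answer alone.
    # Values near or above 1 mean the instruction barely helps the model,
    # flagging harder, more informative "cherry" samples for tuning.
    return loss_with_instruction / loss_answer_only
```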
arXiv Detail & Related papers (2023-08-23T09:45:29Z)
- Knowledge Graph Completion Models are Few-shot Learners: An Empirical Study of Relation Labeling in E-commerce with LLMs [16.700089674927348]
Large Language Models (LLMs) have shown surprising results in numerous natural language processing tasks.
This paper investigates their powerful learning capabilities in natural language and effectiveness in predicting relations between product types with limited labeled data.
Our results show that LLMs significantly outperform existing KG completion models in relation labeling for e-commerce KGs and exhibit performance strong enough to replace human labeling.
arXiv Detail & Related papers (2023-05-17T00:08:36Z)
- What do LLMs Know about Financial Markets? A Case Study on Reddit Market Sentiment Analysis [15.195505464654493]
Market sentiment analysis on social media content requires knowledge of both financial markets and social media jargon.
Our pipeline generates weak financial sentiment labels for Reddit posts with a large language model (LLM).
With only a handful of prompts, the final model performs on par with existing supervised models.
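The weak-labeling step this summary describes can be sketched as follows; the label vocabulary and the stubbed labeler are illustrative assumptions standing in for a prompted LLM:

```python
def collect_weak_labels(posts, llm_label_fn, keep=("bullish", "bearish")):
    # Weak supervision: an LLM (injected here as a callable) tags each post;
    # only in-vocabulary labels are kept as training data for a smaller
    # supervised sentiment model.
    dataset = []
    for post in posts:
        label = llm_label_fn(post)
        if label in keep:
            dataset.append((post, label))
    return dataset

# Stub standing in for a prompted LLM call:
stub = lambda p: "bullish" if "moon" in p else ("bearish" if "crash" in p else "unsure")
data = collect_weak_labels(["to the moon", "crash soon", "idk"], stub)
```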
arXiv Detail & Related papers (2022-12-21T19:11:19Z)
- Entity-Graph Enhanced Cross-Modal Pretraining for Instance-level Product Retrieval [152.3504607706575]
This research aims to conduct weakly-supervised multi-modal instance-level product retrieval for fine-grained product categories.
We first contribute the Product1M dataset and define two real, practical instance-level retrieval tasks.
We train a more effective cross-modal model that adaptively incorporates key concept information from the multi-modal data.
arXiv Detail & Related papers (2022-06-17T15:40:45Z)
- Self-supervised Learning for Large-scale Item Recommendations [18.19202958502061]
Large-scale recommender models find the most relevant items in huge catalogs.
With millions to billions of items in the corpus, users tend to provide feedback for a very small set of them.
We propose a multi-task self-supervised learning framework for large-scale item recommendations.
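A common building block for such self-supervised auxiliary tasks is an InfoNCE-style contrastive loss over item embeddings; a minimal sketch on precomputed cosine similarities (the paper's exact objective may differ):

```python
import math

def contrastive_loss(sim_pos, sim_negs, tau=0.1):
    # InfoNCE-style loss: pull two augmented views of the same item
    # together (sim_pos) and push views of other in-batch items apart
    # (sim_negs); tau is the softmax temperature.
    logits = [sim_pos / tau] + [s / tau for s in sim_negs]
    m = max(logits)  # subtract max for numerical stability
    denom = sum(math.exp(l - m) for l in logits)
    return -(logits[0] - m - math.log(denom))
```

The loss is small when the positive pair is much more similar than any negative, which is what lets the auxiliary task shape embeddings for the long tail of items with little feedback.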
arXiv Detail & Related papers (2020-07-25T06:21:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.