Query2Prod2Vec: Grounded Word Embeddings for eCommerce
- URL: http://arxiv.org/abs/2104.02061v1
- Date: Fri, 2 Apr 2021 21:32:43 GMT
- Title: Query2Prod2Vec: Grounded Word Embeddings for eCommerce
- Authors: Federico Bianchi, Jacopo Tagliabue and Bingqing Yu
- Abstract summary: We present a model that grounds lexical representations for product search in product embeddings.
We leverage shopping sessions to learn the underlying space and use merchandising annotations to build lexical analogies for evaluation.
- Score: 4.137464623395377
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present Query2Prod2Vec, a model that grounds lexical representations for
product search in product embeddings: in our model, meaning is a mapping
between words and a latent space of products in a digital shop. We leverage
shopping sessions to learn the underlying space and use merchandising
annotations to build lexical analogies for evaluation: our experiments show
that our model is more accurate than known techniques from the NLP and IR
literature. Finally, we stress the importance of data efficiency for product
search outside of retail giants, and highlight how Query2Prod2Vec fits with
practical constraints faced by most practitioners.
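As a rough illustration of the two steps the abstract describes, the sketch below learns a product space from toy shopping sessions with skip-gram (via gensim, an assumed dependency) and grounds a query as the average embedding of the products clicked after it; all sessions, SKUs, and click logs are illustrative placeholders, not the paper's data.

```python
# Minimal sketch of the Query2Prod2Vec recipe, assuming gensim is available.
# Step 1: learn a product space ("prod2vec") by running skip-gram over
#         shopping sessions, treating each session as a sentence of SKUs.
# Step 2: embed a query as the average of the embeddings of the products
#         shoppers clicked after issuing it.
import numpy as np
from gensim.models import Word2Vec

sessions = [                       # each session = ordered list of product IDs
    ["sku_001", "sku_002", "sku_003"],
    ["sku_002", "sku_004"],
    ["sku_001", "sku_003", "sku_004"],
]
prod2vec = Word2Vec(sessions, vector_size=48, window=3,
                    min_count=1, sg=1, epochs=20)

# query -> products clicked in the search result page (toy log)
clicks_after_query = {"running shoes": ["sku_001", "sku_003"]}

def query_embedding(query: str) -> np.ndarray:
    """Ground the query in product space by averaging clicked products."""
    vecs = [prod2vec.wv[sku] for sku in clicks_after_query[query]]
    return np.mean(vecs, axis=0)

print(query_embedding("running shoes").shape)  # (48,)
```

Because the query vector lives in the product space itself, queries and products can be compared directly, which is presumably what makes the lexical-analogy evaluation described above possible.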
Related papers
- Exploring Fine-grained Retail Product Discrimination with Zero-shot Object Classification Using Vision-Language Models [50.370043676415875]
In smart retail applications, the large number of products and their frequent turnover necessitate reliable zero-shot object classification methods.
We introduce the MIMEX dataset, comprising 28 distinct product categories.
We benchmark the zero-shot object classification performance of state-of-the-art vision-language models (VLMs) on the proposed MIMEX dataset.
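For readers unfamiliar with the setup, here is a hedged sketch of zero-shot classification with an off-the-shelf VLM (CLIP via HuggingFace transformers); the category prompts and image path are illustrative, since the summary above does not specify MIMEX details.

```python
# Zero-shot product classification with CLIP: score an image against
# text prompts for each candidate category and pick the best match.
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

categories = ["a photo of a soda can", "a photo of a chocolate bar",
              "a photo of a shampoo bottle"]
image = Image.open("shelf_crop.jpg")          # hypothetical shelf crop

inputs = processor(text=categories, images=image,
                   return_tensors="pt", padding=True)
probs = model(**inputs).logits_per_image.softmax(dim=-1)
print(categories[probs.argmax().item()])      # predicted category
```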
arXiv Detail & Related papers (2024-09-23T12:28:40Z)
- Unifying Latent and Lexicon Representations for Effective Video-Text Retrieval [87.69394953339238]
We propose the UNIFY framework, which learns lexicon representations to capture fine-grained semantics in video-text retrieval.
We show that our framework substantially outperforms previous video-text retrieval methods, with Recall@1 improvements of 4.8% and 8.2% on MSR-VTT and DiDeMo, respectively.
arXiv Detail & Related papers (2024-02-26T17:36:50Z)
- A Semantic Space is Worth 256 Language Descriptions: Make Stronger Segmentation Models with Descriptive Properties [53.177550970052174]
ProLab is a novel approach that uses a property-level label space to create strong, interpretable segmentation models.
It supervises segmentation with descriptive properties grounded in commonsense knowledge.
arXiv Detail & Related papers (2023-12-21T11:43:41Z)
- Language Models: A Guide for the Perplexed [51.88841610098437]
This tutorial aims to help narrow the gap between those who study language models and those who are intrigued and want to learn more.
We offer a scientific viewpoint that focuses on questions amenable to study through experimentation.
We situate language models as they are today in the context of the research that led to their development.
arXiv Detail & Related papers (2023-11-29T01:19:02Z)
- Product Information Extraction using ChatGPT [69.12244027050454]
This paper explores the potential of ChatGPT for extracting attribute/value pairs from product descriptions.
Our results show that ChatGPT achieves performance similar to a pre-trained language model while requiring much less training data and computation for fine-tuning.
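A minimal sketch of this kind of prompting setup follows, assuming the official openai Python client; the prompt wording and model name are illustrative choices, not taken from the paper.

```python
# Prompt an LLM to extract attribute/value pairs from a product description.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

description = "Levi's 501 Original Fit Men's Jeans, stonewash blue, 100% cotton"
prompt = (
    "Extract attribute/value pairs from this product description "
    'as JSON (e.g. {"brand": ..., "color": ...}):\n' + description
)
resp = client.chat.completions.create(
    model="gpt-4o-mini",                     # illustrative model choice
    messages=[{"role": "user", "content": prompt}],
)
print(resp.choices[0].message.content)
```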
arXiv Detail & Related papers (2023-06-23T09:30:01Z)
- e-CLIP: Large-Scale Vision-Language Representation Learning in E-commerce [9.46186546774799]
We propose a contrastive learning framework that aligns language and visual models using unlabeled raw product text and images.
We present techniques we used to train large-scale representation learning models and share solutions that address domain-specific challenges.
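The alignment objective in such frameworks is typically a CLIP-style symmetric contrastive loss; a minimal PyTorch sketch of that loss (not the authors' code) looks like this.

```python
# CLIP-style contrastive objective: matched product text/image pairs are
# pulled together, all other in-batch pairs pushed apart.
import torch
import torch.nn.functional as F

def clip_loss(text_emb, image_emb, temperature=0.07):
    text_emb = F.normalize(text_emb, dim=-1)
    image_emb = F.normalize(image_emb, dim=-1)
    logits = text_emb @ image_emb.t() / temperature   # (B, B) similarities
    targets = torch.arange(len(logits))               # diagonal = true pairs
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

loss = clip_loss(torch.randn(8, 128), torch.randn(8, 128))
```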
arXiv Detail & Related papers (2022-07-01T05:16:47Z)
- BERT Goes Shopping: Comparing Distributional Models for Product Representations [4.137464623395377]
Inspired by the recent performance improvements on several NLP tasks brought by contextualized embeddings, we propose to transfer BERT-like architectures to eCommerce.
Our model -- ProdBERT -- is trained to generate representations of products through masked session modeling.
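A hedged sketch of the masking step behind masked session modeling: a session is treated as a sentence of product IDs, and the encoder must recover randomly hidden products. The masking rate and SKUs below are illustrative; the encoder itself would be a standard BERT-style transformer.

```python
# Build a masked-session training example from a list of product IDs.
import random

MASK = "[MASK]"

def mask_session(session, mask_prob=0.15, seed=None):
    rng = random.Random(seed)
    inputs, labels = [], []
    for sku in session:
        if rng.random() < mask_prob:
            inputs.append(MASK)
            labels.append(sku)       # model must predict the hidden product
        else:
            inputs.append(sku)
            labels.append(None)      # no loss on unmasked positions
    return inputs, labels

print(mask_session(["sku_1", "sku_2", "sku_3", "sku_4"], 0.5, seed=0))
```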
arXiv Detail & Related papers (2020-12-17T18:18:03Z)
- Improving BERT Performance for Aspect-Based Sentiment Analysis [3.5493798890908104]
Aspect-Based Sentiment Analysis (ABSA) studies consumer opinion on market products.
It involves examining the types of sentiment as well as the sentiment targets expressed in product reviews.
We show that applying the proposed models eliminates the need for further training of the BERT model.
arXiv Detail & Related papers (2020-10-22T13:52:18Z)
- Shopping in the Multiverse: A Counterfactual Approach to In-Session Attribution [6.09170287691728]
We tackle the challenge of in-session attribution for on-site search engines in eCommerce.
We frame the problem as causal counterfactual inference and contrast the approach with rule-based systems.
We show how natural language queries can be effectively represented in the same space and how "search intervention" can be performed to assess causal contribution.
arXiv Detail & Related papers (2020-07-20T13:32:02Z)
- A Corpus Study and Annotation Schema for Named Entity Recognition and Relation Extraction of Business Products [68.26059718611914]
We present a corpus study, an annotation schema and associated guidelines, for the annotation of product entity and company-product relation mentions.
We find that although product mentions are often realized as noun phrases, defining their exact extent is difficult due to high boundary ambiguity.
We present a preliminary corpus of English web and social media documents annotated according to the proposed guidelines.
arXiv Detail & Related papers (2020-04-07T11:45:22Z)
- Modeling Product Search Relevance in e-Commerce [7.139647051098728]
We propose a robust way of predicting relevance scores given a search query and a product.
We compare conventional information retrieval models such as BM25 and Indri with deep learning models such as word2vec, sentence2vec and paragraph2vec.
arXiv Detail & Related papers (2020-01-14T21:17:55Z)
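As a concrete illustration of the BM25 baseline from the last entry, here is a short sketch using the third-party rank_bm25 package (an assumed dependency; any BM25 implementation works), scoring a toy query against toy product descriptions.

```python
# Score a query against product descriptions with BM25.
from rank_bm25 import BM25Okapi

products = [
    "red running shoes for men",
    "wireless noise cancelling headphones",
    "trail running shoes waterproof",
]
corpus = [doc.split() for doc in products]   # whitespace tokenization
bm25 = BM25Okapi(corpus)

query = "running shoes".split()
print(bm25.get_scores(query))   # higher score = more relevant product
```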
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.