Shopping Queries Image Dataset (SQID): An Image-Enriched ESCI Dataset for Exploring Multimodal Learning in Product Search
- URL: http://arxiv.org/abs/2405.15190v1
- Date: Fri, 24 May 2024 03:50:31 GMT
- Title: Shopping Queries Image Dataset (SQID): An Image-Enriched ESCI Dataset for Exploring Multimodal Learning in Product Search
- Authors: Marie Al Ghossein, Ching-Wei Chen, Jason Tang
- Abstract summary: The Shopping Queries Image Dataset (SQID) is an extension of the Amazon Shopping Queries Dataset enriched with image information associated with 190,000 products.
By integrating visual information, SQID facilitates research around multimodal learning techniques for improving product search and ranking.
We provide experimental results leveraging SQID and pretrained models, showing the value of using multimodal data for search and ranking.
- Abstract: Recent advances in the fields of Information Retrieval and Machine Learning have focused on improving the performance of search engines to enhance the user experience, especially in the world of online shopping. The focus has thus been on leveraging cutting-edge learning techniques and relying on large enriched datasets. This paper introduces the Shopping Queries Image Dataset (SQID), an extension of the Amazon Shopping Queries Dataset enriched with image information associated with 190,000 products. By integrating visual information, SQID facilitates research around multimodal learning techniques that can take into account both textual and visual information for improving product search and ranking. We also provide experimental results leveraging SQID and pretrained models, showing the value of using multimodal data for search and ranking. SQID is available at: https://github.com/Crossing-Minds/shopping-queries-image-dataset.
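For illustration, the following is a minimal sketch of the kind of multimodal scoring the abstract alludes to: a pretrained CLIP model embeds a shopping query, the product titles, and the product images, and the two similarities are blended into a single ranking score. The file layout, column names, score weighting, and choice of CLIP checkpoint are assumptions made for the example, not details taken from SQID or the paper.

```python
# Minimal sketch: ranking products for a shopping query with a pretrained CLIP model,
# in the spirit of the multimodal search-and-ranking experiments described above.
# Product rows, image paths, and the equal text/image weighting are illustrative assumptions.

import pandas as pd
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
model.eval()

# Hypothetical SQID-style product table: id, title, and a local path to the product image.
products = pd.DataFrame({
    "product_id": ["B0001", "B0002"],
    "product_title": ["stainless steel water bottle 32 oz", "kids plastic sippy cup"],
    "image_path": ["images/B0001.jpg", "images/B0002.jpg"],
})
query = "insulated water bottle"

with torch.no_grad():
    # Embed the query and the product titles in the CLIP text space.
    text_inputs = processor(
        text=[query] + products["product_title"].tolist(),
        return_tensors="pt", padding=True, truncation=True,
    )
    text_emb = model.get_text_features(**text_inputs)
    text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)
    query_emb, title_emb = text_emb[:1], text_emb[1:]

    # Embed the product images in the CLIP image space.
    images = [Image.open(p).convert("RGB") for p in products["image_path"]]
    image_inputs = processor(images=images, return_tensors="pt")
    image_emb = model.get_image_features(**image_inputs)
    image_emb = image_emb / image_emb.norm(dim=-1, keepdim=True)

# Blend query-title and query-image cosine similarities into one ranking score.
title_score = (query_emb @ title_emb.T).squeeze(0)
image_score = (query_emb @ image_emb.T).squeeze(0)
score = 0.5 * title_score + 0.5 * image_score  # equal weighting is an arbitrary choice

ranking = products.assign(score=score.tolist()).sort_values("score", ascending=False)
print(ranking[["product_id", "score"]])
```

The blended score would then feed a ranker or serve directly as a baseline; the point of the sketch is only that image and text signals can be combined per query-product pair once the image enrichment is available.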
Related papers
- Exploring Query Understanding for Amazon Product Search [62.53282527112405]
We study how query understanding-based ranking features influence the ranking process.
We propose a query understanding-based multi-task learning framework for ranking.
We present our studies and investigations using the real-world system on Amazon Search.
arXiv Detail & Related papers (2024-08-05T03:33:11Z) - MS MARCO Web Search: a Large-scale Information-rich Web Dataset with Millions of Real Click Labels [95.48844474720798]
We introduce MS MARCO Web Search, the first large-scale information-rich web dataset.
This dataset mimics real-world web document and query distribution.
MS MARCO Web Search offers a retrieval benchmark with three web retrieval challenge tasks.
arXiv Detail & Related papers (2024-05-13T07:46:44Z) - Align before Search: Aligning Ads Image to Text for Accurate Cross-Modal Sponsored Search [27.42717207107]
Cross-Modal sponsored search displays multi-modal advertisements (ads) when consumers look for desired products by natural language queries in search engines.
The ability to align ads-specific information in both images and texts is crucial for accurate and flexible sponsored search.
We propose a simple alignment network for explicitly mapping fine-grained visual parts in ads images to the corresponding text.
arXiv Detail & Related papers (2023-09-28T03:43:57Z) - End-to-end Knowledge Retrieval with Multi-modal Queries [50.01264794081951]
ReMuQ requires a system to retrieve knowledge from a large corpus by integrating contents from both text and image queries.
We introduce a retriever model "ReViz" that can directly process input text and images to retrieve relevant knowledge in an end-to-end fashion.
We demonstrate superior performance in retrieval on two datasets under zero-shot settings.
arXiv Detail & Related papers (2023-06-01T08:04:12Z) - Visually Similar Products Retrieval for Shopsy [0.0]
We design a visual search system for reseller commerce using a multi-task learning approach.
Our model combines three different tasks: attribute classification, triplet ranking, and a variational autoencoder (VAE).
arXiv Detail & Related papers (2022-10-10T10:59:18Z) - Shopping Queries Dataset: A Large-Scale ESCI Benchmark for Improving Product Search [26.772851310517954]
This paper introduces the "Shopping Queries dataset", a large dataset of difficult Amazon search queries and results.
The dataset contains around 130 thousand unique queries and 2.6 million manually labeled (query, product) relevance judgements.
The dataset is being used in one of the KDDCup'22 challenges.
arXiv Detail & Related papers (2022-06-14T04:25:26Z) - Progressive Learning for Image Retrieval with Hybrid-Modality Queries [48.79599320198615]
Image retrieval with hybrid-modality queries is also known as composing text and image for image retrieval (CTI-IR).
We decompose the CTI-IR task into a three-stage learning problem to progressively learn the complex knowledge for image retrieval with hybrid-modality queries.
Our proposed model significantly outperforms state-of-the-art methods in mean Recall@K, by 24.9% and 9.5% on the Fashion-IQ and Shoes benchmark datasets, respectively.
arXiv Detail & Related papers (2022-04-24T08:10:06Z) - Single-Modal Entropy based Active Learning for Visual Question Answering [75.1682163844354]
We address Active Learning in the multi-modal setting of Visual Question Answering (VQA).
In light of the multi-modal inputs, image and question, we propose a novel method for effective sample acquisition.
Our novel idea is simple to implement, cost-efficient, and readily adaptable to other multi-modal tasks.
arXiv Detail & Related papers (2021-10-21T05:38:45Z) - Exposing Query Identification for Search Transparency [69.06545074617685]
We explore the feasibility of approximate exposing query identification (EQI) as a retrieval task by reversing the role of queries and documents in two classes of search systems.
We derive an evaluation metric to measure the quality of a ranking of exposing queries and conduct an empirical analysis focusing on various practical aspects of approximate EQI.
arXiv Detail & Related papers (2021-10-14T20:19:27Z) - eProduct: A Million-Scale Visual Search Benchmark to Address Product Recognition Challenges [8.204924070199866]
eProduct is a benchmark dataset for training and evaluation on various visual search solutions in a real-world setting.
eProduct comprises a training set and an evaluation set; the training set contains 1.3M+ listing images with titles and hierarchical category labels for model development.
We present eProduct's construction steps, analyze its diversity, and report the performance of baseline models trained on it.
arXiv Detail & Related papers (2021-07-13T05:28:34Z)