Overview of the TREC 2023 Product Product Search Track
- URL: http://arxiv.org/abs/2311.07861v2
- Date: Wed, 15 Nov 2023 12:22:35 GMT
- Title: Overview of the TREC 2023 Product Product Search Track
- Authors: Daniel Campos, Surya Kallumadi, Corby Rosset, Cheng Xiang Zhai,
Alessandro Magnani
- Abstract summary: This is the first year of the TREC Product search track.
The focus was the creation of a reusable collection.
We leverage the new product search corpus, which includes contextual metadata.
- Score: 70.56592126043546
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: This is the first year of the TREC Product search track. The focus this year
was the creation of a reusable collection and evaluation of the impact of the
use of metadata and multi-modal data on retrieval accuracy. This year we
leverage the new product search corpus, which includes contextual metadata. Our
analysis shows that in the product search domain, traditional retrieval systems
are highly effective and commonly outperform general-purpose pretrained
embedding models. Our analysis also evaluates the impact of using simplified
and metadata-enhanced collections, finding no clear trend in the impact of the
expanded collection. We also see some surprising outcomes; despite their
widespread adoption and competitive performance on other tasks, we find
single-stage dense retrieval runs can commonly be noncompetitive or generate
low-quality results both in the zero-shot and fine-tuned domain.
Related papers
- A CLIP-Powered Framework for Robust and Generalizable Data Selection [51.46695086779598]
Real-world datasets often contain redundant and noisy data, imposing a negative impact on training efficiency and model performance.
Data selection has shown promise in identifying the most representative samples from the entire dataset.
We propose a novel CLIP-powered data selection framework that leverages multimodal information for more robust and generalizable sample selection.
arXiv Detail & Related papers (2024-10-15T03:00:58Z) - DiscoveryBench: Towards Data-Driven Discovery with Large Language Models [50.36636396660163]
We present DiscoveryBench, the first comprehensive benchmark that formalizes the multi-step process of data-driven discovery.
Our benchmark contains 264 tasks collected across 6 diverse domains, such as sociology and engineering.
Our benchmark, thus, illustrates the challenges in autonomous data-driven discovery and serves as a valuable resource for the community to make progress.
arXiv Detail & Related papers (2024-07-01T18:58:22Z) - ACLSum: A New Dataset for Aspect-based Summarization of Scientific
Publications [10.529898520273063]
ACLSum is a novel summarization dataset carefully crafted and evaluated by domain experts.
In contrast to previous datasets, ACLSum facilitates multi-aspect summarization of scientific papers.
arXiv Detail & Related papers (2024-03-08T13:32:01Z) - Entity-Graph Enhanced Cross-Modal Pretraining for Instance-level Product
Retrieval [152.3504607706575]
This research aims to conduct weakly-supervised multi-modal instance-level product retrieval for fine-grained product categories.
We first contribute the Product1M datasets, and define two real practical instance-level retrieval tasks.
We exploit to train a more effective cross-modal model which is adaptively capable of incorporating key concept information from the multi-modal data.
arXiv Detail & Related papers (2022-06-17T15:40:45Z) - An Informative Tracking Benchmark [133.0931262969931]
We develop a small and informative tracking benchmark (ITB) with 7% out of 1.2 M frames of existing and newly collected datasets.
We select the most informative sequences from existing benchmarks taking into account 1) challenging level, 2) discriminative strength, 3) and density of appearance variations.
By analyzing the results of 15 state-of-the-art trackers re-trained on the same data, we determine the effective methods for robust tracking under each scenario.
arXiv Detail & Related papers (2021-12-13T07:56:16Z) - BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information
Retrieval Models [41.45240621979654]
We introduce BEIR, a heterogeneous benchmark for information retrieval.
We study the effectiveness of nine state-of-the-art retrieval models in a zero-shot evaluation setup.
Dense-retrieval models are computationally more efficient but often underperform other approaches.
arXiv Detail & Related papers (2021-04-17T23:29:55Z) - The Multimodal Sentiment Analysis in Car Reviews (MuSe-CaR) Dataset:
Collection, Insights and Improvements [14.707930573950787]
We present MuSe-CaR, a first of its kind multimodal dataset.
The data is publicly available as it recently served as the testing bed for the 1st Multimodal Sentiment Analysis Challenge.
arXiv Detail & Related papers (2021-01-15T10:40:37Z) - Novel Human-Object Interaction Detection via Adversarial Domain
Generalization [103.55143362926388]
We study the problem of novel human-object interaction (HOI) detection, aiming at improving the generalization ability of the model to unseen scenarios.
The challenge mainly stems from the large compositional space of objects and predicates, which leads to the lack of sufficient training data for all the object-predicate combinations.
We propose a unified framework of adversarial domain generalization to learn object-invariant features for predicate prediction.
arXiv Detail & Related papers (2020-05-22T22:02:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.