Analyzing the Effectiveness of Listwise Reranking with Positional Invariance on Temporal Generalizability
- URL: http://arxiv.org/abs/2407.06716v3
- Date: Mon, 30 Sep 2024 10:49:26 GMT
- Title: Analyzing the Effectiveness of Listwise Reranking with Positional Invariance on Temporal Generalizability
- Authors: Soyoung Yoon, Jongyoon Kim, Seung-won Hwang
- Abstract summary: We highlight the gap between studying retrieval performance on static knowledge documents and understanding performance in real-world environments.
Our findings demonstrate the effectiveness of a listwise reranking approach, which proficiently handles inaccuracies induced by temporal distribution shifts.
Among listwise rerankers, our findings show that ListT5 effectively mitigates the positional bias problem by adopting the Fusion-in-Decoder architecture.
- Score: 20.797306325588153
- Abstract: This working note outlines our participation in the retrieval task at CLEF 2024. We highlight the considerable gap between studying retrieval performance on static knowledge documents and understanding performance in real-world environments. Addressing these discrepancies and measuring the temporal persistence of IR systems is therefore crucial. By investigating the LongEval benchmark, which is specifically designed for such dynamic environments, we demonstrate the effectiveness of a listwise reranking approach, which proficiently handles inaccuracies induced by temporal distribution shifts. Among listwise rerankers, ListT5, which effectively mitigates the positional bias problem by adopting the Fusion-in-Decoder architecture, is especially effective, and increasingly so as temporal drift grows, as seen on the test-long subset.
Related papers
- Replicability Measures for Longitudinal Information Retrieval Evaluation [3.4917392789760147]
This work explores how the effectiveness measured in evolving experiments can be assessed.
The persistency of effectiveness is investigated as a replicability task.
It was found that the most effective systems are not necessarily the ones with the most persistent performance.
arXiv Detail & Related papers (2024-09-09T08:19:43Z)
- Breaking the Hourglass Phenomenon of Residual Quantization: Enhancing the Upper Bound of Generative Retrieval [16.953923822238455]
Generative retrieval (GR) has emerged as a transformative paradigm in search and recommender systems.
The "hourglass" phenomenon substantially impacts the performance of RQ-SID in generative retrieval.
We propose effective solutions to mitigate this issue, thereby enhancing the effectiveness of generative retrieval in real-world E-commerce applications.
arXiv Detail & Related papers (2024-07-31T09:52:53Z)
- A Thorough Performance Benchmarking on Lightweight Embedding-based Recommender Systems [67.52782366565658]
State-of-the-art recommender systems (RSs) depend on categorical features, which are encoded by embedding vectors, resulting in excessively large embedding tables.
Despite the prosperity of lightweight embedding-based RSs, a wide diversity is seen in evaluation protocols.
This study investigates various LERS' performance, efficiency, and cross-task transferability via a thorough benchmarking process.
arXiv Detail & Related papers (2024-06-25T07:45:00Z)
- Synchronous Faithfulness Monitoring for Trustworthy Retrieval-Augmented Generation [96.78845113346809]
Retrieval-augmented language models (RALMs) have shown strong performance and wide applicability in knowledge-intensive tasks.
This paper proposes SynCheck, a lightweight monitor that leverages fine-grained decoding dynamics to detect unfaithful sentences.
We also introduce FOD, a faithfulness-oriented decoding algorithm guided by beam search for long-form retrieval-augmented generation.
arXiv Detail & Related papers (2024-06-19T16:42:57Z)
- ACE: Off-Policy Actor-Critic with Causality-Aware Entropy Regularization [52.5587113539404]
We introduce a causality-aware entropy term that effectively identifies and prioritizes actions with high potential impacts for efficient exploration.
Our proposed algorithm, ACE: Off-policy Actor-critic with Causality-aware Entropy regularization, demonstrates a substantial performance advantage across 29 diverse continuous control tasks.
arXiv Detail & Related papers (2024-02-22T13:22:06Z)
- Investigating the Robustness of Sequential Recommender Systems Against Training Data Perturbations [9.463133630647569]
We introduce Finite Rank-Biased Overlap (FRBO), an enhanced similarity tailored explicitly for finite rankings.
We empirically investigate the impact of removing items at different positions within a temporally ordered sequence.
Our results demonstrate that removing items at the end of the sequence has a statistically significant impact on performance.
arXiv Detail & Related papers (2023-07-24T23:26:46Z)
- Exploring the Practicality of Generative Retrieval on Dynamic Corpora [41.223804434693875]
In this paper, we focus on Generative Retrievals (GR), which apply autoregressive language models to IR problems.
Our results on the StreamingQA benchmark demonstrate that GR is more adaptable to evolving knowledge (4-11%), robust in learning knowledge with temporal information, and efficient in terms of FLOPs (x6), indexing time (x6), and storage footprint (x4).
Our paper highlights the potential of GR for future use in practical IR systems within dynamic environments.
arXiv Detail & Related papers (2023-05-27T16:05:00Z)
- Delayed Reinforcement Learning by Imitation [31.932677462399468]
We present a novel algorithm that learns how to act in a delayed environment from undelayed demonstrations.
We show that DIDA obtains high performance with remarkable sample efficiency on a variety of tasks.
arXiv Detail & Related papers (2022-05-11T15:27:33Z)
- Spatio-temporal Gait Feature with Adaptive Distance Alignment [90.5842782685509]
We try to increase the difference of gait features of different subjects from two aspects: the optimization of network structure and the refinement of extracted gait features.
Our proposed method consists of Spatio-temporal Feature Extraction (SFE) and Adaptive Distance Alignment (ADA).
ADA uses a large amount of unlabeled real-world gait data as a benchmark to refine the extracted spatio-temporal features so that they have low inter-class similarity and high intra-class similarity.
arXiv Detail & Related papers (2022-03-07T13:34:00Z)
- Towards Unbiased Visual Emotion Recognition via Causal Intervention [63.74095927462]
We propose a novel Interventional Emotion Recognition Network (IERN) to alleviate the negative effects of dataset bias.
A series of designed tests validate the effectiveness of IERN, and experiments on three emotion benchmarks demonstrate that IERN outperforms other state-of-the-art approaches.
arXiv Detail & Related papers (2021-07-26T10:40:59Z)
- Finding Action Tubes with a Sparse-to-Dense Framework [62.60742627484788]
We propose a framework that generates action tube proposals from video streams with a single forward pass in a sparse-to-dense manner.
We evaluate the efficacy of our model on the UCF101-24, JHMDB-21 and UCFSports benchmark datasets.
arXiv Detail & Related papers (2020-08-30T15:38:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.