Improving Out-of-Distribution Generalization of Neural Rerankers with
Contextualized Late Interaction
- URL: http://arxiv.org/abs/2302.06589v1
- Date: Mon, 13 Feb 2023 18:42:17 GMT
- Title: Improving Out-of-Distribution Generalization of Neural Rerankers with
Contextualized Late Interaction
- Authors: Xinyu Zhang, Minghan Li, and Jimmy Lin
- Abstract summary: Late interaction, the simplest form of multi-vector, is also helpful to neural rerankers that only use the [CLS] vector to compute the similarity score.
We show that the finding is consistent across different model sizes and first-stage retrievers of diverse natures.
- Score: 52.63663547523033
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent progress in information retrieval finds that embedding
queries and documents into multi-vector representations yields a robust
bi-encoder retriever
on out-of-distribution datasets. In this paper, we explore whether late
interaction, the simplest form of multi-vector, is also helpful to neural
rerankers that only use the [CLS] vector to compute the similarity score.
Although intuitively, the attention mechanism of rerankers at the previous
layers already gathers the token-level information, we find adding late
interaction still brings an extra 5% improvement on average on
out-of-distribution datasets, with little increase in latency and no
degradation in in-domain effectiveness. Through extensive experiments and
analysis, we show that the finding is consistent across different model sizes
and first-stage retrievers of diverse natures and that the improvement is more
prominent on longer queries.
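The mechanism is easy to state in code. Below is a minimal sketch, not the authors' implementation, of adding ColBERT-style late interaction on top of a [CLS]-based cross-encoder reranker: each query token's contextualized embedding is matched against its best document token (MaxSim), and the summed similarity is mixed into the usual [CLS] score. The function names and the mixing weight alpha are illustrative assumptions.

```python
# Minimal sketch (not the authors' code) of late interaction added on
# top of a [CLS]-based cross-encoder reranker.
import torch

def maxsim(query_embs: torch.Tensor, doc_embs: torch.Tensor) -> torch.Tensor:
    """Late interaction: for each query token, take its max cosine
    similarity over all document tokens, then sum over query tokens.

    query_embs: [num_q_tokens, dim]; doc_embs: [num_d_tokens, dim]
    """
    q = torch.nn.functional.normalize(query_embs, dim=-1)
    d = torch.nn.functional.normalize(doc_embs, dim=-1)
    sim = q @ d.T                          # [num_q_tokens, num_d_tokens]
    return sim.max(dim=-1).values.sum()

def score_with_late_interaction(cls_score, query_embs, doc_embs, alpha=1.0):
    # Combine the usual [CLS] relevance score with the late-interaction
    # term; alpha is a hypothetical mixing weight.
    return cls_score + alpha * maxsim(query_embs, doc_embs)
```

In a reranker, `query_embs` and `doc_embs` would be the last-layer token representations of the query and document segments of the cross-encoder input, so the extra term reuses computation the model already performs.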
Related papers
- Consensus-Adaptive RANSAC [104.87576373187426]
We propose a new RANSAC framework that learns to explore the parameter space by considering the residuals seen so far via a novel attention layer.
The attention mechanism operates on a batch of point-to-model residuals, and updates a per-point estimation state to take into account the consensus found through a lightweight one-step transformer.
arXiv Detail & Related papers (2023-07-26T08:25:46Z)
- ReFIT: Relevance Feedback from a Reranker during Inference [109.33278799999582]
Retrieve-and-rerank is a prevalent framework in neural information retrieval.
We propose to leverage the reranker to improve recall by making it provide relevance feedback to the retriever at inference time.
arXiv Detail & Related papers (2023-05-19T15:30:33Z)
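A hedged sketch of the ReFIT idea as we read it from the abstract: the reranker's scores over the top-k candidates serve as a feedback target, and the query embedding is nudged so the retriever's score distribution matches it before a second retrieval pass. All names and hyperparameters below are assumptions, not the authors' code.

```python
# Sketch of reranker-to-retriever relevance feedback at inference time.
import torch
import torch.nn.functional as F

def refine_query(q_emb, doc_embs, reranker_scores, steps=3, lr=0.05):
    """q_emb: [dim]; doc_embs: [k, dim]; reranker_scores: [k]."""
    q = q_emb.clone().requires_grad_(True)
    target = F.softmax(reranker_scores, dim=-1)   # reranker distribution
    opt = torch.optim.Adam([q], lr=lr)
    for _ in range(steps):
        retriever_logits = doc_embs @ q           # dot-product scores
        loss = F.kl_div(F.log_softmax(retriever_logits, dim=-1),
                        target, reduction="batchmean")
        opt.zero_grad()
        loss.backward()
        opt.step()
    return q.detach()                             # use for a second retrieval
```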
- Agent-State Construction with Auxiliary Inputs [16.79847469127811]
We present a series of examples illustrating the different ways of using auxiliary inputs for reinforcement learning.
We show that these auxiliary inputs can be used to discriminate between observations that would otherwise be aliased.
This approach is complementary to state-of-the-art methods such as recurrent neural networks and truncated back-propagation.
arXiv Detail & Related papers (2022-11-15T00:18:14Z)
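One concrete auxiliary input in this vein is an exponentially decaying trace of past observations: two observations that are identical in isolation acquire different traces when reached along different histories. A sketch under that assumption (not the paper's code; the decay rate `lam` is an illustrative hyperparameter):

```python
# Decaying-trace auxiliary input for de-aliasing observations.
import numpy as np

class ObservationTrace:
    def __init__(self, obs_dim: int, lam: float = 0.9):
        self.lam = lam
        self.trace = np.zeros(obs_dim)

    def update(self, obs: np.ndarray) -> np.ndarray:
        # trace <- lam * trace + (1 - lam) * obs
        self.trace = self.lam * self.trace + (1.0 - self.lam) * obs
        return self.trace

    def agent_state(self, obs: np.ndarray) -> np.ndarray:
        # Concatenate the raw observation with its trace to form the
        # agent state fed to the policy/value network.
        return np.concatenate([obs, self.update(obs)])
```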
- Generating Sparse Counterfactual Explanations For Multivariate Time Series [0.5161531917413706]
We propose a generative adversarial network (GAN) architecture that generates SPARse Counterfactual Explanations for multivariate time series.
Our approach provides a custom sparsity layer and regularizes the counterfactual loss function in terms of similarity, sparsity, and smoothness of trajectories.
We evaluate our approach on real-world human motion datasets as well as a synthetic time series interpretability benchmark.
arXiv Detail & Related papers (2022-06-02T08:47:06Z)
- CCLF: A Contrastive-Curiosity-Driven Learning Framework for Sample-Efficient Reinforcement Learning [56.20123080771364]
We develop a model-agnostic Contrastive-Curiosity-Driven Learning Framework (CCLF) for reinforcement learning.
CCLF fully exploits sample importance and improves learning efficiency in a self-supervised manner.
We evaluate this approach on the DeepMind Control Suite, Atari, and MiniGrid benchmarks.
arXiv Detail & Related papers (2022-05-02T14:42:05Z)
- Last Layer Re-Training is Sufficient for Robustness to Spurious Correlations [51.552870594221865]
We show that last layer retraining can match or outperform state-of-the-art approaches on spurious correlation benchmarks.
We also show that last layer retraining on large ImageNet-trained models can significantly reduce reliance on background and texture information.
arXiv Detail & Related papers (2022-04-06T16:55:41Z)
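The recipe is compact enough to sketch: freeze the pretrained feature extractor and refit only the final linear layer, for example on a small group-balanced held-out set. The code below is illustrative; `backbone`, `balanced_loader`, and the dimensions are placeholders.

```python
# Sketch of last-layer retraining on a frozen backbone.
import torch
import torch.nn as nn

def retrain_last_layer(backbone, balanced_loader, feat_dim, n_classes,
                       epochs=10, lr=1e-3):
    backbone.eval()                      # freeze the feature extractor
    for p in backbone.parameters():
        p.requires_grad_(False)

    head = nn.Linear(feat_dim, n_classes)
    opt = torch.optim.Adam(head.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()

    for _ in range(epochs):
        for x, y in balanced_loader:
            with torch.no_grad():
                feats = backbone(x)      # reuse frozen features
            loss = loss_fn(head(feats), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return head
```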
- ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction [15.336103841957328]
ColBERTv2 is a retriever that couples an aggressive residual compression mechanism with a denoised supervision strategy.
We evaluate ColBERTv2 across a range of benchmarks, establishing state-of-the-art quality within and outside the training domain.
arXiv Detail & Related papers (2021-12-02T18:38:50Z)
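To illustrate what residual compression means here, the sketch below stores each token embedding as the id of its nearest centroid plus a coarsely quantized residual. This is our illustration of the idea, not the official ColBERTv2 implementation, and the quantization details are assumptions.

```python
# Sketch of centroid + quantized-residual compression of token embeddings.
import torch

def compress(emb, centroids, n_bits=1):
    """emb: [dim]; centroids: [k, dim]. Returns (centroid_id, codes, scale)."""
    cid = torch.cdist(emb[None], centroids).argmin().item()
    residual = emb - centroids[cid]
    scale = residual.abs().max() + 1e-9
    levels = 2 ** n_bits
    # Uniform quantization of the residual into `levels` buckets.
    codes = torch.clamp(((residual / scale + 1) / 2 * levels).floor(),
                        0, levels - 1)
    return cid, codes.to(torch.uint8), scale

def decompress(cid, codes, scale, centroids, n_bits=1):
    levels = 2 ** n_bits
    # Dequantize to bucket midpoints and add back the centroid.
    residual = ((codes.float() + 0.5) / levels * 2 - 1) * scale
    return centroids[cid] + residual
```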
- Recurrent Feedback Improves Recognition of Partially Occluded Objects [1.452875650827562]
We investigate whether and how artificial neural networks also benefit from recurrence.
We find that classification accuracy is significantly higher for recurrent models when compared to feedforward models of matched parametric complexity.
arXiv Detail & Related papers (2021-04-21T16:18:34Z)
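A minimal way to add recurrence to a feedforward classifier is to feed high-level features back into the early representation for a few time steps before predicting. The architecture below is illustrative only and is not claimed to match the paper's models.

```python
# Sketch of a classifier with an unrolled recurrent feedback loop.
import torch
import torch.nn as nn

class RecurrentFeedbackNet(nn.Module):
    def __init__(self, in_ch=3, hidden=32, n_classes=10, steps=4):
        super().__init__()
        self.steps = steps
        self.encode = nn.Conv2d(in_ch, hidden, 3, padding=1)
        self.feedback = nn.Conv2d(hidden, hidden, 3, padding=1)
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(hidden, n_classes))

    def forward(self, x):
        h = torch.relu(self.encode(x))
        state = torch.zeros_like(h)
        for _ in range(self.steps):      # unroll the feedback loop
            state = torch.relu(h + self.feedback(state))
        return self.head(state)
```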
- Representation Learning for Sequence Data with Deep Autoencoding Predictive Components [96.42805872177067]
We propose a self-supervised representation learning method for sequence data, based on the intuition that useful representations of sequence data should exhibit a simple structure in the latent space.
We encourage this latent structure by maximizing an estimate of predictive information of latent feature sequences, which is the mutual information between past and future windows at each time step.
We demonstrate that our method recovers the latent space of noisy dynamical systems, extracts predictive features for forecasting tasks, and improves automatic speech recognition when used to pretrain the encoder on large amounts of unlabeled data.
arXiv Detail & Related papers (2020-10-07T03:34:01Z)
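Under a Gaussian assumption, the predictive information between past and future latent windows can be estimated from covariance log-determinants, which is the quantity the encoder is trained to maximize in this line of work. A sketch of that estimator (names and the jitter term are ours):

```python
# Gaussian estimate of predictive information I(past; future) from a
# latent sequence: 0.5 * (logdet C_past + logdet C_future - logdet C_joint).
import torch

def predictive_information(z: torch.Tensor, window: int) -> torch.Tensor:
    """z: [T, dim] latent sequence; window: half-width in time steps."""
    T, d = z.shape
    # Stack concatenated (past, future) windows at every valid center t.
    pairs = torch.stack([z[t - window:t + window].reshape(-1)
                         for t in range(window, T - window + 1)])
    pairs = pairs - pairs.mean(dim=0)
    cov = pairs.T @ pairs / (pairs.shape[0] - 1)
    k = window * d                       # size of one window block

    def logdet(m):
        jitter = 1e-4 * torch.eye(m.shape[0])   # numerical stability
        return torch.logdet(m + jitter)

    return 0.5 * (logdet(cov[:k, :k]) + logdet(cov[k:, k:]) - logdet(cov))
```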
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.