Online Distillation for Pseudo-Relevance Feedback
- URL: http://arxiv.org/abs/2306.09657v1
- Date: Fri, 16 Jun 2023 07:26:33 GMT
- Title: Online Distillation for Pseudo-Relevance Feedback
- Authors: Sean MacAvaney, Xi Wang
- Abstract summary: We investigate whether a model for a specific query can be effectively distilled from neural re-ranking results.
We find that a lexical model distilled online can reasonably replicate the re-ranking of a neural model.
More importantly, these models can be used as queries that execute efficiently on indexes.
- Score: 16.523925354318983
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Model distillation has emerged as a prominent technique to improve neural
search models. To date, distillation has taken an offline approach, wherein a new
neural model is trained to predict relevance scores between arbitrary queries
and documents. In this paper, we explore a departure from this offline
distillation strategy by investigating whether a model for a specific query can
be effectively distilled from neural re-ranking results (i.e., distilling in an
online setting). Indeed, we find that a lexical model distilled online can
reasonably replicate the re-ranking of a neural model. More importantly, these
models can be used as queries that execute efficiently on indexes. This second
retrieval stage can enrich the pool of documents for re-ranking by identifying
documents that were missed in the first retrieval stage. Empirically, we show
that this approach performs favourably when compared with established
pseudo-relevance feedback techniques, dense retrieval methods, and sparse-dense
ensemble "hybrid" approaches.
Related papers
- Exploring the potential of prototype-based soft-labels data distillation for imbalanced data classification [0.0]
The main goal is to further improve the performance of prototype-based soft-labels distillation in terms of classification accuracy.
Experimental studies demonstrate the method's capability to distill the data and also its potential to act as an augmentation method.
arXiv Detail & Related papers (2024-03-25T19:15:19Z) - Towards a theory of model distillation [0.0]
Distillation is the task of replacing a complicated machine learning model with a simpler model that approximates the original.
We show how to efficiently distill neural networks into succinct, explicit decision tree representations.
We prove that distillation can be much cheaper than learning from scratch, and make progress on characterizing its complexity.
arXiv Detail & Related papers (2024-03-14T02:42:19Z) - Distilled Pruning: Using Synthetic Data to Win the Lottery [2.4366811507669124]
This work introduces a novel approach to pruning deep learning models by using distilled data.
Our approach can find sparse, trainable subnetworks up to 5x faster than Iterative Magnitude Pruning at comparable sparsity on CIFAR-10.
The experimental results highlight the potential of using distilled data for resource-efficient neural network pruning, model compression, and neural architecture search.
arXiv Detail & Related papers (2023-07-07T03:07:28Z) - BOOT: Data-free Distillation of Denoising Diffusion Models with
Bootstrapping [64.54271680071373]
Diffusion models have demonstrated excellent potential for generating diverse images.
Knowledge distillation has been recently proposed as a remedy that can reduce the number of inference steps to one or a few.
We present a novel technique called BOOT that overcomes the limitations of existing approaches with an efficient data-free distillation algorithm.
arXiv Detail & Related papers (2023-06-08T20:30:55Z) - Efficient Learning of Accurate Surrogates for Simulations of Complex Systems [0.0]
We introduce an online learning method driven by a sampling strategy that ensures all turning points on the model response surface are included in the training data.
We apply our method to simulations of nuclear matter to demonstrate that highly accurate surrogates can be reliably auto-generated.
arXiv Detail & Related papers (2022-07-11T20:51:11Z) - Improving Passage Retrieval with Zero-Shot Question Generation [109.11542468380331]
We propose a simple and effective re-ranking method for improving passage retrieval in open question answering.
The re-ranker re-scores retrieved passages with a zero-shot question generation model, which uses a pre-trained language model to compute the probability of the input question conditioned on a retrieved passage. A minimal sketch of this query-likelihood scoring appears after this list.
arXiv Detail & Related papers (2022-04-15T14:51:41Z) - Learning to Generate Synthetic Training Data using Gradient Matching and
Implicit Differentiation [77.34726150561087]
This article explores various data distillation techniques that can reduce the amount of data required to successfully train deep networks.
Inspired by recent ideas, we suggest new data distillation techniques based on generative teaching networks, gradient matching, and the Implicit Function Theorem.
arXiv Detail & Related papers (2022-03-16T11:45:32Z) - Contrastive Model Inversion for Data-Free Knowledge Distillation [60.08025054715192]
We propose Contrastive Model Inversion, where the data diversity is explicitly modeled as an optimizable objective.
Our main observation is that, under the constraint of the same amount of data, higher data diversity usually indicates stronger instance discrimination.
Experiments on CIFAR-10, CIFAR-100, and Tiny-ImageNet demonstrate that CMI achieves significantly superior performance when the generated data are used for knowledge distillation.
arXiv Detail & Related papers (2021-05-18T15:13:00Z) - Distilling Interpretable Models into Human-Readable Code [71.11328360614479]
Human-readability is an important and desirable standard for machine-learned model interpretability.
We propose to train interpretable models using conventional methods, and then distill them into concise, human-readable code.
We describe a piecewise-linear curve-fitting algorithm that produces high-quality results efficiently and reliably across a broad range of use cases.
arXiv Detail & Related papers (2021-01-21T01:46:36Z) - Automatic Recall Machines: Internal Replay, Continual Learning and the
Brain [104.38824285741248]
Replay in neural networks involves training on sequential data with memorized samples, which counteracts forgetting of previous behavior caused by non-stationarity.
We present a method where these auxiliary samples are generated on the fly, given only the model that is being trained for the assessed objective.
Instead, the implicit memory of learned samples within the assessed model itself is exploited.
arXiv Detail & Related papers (2020-06-22T15:07:06Z) - Combining data assimilation and machine learning to emulate a dynamical
model from sparse and noisy observations: a case study with the Lorenz 96
model [0.0]
The method consists of iteratively applying a data assimilation step, here an ensemble Kalman filter, and a neural network.
Data assimilation is used to optimally combine a surrogate model with sparse data.
The output analysis is spatially complete and is used as a training set by the neural network to update the surrogate model.
Numerical experiments have been carried out using the chaotic 40-variable Lorenz 96 model, demonstrating both convergence and statistical skill of the proposed hybrid approach.
arXiv Detail & Related papers (2020-01-06T12:26:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.