Towards Robust Neural Retrieval Models with Synthetic Pre-Training
- URL: http://arxiv.org/abs/2104.07800v1
- Date: Thu, 15 Apr 2021 22:12:01 GMT
- Title: Towards Robust Neural Retrieval Models with Synthetic Pre-Training
- Authors: Revanth Gangi Reddy, Vikas Yadav, Md Arafat Sultan, Martin Franz,
Vittorio Castelli, Heng Ji, Avirup Sil
- Abstract summary: We show that synthetic training examples generated using a sequence-to-sequence generator can be effective towards this goal.
In our experiments, pre-training with synthetic examples improves retrieval performance in both in-domain and out-of-domain evaluation on five different test sets.
- Score: 28.547347789198096
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent work has shown that commonly available machine reading comprehension
(MRC) datasets can be used to train high-performance neural information
retrieval (IR) systems. However, the evaluation of neural IR has so far been
limited to standard supervised learning settings, where such systems have outperformed
traditional term matching baselines. We conduct in-domain and out-of-domain
evaluations of neural IR, and seek to improve its robustness across different
scenarios, including zero-shot settings. We show that synthetic training
examples generated using a sequence-to-sequence generator can be effective
towards this goal: in our experiments, pre-training with synthetic examples
improves retrieval performance in both in-domain and out-of-domain evaluation
on five different test sets.
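To make the described pipeline concrete, below is a minimal sketch of synthetic pre-training for a dual-encoder retriever: a sequence-to-sequence generator produces synthetic questions from passages, and the resulting (question, passage) pairs pre-train the retriever with an in-batch-negative loss. The checkpoint names, encoder architecture, and loss are illustrative assumptions, not the paper's exact setup.

```python
# Hedged sketch of synthetic pre-training for retrieval. Checkpoint names,
# architecture, and loss are illustrative assumptions, not the paper's setup.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoModelForSeq2SeqLM, AutoTokenizer

# 1) Generate synthetic questions from passages with a seq2seq generator.
gen_name = "doc2query/msmarco-t5-base-v1"  # assumed question-generation checkpoint
gen_tok = AutoTokenizer.from_pretrained(gen_name)
generator = AutoModelForSeq2SeqLM.from_pretrained(gen_name)

passages = [
    "The Eiffel Tower was completed in 1889 for the World's Fair in Paris.",
    "Marie Curie won Nobel Prizes in both physics and chemistry.",
]
batch = gen_tok(passages, return_tensors="pt", padding=True, truncation=True)
with torch.no_grad():
    out = generator.generate(**batch, max_new_tokens=32, do_sample=True, top_k=10)
questions = gen_tok.batch_decode(out, skip_special_tokens=True)

# 2) Pre-train a shared-weight dual encoder on the synthetic (question, passage)
#    pairs with in-batch negatives: the i-th passage is the positive for the
#    i-th question; every other passage in the batch acts as a negative.
enc_tok = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
optim = torch.optim.AdamW(encoder.parameters(), lr=2e-5)

def embed(texts):
    enc = enc_tok(texts, return_tensors="pt", padding=True, truncation=True)
    return encoder(**enc).last_hidden_state[:, 0]  # [CLS] embeddings

scores = embed(questions) @ embed(passages).T  # question-passage similarity matrix
loss = F.cross_entropy(scores, torch.arange(len(questions)))
loss.backward()
optim.step()
```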
Related papers
- ChaosMining: A Benchmark to Evaluate Post-Hoc Local Attribution Methods in Low SNR Environments [14.284728947052743]
In this study, we examine the efficacy of post-hoc local attribution methods in distinguishing features with predictive power from irrelevant ones in domains characterized by a low signal-to-noise ratio (SNR).
Our experiments highlight its strengths in prediction and feature selection, alongside limitations in scalability.
arXiv Detail & Related papers (2024-06-17T23:39:29Z)
- Neural Network-Based Score Estimation in Diffusion Models: Optimization and Generalization [12.812942188697326]
Diffusion models have emerged as a powerful tool rivaling GANs in generating high-quality samples with improved fidelity, flexibility, and robustness.
A key component of these models is to learn the score function through score matching.
Despite empirical success on various tasks, it remains unclear whether gradient-based algorithms can learn the score function with a provable accuracy.
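For background, the score-matching objective mentioned above can be illustrated with a minimal denoising score matching step; the two-layer network and single noise level below are simplifying assumptions, not the paper's construction.

```python
# Minimal denoising score matching step (illustrative, single noise level).
import torch
import torch.nn as nn

score_net = nn.Sequential(nn.Linear(2, 64), nn.SiLU(), nn.Linear(64, 2))
optim = torch.optim.Adam(score_net.parameters(), lr=1e-3)

x = torch.randn(256, 2)        # stand-in samples from the data distribution
sigma = 0.5                    # fixed corruption scale for simplicity
noise = torch.randn_like(x)
x_noisy = x + sigma * noise

# For Gaussian corruption, the score of p(x_noisy | x) is -(x_noisy - x)/sigma^2,
# i.e. -noise / sigma, which serves as the regression target.
target = -noise / sigma
loss = ((score_net(x_noisy) - target) ** 2).mean()
loss.backward()
optim.step()
```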
arXiv Detail & Related papers (2024-01-28T08:13:56Z)
- Harnessing Orthogonality to Train Low-Rank Neural Networks [0.07538606213726905]
This study explores the learning dynamics of neural networks by analyzing the singular value decomposition (SVD) of their weights throughout training.
We introduce Orthogonality-Informed Adaptive Low-Rank (OIALR) training, a novel training method exploiting the intrinsic orthogonality of neural networks.
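To illustrate the SVD-based viewpoint, here is a small sketch that factorizes a layer's weights and truncates them to low rank; the energy cutoff and reconstruction step are illustrative choices, not the OIALR algorithm itself.

```python
# Sketch: inspect a layer's singular value spectrum and form a low-rank
# factorization. Illustrates the SVD viewpoint, not the OIALR algorithm.
import torch

layer = torch.nn.Linear(256, 128)
W = layer.weight.detach()                      # shape (128, 256)
U, S, Vh = torch.linalg.svd(W, full_matrices=False)

# Pick the smallest rank capturing 95% of the spectral energy (arbitrary cutoff).
energy = torch.cumsum(S**2, dim=0) / torch.sum(S**2)
r = int(torch.searchsorted(energy, torch.tensor(0.95))) + 1

A = U[:, :r] * S[:r]                           # (128, r) scaled left factor
B = Vh[:r]                                     # (r, 256) orthonormal rows
rel_err = torch.linalg.norm(W - A @ B) / torch.linalg.norm(W)
print(f"rank {r}/{len(S)}, relative reconstruction error {rel_err:.3f}")
```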
arXiv Detail & Related papers (2024-01-16T17:07:22Z)
- Noisy Self-Training with Synthetic Queries for Dense Retrieval [49.49928764695172]
We introduce a novel noisy self-training framework combined with synthetic queries.
Experimental results show that our method improves consistently over existing methods.
Our method is data efficient and outperforms competitive baselines.
arXiv Detail & Related papers (2023-11-27T06:19:50Z)
- Towards Theoretically Inspired Neural Initialization Optimization [66.04735385415427]
We propose a differentiable quantity, named GradCosine, with theoretical insights to evaluate the initial state of a neural network.
We show that both the training and test performance of a network can be improved by maximizing GradCosine under norm constraint.
Generalizing the sample-wise analysis to the real batch setting, the resulting Neural Initialization Optimization (NIO) algorithm automatically searches for a better initialization at negligible cost.
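As a rough illustration of the kind of quantity GradCosine measures, the sketch below computes pairwise cosine similarities between per-sample gradients at initialization; the model, data, and averaging scheme are assumptions for illustration.

```python
# Sketch: pairwise cosine similarity between per-sample gradients at init.
# Illustrates the kind of quantity GradCosine measures; details differ.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
model = torch.nn.Sequential(torch.nn.Linear(10, 32), torch.nn.ReLU(),
                            torch.nn.Linear(32, 2))
x, y = torch.randn(8, 10), torch.randint(0, 2, (8,))

def flat_grad(xi, yi):
    loss = F.cross_entropy(model(xi.unsqueeze(0)), yi.unsqueeze(0))
    grads = torch.autograd.grad(loss, list(model.parameters()))
    return torch.cat([g.flatten() for g in grads])

g = F.normalize(torch.stack([flat_grad(x[i], y[i]) for i in range(len(x))]), dim=1)
cos = g @ g.T                                   # pairwise gradient cosines
off_diag = cos[~torch.eye(len(x), dtype=torch.bool)]
print("mean pairwise gradient cosine:", off_diag.mean().item())
```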
arXiv Detail & Related papers (2022-10-12T06:49:16Z)
- New Machine Learning Techniques for Simulation-Based Inference: InferoStatic Nets, Kernel Score Estimation, and Kernel Likelihood Ratio Estimation [4.415977307120616]
We propose a machine-learning approach to model the score and likelihood ratio estimators in cases when the probability density can be sampled but not computed directly.
We introduce new strategies, respectively called Kernel Score Estimation (KSE) and Kernel Likelihood Ratio Estimation (KLRE) to learn the score and the likelihood ratio functions from simulated data.
arXiv Detail & Related papers (2022-10-04T15:22:56Z)
- Improving Music Performance Assessment with Contrastive Learning [78.8942067357231]
This study investigates contrastive learning as a potential method to improve existing music performance assessment (MPA) systems.
We introduce a weighted contrastive loss suitable for regression tasks applied to a convolutional neural network.
Our results show that contrastive learning-based methods are able to match and exceed state-of-the-art (SoTA) performance for MPA regression tasks.
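As one hedged guess at what a contrastive loss adapted to regression might look like, the sketch below weights embedding pairs by label distance so that similarly scored examples are pulled together; this is a generic construction for illustration, not the paper's exact loss.

```python
# Sketch of a label-distance-weighted contrastive loss for regression.
# A generic illustrative construction, not the paper's exact formulation.
import torch
import torch.nn.functional as F

x = torch.randn(16, 64, requires_grad=True)    # stand-in for network embeddings
scores = torch.rand(16)                        # continuous performance ratings

emb = F.normalize(x, dim=1)
sim = emb @ emb.T                              # pairwise cosine similarities
label_dist = (scores[:, None] - scores[None, :]).abs()
weight = 1.0 - label_dist / label_dist.max()   # close labels -> weight near 1

mask = ~torch.eye(16, dtype=torch.bool)        # exclude self-pairs
# High-weight pairs are pulled together; low-weight pairs are pushed apart.
loss = (weight[mask] * (1 - sim[mask])
        + (1 - weight[mask]) * sim[mask].clamp(min=0)).mean()
loss.backward()
```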
arXiv Detail & Related papers (2021-08-03T19:24:25Z)
- PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning [109.84770951839289]
We present PredRNN, a new recurrent network for learning visual dynamics from historical context.
We show that our approach obtains highly competitive results on three standard datasets.
arXiv Detail & Related papers (2021-03-17T08:28:30Z)
- Multi-Sample Online Learning for Spiking Neural Networks based on Generalized Expectation Maximization [42.125394498649015]
Spiking Neural Networks (SNNs) capture some of the efficiency of biological brains by processing information through binary neural dynamic activations.
This paper proposes to leverage multiple compartments that sample independent spiking signals while sharing synaptic weights.
The key idea is to use these signals to obtain more accurate statistical estimates of the log-likelihood training criterion, as well as of its gradient.
arXiv Detail & Related papers (2021-02-05T16:39:42Z)
- CDEvalSumm: An Empirical Study of Cross-Dataset Evaluation for Neural Summarization Systems [121.78477833009671]
We investigate the performance of different summarization models under a cross-dataset setting.
A comprehensive study of 11 representative summarization systems on 5 datasets from different domains reveals the effect of model architectures and generation ways.
arXiv Detail & Related papers (2020-10-11T02:19:15Z) - Provably Efficient Neural Estimation of Structural Equation Model: An
Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these networks using gradient descent.
For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
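To illustrate the min-max formulation in generic form, the sketch below trains two small networks by simultaneous gradient descent-ascent on a toy objective whose inner maximum equals 0.5 E[f(x)^2], so the outer minimization drives f toward zero; the objective is a stand-in, not the paper's operator equation.

```python
# Toy gradient descent-ascent on a min-max game between two networks.
# The objective is a stand-in; the paper's operator equation is more involved.
import torch

f = torch.nn.Linear(4, 1)                      # minimizing player
g = torch.nn.Linear(4, 1)                      # maximizing player (adversary)
opt_f = torch.optim.SGD(f.parameters(), lr=1e-2)
opt_g = torch.optim.SGD(g.parameters(), lr=1e-2)

x = torch.randn(64, 4)
for _ in range(200):
    value = (f(x) * g(x)).mean() - 0.5 * (g(x) ** 2).mean()
    opt_f.zero_grad(); opt_g.zero_grad()
    value.backward()
    for p in g.parameters():                   # flip sign so g ascends
        p.grad.neg_()
    opt_f.step()
    opt_g.step()

print("E[f(x)^2] after training:", (f(x) ** 2).mean().item())
```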
arXiv Detail & Related papers (2020-07-02T17:55:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.