PARENTing via Model-Agnostic Reinforcement Learning to Correct
Pathological Behaviors in Data-to-Text Generation
- URL: http://arxiv.org/abs/2010.10866v2
- Date: Thu, 22 Oct 2020 13:00:20 GMT
- Title: PARENTing via Model-Agnostic Reinforcement Learning to Correct
Pathological Behaviors in Data-to-Text Generation
- Authors: Cl\'ement Rebuffel, Laure Soulier, Geoffrey Scoutheeten, Patrick
Gallinari
- Abstract summary: We show that a model-agnostic framework relying on the recently introduced PARENT metric is efficient at reducing both hallucinations and omissions.
Evaluations on the widely used WikiBIO and WebNLG benchmarks demonstrate the effectiveness of this framework compared to state-of-the-art models.
- Score: 11.687228500584082
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In language generation models conditioned by structured data, the classical
training via maximum likelihood almost always leads models to pick up on
dataset divergence (i.e., hallucinations or omissions), and to incorporate them
erroneously in their own generations at inference. In this work, we build ontop
of previous Reinforcement Learning based approaches and show that a
model-agnostic framework relying on the recently introduced PARENT metric is
efficient at reducing both hallucinations and omissions. Evaluations on the
widely used WikiBIO and WebNLG benchmarks demonstrate the effectiveness of this
framework compared to state-of-the-art models.
Related papers
- Beyond Under-Alignment: Atomic Preference Enhanced Factuality Tuning for Large Language Models [19.015202590038996]
We evaluate the factuality of different models tuned by various preference learning algorithms.
We propose textbfAPEFT (textbfAtomic textbfPreference textbfEnhanced textbfFactuality textbfTuning) to enhance model's awareness of factuality.
arXiv Detail & Related papers (2024-06-18T09:07:30Z) - Collaborative decoding of critical tokens for boosting factuality of
large language models [57.504894664689]
Finetuned and aligned models show improved abilities of instruction following and safe generation.
The common practice of using sampling during generation also increases chances of hallucination.
We introduce a collaborative decoding framework to harness the high factuality within pretrained models through the concept of critical tokens.
arXiv Detail & Related papers (2024-02-28T01:53:37Z) - Has Your Pretrained Model Improved? A Multi-head Posterior Based
Approach [25.927323251675386]
We leverage the meta-features associated with each entity as a source of worldly knowledge and employ entity representations from the models.
We propose using the consistency between these representations and the meta-features as a metric for evaluating pre-trained models.
Our method's effectiveness is demonstrated across various domains, including models with relational datasets, large language models and image models.
arXiv Detail & Related papers (2024-01-02T17:08:26Z) - A PAC-Bayesian Perspective on the Interpolating Information Criterion [54.548058449535155]
We show how a PAC-Bayes bound is obtained for a general class of models, characterizing factors which influence performance in the interpolating regime.
We quantify how the test error for overparameterized models achieving effectively zero training error depends on the quality of the implicit regularization imposed by e.g. the combination of model, parameter-initialization scheme.
arXiv Detail & Related papers (2023-11-13T01:48:08Z) - Improving Zero-Shot Detection of Low Prevalence Chest Pathologies using
Domain Pre-trained Language Models [0.9049664874474734]
We evaluate the performance of zero-shot classification models with domain-specific pre-training for detecting low-prevalence pathologies.
Even though replacing the weights of the original CLIP-BERT degrades model performance on commonly found pathologies, we show that pre-trained text towers perform exceptionally better on low-prevalence diseases.
arXiv Detail & Related papers (2023-06-13T06:26:54Z) - Retrieval-Enhanced Contrastive Vision-Text Models [61.783728119255365]
We propose to equip vision-text models with the ability to refine their embedding with cross-modal retrieved information from a memory at inference time.
Remarkably, we show that this can be done with a light-weight, single-layer, fusion transformer on top of a frozen CLIP.
Our experiments validate that our retrieval-enhanced contrastive (RECO) training improves CLIP performance substantially on several challenging fine-grained tasks.
arXiv Detail & Related papers (2023-06-12T15:52:02Z) - On the Compositional Generalization Gap of In-Context Learning [73.09193595292233]
We look at the gap between the in-distribution (ID) and out-of-distribution (OOD) performance of such models in semantic parsing tasks with in-context learning.
We evaluate four model families, OPT, BLOOM, CodeGen and Codex on three semantic parsing datasets.
arXiv Detail & Related papers (2022-11-15T19:56:37Z) - On the Generalization and Adaption Performance of Causal Models [99.64022680811281]
Differentiable causal discovery has proposed to factorize the data generating process into a set of modules.
We study the generalization and adaption performance of such modular neural causal models.
Our analysis shows that the modular neural causal models outperform other models on both zero and few-shot adaptation in low data regimes.
arXiv Detail & Related papers (2022-06-09T17:12:32Z) - A Joint Learning Approach for Semi-supervised Neural Topic Modeling [25.104653662416023]
We introduce the Label-Indexed Neural Topic Model (LI-NTM), which is the first effective upstream semi-supervised neural topic model.
We find that LI-NTM outperforms existing neural topic models in document reconstruction benchmarks.
arXiv Detail & Related papers (2022-04-07T04:42:17Z) - Regularizing Generative Adversarial Networks under Limited Data [88.57330330305535]
This work proposes a regularization approach for training robust GAN models on limited data.
We show a connection between the regularized loss and an f-divergence called LeCam-divergence, which we find is more robust under limited training data.
arXiv Detail & Related papers (2021-04-07T17:59:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.