PARENTing via Model-Agnostic Reinforcement Learning to Correct
Pathological Behaviors in Data-to-Text Generation
- URL: http://arxiv.org/abs/2010.10866v2
- Date: Thu, 22 Oct 2020 13:00:20 GMT
- Authors: Clément Rebuffel, Laure Soulier, Geoffrey Scoutheeten, Patrick Gallinari
- Abstract summary: We show that a model-agnostic framework relying on the recently introduced PARENT metric is efficient at reducing both hallucinations and omissions.
Evaluations on the widely used WikiBIO and WebNLG benchmarks demonstrate the effectiveness of this framework compared to state-of-the-art models.
- Score: 11.687228500584082
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In language generation models conditioned on structured data, classical
maximum-likelihood training almost always leads models to pick up on dataset
divergences (i.e., hallucinations or omissions) and to reproduce them
erroneously in their own generations at inference time. In this work, we build
on top of previous Reinforcement Learning based approaches and show that a
model-agnostic framework relying on the recently introduced PARENT metric is
efficient at reducing both hallucinations and omissions. Evaluations on the
widely used WikiBIO and WebNLG benchmarks demonstrate the effectiveness of this
framework compared to state-of-the-art models.
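The abstract describes using the PARENT metric as a reward signal in a Reinforcement Learning setup. As a rough illustration only, the sketch below implements a simplified, word-level stand-in for PARENT (the real metric uses smoothed n-gram precision/recall with a table-entailment model); the function name `parent_reward` and the toy data are illustrative assumptions, not taken from the paper.

```python
# A hedged, toy stand-in for the PARENT metric used as a sentence-level reward.
# In a REINFORCE-style setup, this scalar would scale the gradient of the
# sampled sentence's log-probability.

def parent_reward(candidate: str, reference: str, table_values: list) -> float:
    """Word-level PARENT-like F-score: a candidate word counts as correct if it
    appears in the reference OR in the table; recall is taken against the union
    of reference and table words."""
    cand = candidate.split()
    ref = set(reference.split())
    tab = set(table_values)
    if not cand:
        return 0.0
    precision = sum(1 for w in cand if w in ref or w in tab) / len(cand)
    support = ref | tab
    recall = sum(1 for w in support if w in cand) / len(support)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Example: "paris" is absent from the reference but entailed by the table,
# so it is not penalized as a hallucination.
score = parent_reward("john smith born in paris",
                      "john smith was born in london",
                      ["john", "smith", "paris"])
```

Unlike a reference-only metric such as BLEU, a table-aware reward of this shape can credit table-supported words the reference omits, which is the property the paper exploits to penalize hallucinations without penalizing faithful detail.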
Related papers
- Benchmarking Transcriptomics Foundation Models for Perturbation Analysis : one PCA still rules them all [1.507700065820919]
Recent advancements in transcriptomics sequencing provide new opportunities to uncover valuable insights.
No benchmark has yet been established to robustly evaluate the effectiveness of these emerging models for perturbation analysis.
This article presents a novel biologically motivated evaluation framework and a hierarchy of perturbation analysis tasks.
arXiv Detail & Related papers (2024-10-17T18:27:51Z) - PerturBench: Benchmarking Machine Learning Models for Cellular Perturbation Analysis [14.526536510805755]
We present a comprehensive framework for predicting the effects of perturbations in single cells, designed to standardize benchmarking in this rapidly evolving field.
Our framework, PerturBench, includes a user-friendly platform, diverse datasets, metrics for fair model comparison, and detailed performance analysis.
arXiv Detail & Related papers (2024-08-20T07:40:20Z) - Beyond Under-Alignment: Atomic Preference Enhanced Factuality Tuning for Large Language Models [19.015202590038996]
We evaluate the factuality of different models tuned by various preference learning algorithms.
We propose APEFT (Atomic Preference Enhanced Factuality Tuning) to enhance models' awareness of factuality.
arXiv Detail & Related papers (2024-06-18T09:07:30Z) - Collaborative decoding of critical tokens for boosting factuality of
large language models [57.504894664689]
Finetuned and aligned models show improved abilities of instruction following and safe generation.
The common practice of using sampling during generation also increases chances of hallucination.
We introduce a collaborative decoding framework to harness the high factuality within pretrained models through the concept of critical tokens.
arXiv Detail & Related papers (2024-02-28T01:53:37Z) - A PAC-Bayesian Perspective on the Interpolating Information Criterion [54.548058449535155]
We show how a PAC-Bayes bound is obtained for a general class of models, characterizing factors which influence performance in the interpolating regime.
We quantify how the test error for overparameterized models achieving effectively zero training error depends on the quality of the implicit regularization imposed by, e.g., the combination of model and parameter-initialization scheme.
arXiv Detail & Related papers (2023-11-13T01:48:08Z) - Retrieval-Enhanced Contrastive Vision-Text Models [61.783728119255365]
We propose to equip vision-text models with the ability to refine their embedding with cross-modal retrieved information from a memory at inference time.
Remarkably, we show that this can be done with a light-weight, single-layer, fusion transformer on top of a frozen CLIP.
Our experiments validate that our retrieval-enhanced contrastive (RECO) training improves CLIP performance substantially on several challenging fine-grained tasks.
arXiv Detail & Related papers (2023-06-12T15:52:02Z) - On the Compositional Generalization Gap of In-Context Learning [73.09193595292233]
We look at the gap between the in-distribution (ID) and out-of-distribution (OOD) performance of such models in semantic parsing tasks with in-context learning.
We evaluate four model families, OPT, BLOOM, CodeGen and Codex on three semantic parsing datasets.
arXiv Detail & Related papers (2022-11-15T19:56:37Z) - On the Generalization and Adaption Performance of Causal Models [99.64022680811281]
Differentiable causal discovery proposes to factorize the data-generating process into a set of modules.
We study the generalization and adaption performance of such modular neural causal models.
Our analysis shows that the modular neural causal models outperform other models on both zero and few-shot adaptation in low data regimes.
arXiv Detail & Related papers (2022-06-09T17:12:32Z) - A Joint Learning Approach for Semi-supervised Neural Topic Modeling [25.104653662416023]
We introduce the Label-Indexed Neural Topic Model (LI-NTM), which is the first effective upstream semi-supervised neural topic model.
We find that LI-NTM outperforms existing neural topic models in document reconstruction benchmarks.
arXiv Detail & Related papers (2022-04-07T04:42:17Z) - Regularizing Generative Adversarial Networks under Limited Data [88.57330330305535]
This work proposes a regularization approach for training robust GAN models on limited data.
We show a connection between the regularized loss and an f-divergence called LeCam-divergence, which we find is more robust under limited training data.
arXiv Detail & Related papers (2021-04-07T17:59:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.