A Distributional Approach to Controlled Text Generation
- URL: http://arxiv.org/abs/2012.11635v1
- Date: Mon, 21 Dec 2020 19:02:41 GMT
- Title: A Distributional Approach to Controlled Text Generation
- Authors: Muhammad Khalifa, Hady Elsahar, Marc Dymetman
- Abstract summary: We propose a Distributional Approach to address Controlled Text Generation from pre-trained Language Models (LMs).
This view makes it possible to define, in a single formal framework, "pointwise" and "distributional" constraints over the target LM.
We then perform experiments over distributional constraints, a unique feature of our approach, demonstrating its potential as a remedy to the problem of Bias in Language Models.
- Score: 3.279201607581627
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a Distributional Approach to address Controlled Text Generation
from pre-trained Language Models (LMs). This view makes it possible to define, in a
single formal framework, "pointwise" and "distributional" constraints over the
target LM -- to our knowledge, this is the first approach with such generality
-- while minimizing KL divergence with the initial LM distribution. The optimal
target distribution is then uniquely determined as an explicit EBM
(Energy-Based Model) representation. From that optimal representation we then
train the target controlled autoregressive LM through an adaptive
distributional variant of Policy Gradient. We conduct a first set of
experiments over pointwise constraints showing the advantages of our approach
over a set of baselines, in terms of obtaining a controlled LM balancing
constraint satisfaction with divergence from the initial LM (GPT-2). We then
perform experiments over distributional constraints, a unique feature of our
approach, demonstrating its potential as a remedy to the problem of Bias in
Language Models. Through an ablation study we show the effectiveness of our
adaptive technique for obtaining faster convergence.
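The core idea in the abstract, choosing the distribution closest in KL divergence to the initial LM among those that satisfy a distributional constraint, can be illustrated on a toy example. The sketch below is not the authors' implementation: it assumes a four-element "vocabulary of sequences" with made-up probabilities `a`, a hypothetical binary feature `phi`, and a target feature frequency of 0.5. It relies on the standard result that the KL-minimizing solution under a moment constraint is an exponential tilt of the base distribution, p(x) ∝ a(x) exp(λ φ(x)), and finds λ by bisection. The policy-gradient training step mentioned in the abstract is not covered.

```python
import math

def tilt(a, phi, lam):
    """Return the normalized distribution p(x) proportional to a(x) * exp(lam * phi(x))."""
    weights = [ax * math.exp(lam * fx) for ax, fx in zip(a, phi)]
    z = sum(weights)  # partition function of the toy energy-based model
    return [w / z for w in weights]

def moment(p, phi):
    """Expected value of the feature phi under distribution p."""
    return sum(px * fx for px, fx in zip(p, phi))

def solve_lambda(a, phi, target, lo=-50.0, hi=50.0, iters=200):
    """Bisection on lam; E_p[phi] is non-decreasing in lam for an exponential tilt."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if moment(tilt(a, phi, mid), phi) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def kl(p, q):
    """KL(p || q) for discrete distributions, skipping zero-probability terms of p."""
    return sum(px * math.log(px / qx) for px, qx in zip(p, q) if px > 0)

# Hypothetical base "LM" over four sequences, and a binary feature
# that is present only in the third sequence (base frequency 0.2).
a = [0.4, 0.3, 0.2, 0.1]
phi = [0, 0, 1, 0]
target = 0.5  # desired feature frequency under the controlled distribution

lam = solve_lambda(a, phi, target)
p = tilt(a, phi, lam)

# Another distribution that also meets the constraint, for comparison.
q = [0.5, 0.0, 0.5, 0.0]

print("lambda:", lam)
print("controlled distribution:", p)
print("KL(p || a):", kl(p, a), " KL(q || a):", kl(q, a))
```

In this toy setting the tilted distribution meets the target frequency while staying closer to `a` than the alternative `q`, which is the trade-off between constraint satisfaction and divergence that the abstract describes.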
Related papers
- Attribute Controlled Fine-tuning for Large Language Models: A Case Study on Detoxification [76.14641982122696]
We propose a constraint learning schema for fine-tuning Large Language Models (LLMs) with attribute control.
We show that our approach leads to an LLM that produces fewer inappropriate responses while achieving competitive performance on benchmarks and a toxicity detection task.
arXiv Detail & Related papers (2024-10-07T23:38:58Z) - Distributional Preference Alignment of LLMs via Optimal Transport [36.95053112313244]
We propose a novel method for distributional preference alignment of LLMs called Alignment via Optimal Transport (AOT).
AOT aligns LLMs on unpaired preference data by making the reward distribution of the positive samples stochastically dominant in the first order on the distribution of negative samples.
We show that AOT leads to state-of-the-art models in the 7B family of models when evaluated with Open LLM Benchmarks and AlpacaEval.
arXiv Detail & Related papers (2024-06-09T18:41:05Z) - TransFusion: Covariate-Shift Robust Transfer Learning for High-Dimensional Regression [11.040033344386366]
We propose a two-step method with a novel fused-regularizer to improve the learning performance on a target task with limited samples.
A nonasymptotic bound is provided for the estimation error of the target model.
We extend the method to a distributed setting, allowing for a pretraining-finetuning strategy.
arXiv Detail & Related papers (2024-04-01T14:58:16Z) - Amortizing intractable inference in large language models [56.92471123778389]
We use amortized Bayesian inference to sample from intractable posterior distributions.
We empirically demonstrate that this distribution-matching paradigm of LLM fine-tuning can serve as an effective alternative to maximum-likelihood training.
As an important application, we interpret chain-of-thought reasoning as a latent variable modeling problem.
arXiv Detail & Related papers (2023-10-06T16:36:08Z) - STEEL: Singularity-aware Reinforcement Learning [14.424199399139804]
Batch reinforcement learning (RL) aims at leveraging pre-collected data to find an optimal policy.
We propose a new batch RL algorithm that allows for singularity for both state and action spaces.
By leveraging the idea of pessimism and under some technical conditions, we derive a first finite-sample regret guarantee for our proposed algorithm.
arXiv Detail & Related papers (2023-01-30T18:29:35Z) - Learning Sampling Distributions for Model Predictive Control [36.82905770866734]
Sampling-based approaches to Model Predictive Control (MPC) have become a cornerstone of contemporary approaches to MPC.
We propose to carry out all operations in the latent space, allowing us to take full advantage of the learned distribution.
Specifically, we frame the learning problem as bi-level optimization and show how to train the controller with backpropagation-through-time.
arXiv Detail & Related papers (2022-12-05T20:35:36Z) - Domain-Specific Risk Minimization for Out-of-Distribution Generalization [104.17683265084757]
We first establish a generalization bound that explicitly considers the adaptivity gap.
We propose effective gap estimation methods for guiding the selection of a better hypothesis for the target.
The other method is minimizing the gap directly by adapting model parameters using online target samples.
arXiv Detail & Related papers (2022-08-18T06:42:49Z) - Cooperative Distribution Alignment via JSD Upper Bound [7.071749623370137]
Unsupervised distribution alignment estimates a transformation that maps two or more source distributions to a shared aligned distribution.
This task has many applications including generative modeling, unsupervised domain adaptation, and socially aware learning.
We propose to unify and generalize previous flow-based approaches under a single non-adversarial framework.
arXiv Detail & Related papers (2022-07-05T20:09:03Z) - InteL-VAEs: Adding Inductive Biases to Variational Auto-Encoders via Intermediary Latents [60.785317191131284]
We introduce a simple and effective method for learning VAEs with controllable biases by using an intermediary set of latent variables.
In particular, it allows us to impose desired properties like sparsity or clustering on learned representations.
We show that this, in turn, allows InteL-VAEs to learn both better generative models and representations.
arXiv Detail & Related papers (2021-06-25T16:34:05Z) - Learning Invariant Representations and Risks for Semi-supervised Domain Adaptation [109.73983088432364]
We propose the first method that aims to simultaneously learn invariant representations and risks under the setting of semi-supervised domain adaptation (Semi-DA).
We introduce the LIRR algorithm for jointly Learning Invariant Representations and Risks.
arXiv Detail & Related papers (2020-10-09T15:42:35Z) - An Information Bottleneck Approach for Controlling Conciseness in Rationale Extraction [84.49035467829819]
We show that it is possible to better manage this trade-off by optimizing a bound on the Information Bottleneck (IB) objective.
Our fully unsupervised approach jointly learns an explainer that predicts sparse binary masks over sentences, and an end-task predictor that considers only the extracted rationale.
arXiv Detail & Related papers (2020-05-01T23:26:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.