Back to the Future: Unsupervised Backprop-based Decoding for
Counterfactual and Abductive Commonsense Reasoning
- URL: http://arxiv.org/abs/2010.05906v4
- Date: Mon, 2 Aug 2021 19:03:40 GMT
- Title: Back to the Future: Unsupervised Backprop-based Decoding for
Counterfactual and Abductive Commonsense Reasoning
- Authors: Lianhui Qin, Vered Shwartz, Peter West, Chandra Bhagavatula, Jena
Hwang, Ronan Le Bras, Antoine Bosselut, Yejin Choi
- Abstract summary: Generative language models (LMs) are trained either to condition only on the past context or to perform narrowly scoped text-infilling, making it hard to incorporate both past and future contexts.
We propose DeLorean, a new unsupervised decoding algorithm that can flexibly incorporate both the past and future contexts.
We demonstrate that our approach is general and applicable to two nonmonotonic reasoning tasks: abductive text generation and counterfactual story revision.
- Score: 79.48769764508006
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Abductive and counterfactual reasoning, core abilities of everyday human
cognition, require reasoning about what might have happened at time t, while
conditioning on multiple contexts from the relative past and future. However,
simultaneous incorporation of past and future contexts using generative
language models (LMs) can be challenging, as they are trained either to
condition only on the past context or to perform narrowly scoped
text-infilling. In this paper, we propose DeLorean, a new unsupervised decoding
algorithm that can flexibly incorporate both the past and future contexts using
only off-the-shelf, left-to-right language models and no supervision. The key
intuition of our algorithm is to incorporate the future through
back-propagation, during which we update only the internal representation of
the output while keeping the model parameters fixed. By alternating between forward
and backward propagation, DeLorean can decode the output representation that
reflects both the left and right contexts. We demonstrate that our approach is
general and applicable to two nonmonotonic reasoning tasks: abductive text
generation and counterfactual story revision, where DeLorean outperforms a
range of unsupervised and some supervised methods, based on automatic and human
evaluation.
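To make the mechanism concrete, below is a minimal sketch of backprop-based decoding in this spirit with an off-the-shelf GPT-2 via HuggingFace transformers: the LM stays frozen and only a soft representation of the output tokens (logits over the vocabulary) is optimized, so the hypothesis stays plausible as a continuation of the past context while also making the future context likely. The example sentences, hyperparameters, and the single joint loss (in place of the paper's explicit alternation of forward and backward passes) are illustrative assumptions, not the authors' exact procedure.
```python
import torch
import torch.nn.functional as F
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()
for p in lm.parameters():
    p.requires_grad_(False)  # the LM stays fixed; only the output logits are optimized

# abductive setup (made-up sentences): find a hypothesis between a past and a future observation
past = tok("Ray drove his car to the beach.", return_tensors="pt").input_ids
future = tok("He had to call a tow truck to get home.", return_tensors="pt").input_ids
n_fill, vocab = 12, lm.config.vocab_size

logits = torch.zeros(1, n_fill, vocab, requires_grad=True)  # soft hypothesis tokens
opt = torch.optim.Adam([logits], lr=0.05)
emb = lm.get_input_embeddings().weight                      # (vocab, dim)

for step in range(30):
    soft = F.softmax(logits, dim=-1) @ emb                  # soft embeddings of the hypothesis
    inputs = torch.cat([emb[past], soft, emb[future]], dim=1)
    out = lm(inputs_embeds=inputs).logits
    p_len = past.size(1)
    # "backward" signal: make the future observation likely given past + hypothesis
    fut_pred = out[:, p_len + n_fill - 1 : -1, :]
    future_nll = F.cross_entropy(fut_pred.reshape(-1, vocab), future.reshape(-1))
    # "forward" signal: keep the hypothesis plausible as a continuation of the past
    hyp_pred = F.log_softmax(out[:, p_len - 1 : p_len + n_fill - 1, :], dim=-1)
    fluency = -(F.softmax(logits, dim=-1) * hyp_pred).sum(-1).mean()
    loss = future_nll + fluency
    opt.zero_grad(); loss.backward(); opt.step()

print(tok.decode(logits.argmax(-1)[0]))  # harden the soft hypothesis into tokens
```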
Related papers
- Promises and Pitfalls of Generative Masked Language Modeling: Theoretical Framework and Practical Guidelines [74.42485647685272]
We focus on Generative Masked Language Models (GMLMs).
We train a model to fit conditional probabilities of the data distribution via masking, which are subsequently used as inputs to a Markov Chain to draw samples from the model.
We adapt the T5 model for iteratively-refined parallel decoding, achieving a 2-3x speedup in machine translation with minimal sacrifice in quality (a rough sketch of this style of decoding is given below).
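For a concrete picture of iteratively-refined parallel decoding, here is a mask-predict-like sketch with an off-the-shelf RoBERTa masked LM: all blank positions are predicted in parallel, and the least confident predictions are re-masked and refined over a few rounds. The model choice, iteration schedule, and confidence-based re-masking rule are assumptions for illustration, not the paper's exact setup (which adapts T5).
```python
import torch
from transformers import RobertaForMaskedLM, RobertaTokenizer

tok = RobertaTokenizer.from_pretrained("roberta-base")
mlm = RobertaForMaskedLM.from_pretrained("roberta-base").eval()

prompt_ids = tok("The weather today is", return_tensors="pt").input_ids   # <s> ... </s>
n_new, n_iters = 6, 4
masks = torch.full((1, n_new), tok.mask_token_id)
ids = torch.cat([prompt_ids[:, :-1], masks, prompt_ids[:, -1:]], dim=1)   # insert blanks before </s>
slots = torch.arange(prompt_ids.size(1) - 1, prompt_ids.size(1) - 1 + n_new)

for t in range(n_iters):
    with torch.no_grad():
        logits = mlm(ids).logits[0, slots]              # predictions for every blank, in parallel
    conf, preds = logits.softmax(-1).max(-1)
    ids[0, slots] = preds                               # fill all blanks at once
    if t < n_iters - 1:
        k = n_new * (n_iters - 1 - t) // n_iters        # re-mask fewer positions each round
        ids[0, slots[conf.argsort()[:k]]] = tok.mask_token_id

print(tok.decode(ids[0], skip_special_tokens=True))
```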
arXiv Detail & Related papers (2024-07-22T18:00:00Z) - When Only Time Will Tell: Interpreting How Transformers Process Local Ambiguities Through the Lens of Restart-Incrementality [19.103130032967663]
Causal models are forced to output one interpretation and continue, whereas models that can revise may edit their previous output as the ambiguity is resolved.
In this work, we look at how restart-incremental Transformers build and update internal states, in an effort to shed light on what processes cause revisions not viable in autoregressive models.
arXiv Detail & Related papers (2024-02-20T16:09:49Z) - Are We Falling in a Middle-Intelligence Trap? An Analysis and Mitigation
of the Reversal Curse [73.65112477688353]
Recent studies have highlighted a phenomenon in large language models known as "the reversal curse".
We contend that the reversal curse is partially a result of specific model training objectives.
We propose a novel training method, Bidirectional Causal language modeling Optimization (BICO), designed to mitigate the reversal curse.
arXiv Detail & Related papers (2023-11-13T17:01:12Z) - RegaVAE: A Retrieval-Augmented Gaussian Mixture Variational Auto-Encoder
for Language Modeling [79.56442336234221]
We introduce RegaVAE, a retrieval-augmented language model built upon the variational auto-encoder (VAE).
It encodes the text corpus into a latent space, capturing current and future information from both source and target text.
Experimental results on various datasets demonstrate significant improvements in text generation quality and hallucination removal.
arXiv Detail & Related papers (2023-10-16T16:42:01Z) - Look-back Decoding for Open-Ended Text Generation [62.53302138266465]
We propose Look-back, an improved decoding algorithm that tracks the distribution distance between the current and historical decoding steps.
Look-back can automatically predict potential repetitive phrases and topic drift, and remove tokens that may cause these failure modes.
We perform decoding experiments on document continuation and story generation, and demonstrate that Look-back generates more fluent and coherent text (a minimal sketch of the distance-tracking idea is given below).
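As a rough illustration of the distance-tracking idea, the sketch below compares the current next-token distribution against the distributions from earlier steps during greedy decoding with GPT-2, and falls back to sampling whenever the minimum KL divergence drops below a threshold, a crude proxy for the repetition signal. The threshold, the KL direction, and the sampling fallback are assumptions, simplified from the paper's actual mitigation strategy.
```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()

ids = tok("The old lighthouse keeper", return_tensors="pt").input_ids
history = []                 # next-token distributions from earlier decoding steps
threshold, eps = 0.5, 1e-9

for _ in range(40):
    with torch.no_grad():
        probs = lm(ids).logits[0, -1].softmax(-1)               # current next-token distribution
    if history:
        past = torch.stack(history)                             # (steps, vocab)
        min_kl = (probs * ((probs + eps).log() - (past + eps).log())).sum(-1).min()
    else:
        min_kl = torch.tensor(float("inf"))
    if min_kl < threshold:
        next_id = torch.multinomial(probs, 1)                   # too close to a past step: sample to escape repetition
    else:
        next_id = probs.argmax().unsqueeze(0)                   # otherwise stay greedy
    history.append(probs)
    ids = torch.cat([ids, next_id.view(1, 1)], dim=1)

print(tok.decode(ids[0]))
```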
arXiv Detail & Related papers (2023-05-22T20:42:37Z) - Future Sight: Dynamic Story Generation with Large Pretrained Language
Models [11.23192733149335]
Transformer decoders can only generate new text with respect to previously generated text.
Future Sight enables a decoder to attend to an encoded future plot event.
During inference, the future plot event can be written by a human author to steer the narrative being generated in a certain direction.
arXiv Detail & Related papers (2022-12-20T01:53:26Z) - Reflective Decoding: Beyond Unidirectional Generation with Off-the-Shelf
Language Models [63.808843089941405]
Large pretrained Language Models (LMs) generate text with remarkable quality, but only sequentially from left to right.
We present Reflective Decoding, a novel unsupervised algorithm that allows for direct application of unidirectional LMs to non-sequential tasks.
Our 2-step approach requires no supervision or even parallel corpora, only two off-the-shelf pretrained LMs in opposite directions.
arXiv Detail & Related papers (2020-10-16T18:02:07Z)