Dissecting Generation Modes for Abstractive Summarization Models via
Ablation and Attribution
- URL: http://arxiv.org/abs/2106.01518v1
- Date: Thu, 3 Jun 2021 00:54:16 GMT
- Title: Dissecting Generation Modes for Abstractive Summarization Models via
Ablation and Attribution
- Authors: Jiacheng Xu and Greg Durrett
- Abstract summary: We propose a two-step method to interpret summarization model decisions.
We first analyze the model's behavior by ablating the full model to categorize each decoder decision into one of several generation modes.
After isolating decisions that do depend on the input, we explore interpreting these decisions using several different attribution methods.
- Score: 34.2658286826597
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite the prominence of neural abstractive summarization models, we know
little about how they actually form summaries and how to understand where their
decisions come from. We propose a two-step method to interpret summarization
model decisions. We first analyze the model's behavior by ablating the full
model to categorize each decoder decision into one of several generation modes:
roughly, is the model behaving like a language model, is it relying heavily on
the input, or is it somewhere in between? After isolating decisions that do
depend on the input, we explore interpreting these decisions using several
different attribution methods. We compare these techniques based on their
ability to select content and reconstruct the model's predicted token from
perturbations of the input, thus revealing whether highlighted attributions are
truly important for the generation of the next token. While this machinery can
be broadly useful even beyond summarization, we specifically demonstrate its
capability to identify phrases the summarization model has memorized and
determine where in the training pipeline this memorization happened, as well as
study complex generation phenomena like sentence fusion on a per-instance
basis.
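As a rough illustration of the first step (the ablation-based categorization of decoder decisions), the sketch below contrasts a summarization model's next-token distribution with and without access to the source document; a large divergence suggests an input-dependent decision, while a small one suggests language-model-like behavior. This is not the authors' released code: the BART checkpoint, the near-empty source ablation, and the use of KL divergence here are illustrative assumptions.

```python
# Rough sketch (illustrative, not the paper's code): contrast the full model's
# next-token distribution with an input-ablated run. If the two distributions
# are close, the step behaves like a language model; if they diverge, the step
# depends on the source document.
import torch
from transformers import BartForConditionalGeneration, BartTokenizer

MODEL_NAME = "facebook/bart-large-cnn"  # assumed checkpoint, for illustration only
tok = BartTokenizer.from_pretrained(MODEL_NAME)
model = BartForConditionalGeneration.from_pretrained(MODEL_NAME).eval()

def next_token_dist(source_text: str, summary_prefix: str) -> torch.Tensor:
    """Next-token distribution given a source document and a partial summary.

    Simplification: the decoder prefix is tokenized without special tokens.
    """
    enc = tok(source_text, return_tensors="pt", truncation=True)
    dec = tok(summary_prefix, return_tensors="pt", add_special_tokens=False)
    with torch.no_grad():
        logits = model(input_ids=enc.input_ids,
                       decoder_input_ids=dec.input_ids).logits
    return torch.softmax(logits[0, -1], dim=-1)

source = "The city council approved the new transit budget on Tuesday."
prefix = "The council approved"

p_full = next_token_dist(source, prefix)   # full model with the real source
p_ablate = next_token_dist(".", prefix)    # source ablated: LM-like behaviour

# KL(full || ablated): small => LM-like step, large => input-dependent step.
kl = torch.sum(p_full * (torch.log(p_full + 1e-12) - torch.log(p_ablate + 1e-12)))
print(f"KL divergence at this decoding step: {kl.item():.3f}")
```

The second step described in the abstract would then apply attribution methods only to the steps flagged as input-dependent.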
Related papers
- DISCO: DISCovering Overfittings as Causal Rules for Text Classification Models [6.369258625916601]
Post-hoc interpretability methods fail to capture the models' decision-making process fully.
Our paper introduces DISCO, a novel method for discovering global, rule-based explanations.
DISCO supports interactive explanations, enabling human inspectors to distinguish spurious causes in the rule-based output.
arXiv Detail & Related papers (2024-11-07T12:12:44Z)
- Heat Death of Generative Models in Closed-Loop Learning [63.83608300361159]
We study the learning dynamics of generative models that are fed back their own produced content in addition to their original training dataset.
We show that, unless a sufficient amount of external data is introduced at each iteration, any non-trivial temperature leads the model to degenerate.
arXiv Detail & Related papers (2024-04-02T21:51:39Z)
- Towards Interpretable Deep Reinforcement Learning Models via Inverse Reinforcement Learning [27.841725567976315]
We propose a novel framework utilizing Adversarial Inverse Reinforcement Learning.
This framework provides global explanations for decisions made by a Reinforcement Learning model.
We capture intuitive tendencies that the model follows by summarizing the model's decision-making process.
arXiv Detail & Related papers (2022-03-30T17:01:59Z)
- Speech Summarization using Restricted Self-Attention [79.89680891246827]
We introduce a single model optimized end-to-end for speech summarization.
We demonstrate that the proposed model learns to directly summarize speech for the How-2 corpus of instructional videos.
arXiv Detail & Related papers (2021-10-12T18:21:23Z)
- Improving Faithfulness in Abstractive Summarization with Contrast Candidate Generation and Selection [54.38512834521367]
We study contrast candidate generation and selection as a model-agnostic post-processing technique.
We learn a discriminative correction model by generating alternative candidate summaries.
This model is then used to select the best candidate as the final output summary.
arXiv Detail & Related papers (2021-04-19T05:39:24Z)
- Paired Examples as Indirect Supervision in Latent Decision Models [109.76417071249945]
We introduce a way to leverage paired examples that provide stronger cues for learning latent decisions.
We apply our method to improve compositional question answering using neural module networks on the DROP dataset.
arXiv Detail & Related papers (2021-04-05T03:58:30Z)
- Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations [64.85696493596821]
In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction.
We propose a counterfactual method that learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss.
Our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-18T12:57:34Z)
- Understanding Neural Abstractive Summarization Models via Uncertainty [54.37665950633147]
Seq2seq abstractive summarization models generate text in a free-form manner.
We study the entropy, or uncertainty, of the model's token-level predictions.
We show that uncertainty is a useful perspective for analyzing summarization and text generation models more broadly; a short sketch of this entropy computation appears after this list.
arXiv Detail & Related papers (2020-10-15T16:57:27Z)
- Learning Invariances for Interpretability using Supervised VAE [0.0]
We learn model invariances as a means of interpreting a model.
We propose a supervised form of variational auto-encoders (VAEs).
We show how, by combining our model with feature attribution methods, it is possible to reach a more fine-grained understanding of the model's decision process.
arXiv Detail & Related papers (2020-07-15T10:14:16Z)
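Following up on the "Understanding Neural Abstractive Summarization Models via Uncertainty" entry above, the short sketch below computes the token-level prediction entropy that paper studies. It is my own illustration, not the cited paper's code, and the random logits merely stand in for a real decoder's outputs.

```python
# Minimal sketch: Shannon entropy of a decoder's next-token distribution at
# each step, used as a per-step uncertainty measure. Random logits stand in
# for the outputs of a real seq2seq summarization model.
import torch

def step_entropy(logits: torch.Tensor) -> torch.Tensor:
    """Entropy (in nats) per decoding step for logits of shape (steps, vocab)."""
    log_p = torch.log_softmax(logits, dim=-1)
    return -(log_p.exp() * log_p).sum(dim=-1)

fake_logits = torch.randn(5, 50265)   # 5 decoding steps, BART-sized vocabulary
print(step_entropy(fake_logits))      # higher values = more uncertain steps
```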
This list is automatically generated from the titles and abstracts of the papers on this site.