HydraSum -- Disentangling Stylistic Features in Text Summarization using
Multi-Decoder Models
- URL: http://arxiv.org/abs/2110.04400v1
- Date: Fri, 8 Oct 2021 22:49:49 GMT
- Title: HydraSum -- Disentangling Stylistic Features in Text Summarization using
Multi-Decoder Models
- Authors: Tanya Goyal, Nazneen Fatema Rajani, Wenhao Liu, Wojciech
Kryściński
- Abstract summary: We introduce HydraSum, a new summarization architecture that extends the single decoder framework of current models.
Our proposed model encourages each expert, i.e. decoder, to learn and generate stylistically-distinct summaries.
A guided version of the training process can explicitly govern which summary style is partitioned between decoders.
- Score: 12.070474521259776
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing abstractive summarization models lack explicit control mechanisms
that would allow users to influence the stylistic features of the model
outputs. This results in generating generic summaries that do not cater to the
users' needs or preferences. To address this issue, we introduce HydraSum, a new
summarization architecture that extends the single decoder framework of current
models, e.g. BART, to a mixture-of-experts version consisting of multiple
decoders. Our proposed model encourages each expert, i.e. decoder, to learn and
generate stylistically-distinct summaries along dimensions such as
abstractiveness, length, specificity, and others. At each time step, HydraSum
employs a gating mechanism that decides the contribution of each individual
decoder to the next token's output probability distribution. Through
experiments on three summarization datasets (CNN, Newsroom, XSum), we
demonstrate that this gating mechanism automatically learns to assign
contrasting summary styles to different HydraSum decoders under the standard
training objective without the need for additional supervision. We further show
that a guided version of the training process can explicitly govern which
summary style is partitioned between decoders, e.g. high abstractiveness vs.
low abstractiveness or high specificity vs. low specificity, and also increase
the stylistic difference between individual decoders. Finally, our experiments
demonstrate that our decoder framework is highly flexible: during inference, we
can sample from individual decoders or mixtures of different subsets of the
decoders to yield a diverse set of summaries and enforce single- and
multi-style control over summary generation.
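The gating step described in the abstract can be illustrated with a minimal sketch. It assumes the next-token distribution is a convex combination of the per-decoder distributions, weighted by a softmax over gate scores; the abstract does not give the exact formula, so this is an assumption. The names `softmax` and `mix_decoders` and the toy logits are hypothetical and are not taken from the paper's code.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix_decoders(decoder_logits, gate_logits=None, gate_weights=None):
    """Combine per-decoder next-token distributions for one time step.

    decoder_logits: one list of vocabulary logits per decoder.
    gate_logits: unnormalized gate scores (stand-in for a learned gate).
    gate_weights: optional fixed weights that override the gate, e.g. to
        sample from a single decoder or a chosen mixture at inference time.
    """
    if gate_weights is None:
        if gate_logits is None:
            raise ValueError("provide gate_logits or gate_weights")
        gate_weights = softmax(gate_logits)
    per_decoder = [softmax(logits) for logits in decoder_logits]
    vocab_size = len(per_decoder[0])
    # Assumed mixing rule: weighted sum of the decoders' probabilities.
    return [
        sum(g * p[v] for g, p in zip(gate_weights, per_decoder))
        for v in range(vocab_size)
    ]


# Toy example: two decoders over a three-token vocabulary.
logits = [[2.0, 0.0, 0.0], [0.0, 0.0, 2.0]]
mixed = mix_decoders(logits, gate_logits=[0.0, 0.0])        # equal gate scores
only_first = mix_decoders(logits, gate_weights=[1.0, 0.0])  # force decoder 0
print(mixed)
print(only_first)
```

Fixing `gate_weights` by hand mirrors the inference-time control mentioned above: a one-hot vector selects a single decoder, while any other non-negative weights that sum to one give a mixture over a subset of decoders.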
Related papers
- LCM-Lookahead for Encoder-based Text-to-Image Personalization [82.56471486184252]
We explore the potential of using shortcut-mechanisms to guide the personalization of text-to-image models.
We focus on encoder-based personalization approaches, and demonstrate that by tuning them with a lookahead identity loss, we can achieve higher identity fidelity.
arXiv Detail & Related papers (2024-04-04T17:43:06Z)
- Unified Generation, Reconstruction, and Representation: Generalized Diffusion with Adaptive Latent Encoding-Decoding [90.77521413857448]
Deep generative models are anchored in three core capabilities -- generating new instances, reconstructing inputs, and learning compact representations.
We introduce Generalized Encoding-Decoding Diffusion Probabilistic Models (EDDPMs).
EDDPMs generalize the Gaussian noising-denoising in standard diffusion by introducing parameterized encoding-decoding.
Experiments on text, proteins, and images demonstrate the flexibility to handle diverse data and tasks.
arXiv Detail & Related papers (2024-02-29T10:08:57Z)
- Triple-View Knowledge Distillation for Semi-Supervised Semantic Segmentation [54.23510028456082]
We propose a Triple-view Knowledge Distillation framework, termed TriKD, for semi-supervised semantic segmentation.
The framework includes the triple-view encoder and the dual-frequency decoder.
arXiv Detail & Related papers (2023-09-22T01:02:21Z)
- Sequence-to-Sequence Pre-training with Unified Modality Masking for Visual Document Understanding [3.185382039518151]
GenDoc is a sequence-to-sequence document understanding model pre-trained with unified masking across three modalities.
The proposed model utilizes an encoder-decoder architecture, which allows for increased adaptability to a wide range of downstream tasks.
arXiv Detail & Related papers (2023-05-16T15:25:19Z)
- String-based Molecule Generation via Multi-decoder VAE [56.465033997245776]
We investigate the problem of string-based molecular generation via variational autoencoders (VAEs).
We propose a simple, yet effective idea to improve the performance of VAE for the task.
In our experiments, the proposed VAE model particularly performs well for generating a sample from out-of-domain distribution.
arXiv Detail & Related papers (2022-08-23T03:56:30Z)
- Multi-scale and Cross-scale Contrastive Learning for Semantic Segmentation [5.281694565226513]
We apply contrastive learning to enhance the discriminative power of the multi-scale features extracted by semantic segmentation networks.
By first mapping the encoder's multi-scale representations to a common feature space, we instantiate a novel form of supervised local-global constraint.
arXiv Detail & Related papers (2022-03-25T01:24:24Z)
- CCVS: Context-aware Controllable Video Synthesis [95.22008742695772]
This presentation introduces a self-supervised learning approach to the synthesis of new video clips from old ones.
It conditions the synthesis process on contextual information for temporal continuity and ancillary information for fine control.
arXiv Detail & Related papers (2021-07-16T17:57:44Z)
- Rethinking and Improving Natural Language Generation with Layer-Wise Multi-View Decoding [59.48857453699463]
In sequence-to-sequence learning, the decoder relies on the attention mechanism to efficiently extract information from the encoder.
Recent work has proposed to use representations from different encoder layers for diversified levels of information.
We propose layer-wise multi-view decoding, where for each decoder layer, together with the representations from the last encoder layer, which serve as a global view, those from other encoder layers are supplemented for a stereoscopic view of the source sequences.
arXiv Detail & Related papers (2020-05-16T20:00:39Z)
- Knowledge Graph-Augmented Abstractive Summarization with Semantic-Driven Cloze Reward [42.925345819778656]
We present ASGARD, a novel framework for Abstractive Summarization with Graph-Augmentation and semantic-driven RewarD.
We propose the use of dual encoders (a sequential document encoder and a graph-structured encoder) to maintain the global context and local characteristics of entities.
Results show that our models produce significantly higher ROUGE scores than a variant without knowledge graph as input on both New York Times and CNN/Daily Mail datasets.
arXiv Detail & Related papers (2020-05-03T18:23:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.