Learning Sparse Prototypes for Text Generation
- URL: http://arxiv.org/abs/2006.16336v2
- Date: Wed, 4 Nov 2020 06:00:40 GMT
- Title: Learning Sparse Prototypes for Text Generation
- Authors: Junxian He, Taylor Berg-Kirkpatrick, Graham Neubig
- Abstract summary: Prototype-driven text generation is inefficient at test time as a result of needing to store and index the entire training corpus.
We propose a novel generative model that automatically learns a sparse prototype support set that achieves strong language modeling performance.
In experiments, our model outperforms previous prototype-driven language models while achieving up to a 1000x memory reduction.
- Score: 120.38555855991562
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Prototype-driven text generation uses non-parametric models that first choose
from a library of sentence "prototypes" and then modify the prototype to
generate the output text. While effective, these methods are inefficient at
test time as a result of needing to store and index the entire training corpus.
Further, existing methods often require heuristics to identify which prototypes
to reference at training time. In this paper, we propose a novel generative
model that automatically learns a sparse prototype support set that,
nonetheless, achieves strong language modeling performance. This is achieved by
(1) imposing a sparsity-inducing prior on the prototype selection distribution,
and (2) utilizing amortized variational inference to learn a prototype
retrieval function. In experiments, our model outperforms previous
prototype-driven language models while achieving up to a 1000x memory
reduction, as well as a 1000x speed-up at test time. More interestingly, we
show that the learned prototypes are able to capture semantics and syntax at
different granularities as we vary the sparsity of prototype selection, and that
certain sentence attributes can be controlled by specifying the prototype for
generation.
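To make mechanisms (1) and (2) concrete, here is a minimal sketch (not the authors' released code) of a categorical prototype-selection variable with a sparsity-inducing prior, trained with amortized variational inference. It assumes PyTorch; the pool size, the fixed toy prior, and the distance-based "edit model" likelihood are hypothetical stand-ins for the paper's full neural editor, and names like `retriever` and `proto_emb` are illustrative only.

```python
# Minimal sketch of sparse prototype selection with amortized VI.
# Toy stand-in, NOT the paper's model: the true edit/generation
# likelihood p(x | t) is replaced by a distance-based score.
import torch

torch.manual_seed(0)
N_PROTO, D = 1000, 64                                     # candidate prototype pool, embedding size

proto_emb = torch.nn.Parameter(torch.randn(N_PROTO, D))  # prototype "sentences" as vectors (toy)
retriever = torch.nn.Linear(D, N_PROTO)                   # amortized retrieval posterior q(t | x)

# Sparsity-inducing prior p(t): nearly all prototypes get vanishingly
# small mass, pushing the posterior toward a small support set.
prior_logits = torch.full((N_PROTO,), -8.0)
prior_logits[:32] = 0.0                                   # toy: mass concentrated on few prototypes
log_prior = prior_logits.log_softmax(-1)

def elbo(x):
    """ELBO = E_q[log p(x | t)] - KL(q(t | x) || p(t)) for embeddings x: [B, D]."""
    log_q = retriever(x).log_softmax(-1)                  # q(t | x), shape [B, N_PROTO]
    q = log_q.exp()
    # Toy "edit model" likelihood: x is likely if close to the chosen prototype.
    log_lik = -0.5 * torch.cdist(x, proto_emb).pow(2)     # [B, N_PROTO]
    recon = (q * log_lik).sum(-1)                         # E_q[log p(x | t)]
    kl = (q * (log_q - log_prior)).sum(-1)                # KL to the sparse prior
    return (recon - kl).mean()

x = torch.randn(16, D)                                    # a batch of sentence embeddings
loss = -elbo(x)
loss.backward()                                           # updates both retriever and prototypes
```

In the full model, it is this learned retrieval posterior, concentrated on a sparse support set by the prior, that lets the system discard most of the training corpus at test time.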
Related papers
- Advancing Interpretability in Text Classification through Prototype Learning [1.9526476410335776]
ProtoLens is a prototype-based model that provides fine-grained, sub-sentence level interpretability for text classification.
ProtoLens uses a Prototype-aware Span Extraction module to identify relevant text spans.
ProtoLens provides interpretable predictions while maintaining competitive accuracy.
arXiv Detail & Related papers (2024-10-23T03:53:46Z)
- Multi-Scale Grouped Prototypes for Interpretable Semantic Segmentation [7.372346036256517]
Prototypical part learning is emerging as a promising approach for making semantic segmentation interpretable.
We propose a method for interpretable semantic segmentation that leverages multi-scale image representation for prototypical part learning.
Experiments conducted on Pascal VOC, Cityscapes, and ADE20K demonstrate that the proposed method increases model sparsity, improves interpretability over existing prototype-based methods, and narrows the performance gap with the non-interpretable counterpart models.
arXiv Detail & Related papers (2024-09-14T17:52:59Z)
- Sign Language Translation with Iterative Prototype [104.76761930888604]
IP-SLT is a simple yet effective framework for sign language translation (SLT).
Our idea mimics human reading, where a sentence can be digested repeatedly until an accurate understanding is reached.
arXiv Detail & Related papers (2023-08-23T15:27:50Z)
- ProtoDiff: Learning to Learn Prototypical Networks by Task-Guided Diffusion [44.805452233966534]
Prototype-based meta-learning has emerged as a powerful technique for addressing few-shot learning challenges.
We introduce ProtoDiff, a framework that gradually generates task-specific prototypes from random noise.
We conduct thorough ablation studies to demonstrate its ability to accurately capture the underlying prototype distribution.
arXiv Detail & Related papers (2023-06-26T15:26:24Z)
- Evolving Semantic Prototype Improves Generative Zero-Shot Learning [73.07035277030573]
In zero-shot learning (ZSL), generative methods synthesize class-related sample features based on predefined semantic prototypes.
We observe that each class's predefined semantic prototype does not accurately match its real semantic prototype.
We propose a dynamic semantic prototype evolving (DSP) method to align the empirically predefined semantic prototypes and the real prototypes for class-related feature synthesis.
arXiv Detail & Related papers (2023-06-12T08:11:06Z)
- Smaller Language Models are Better Black-box Machine-Generated Text Detectors [56.36291277897995]
Small and partially-trained models are better universal text detectors.
We find that whether the detector and generator were trained on the same data is not critically important to the detection success.
For instance, the OPT-125M model has an AUC of 0.81 in detecting ChatGPT generations, whereas a larger model from the GPT family, GPTJ-6B, has an AUC of 0.45.
arXiv Detail & Related papers (2023-05-17T00:09:08Z)
- Rethinking Semantic Segmentation: A Prototype View [126.59244185849838]
We present a nonparametric semantic segmentation model based on non-learnable prototypes.
Our framework yields compelling results over several datasets.
We expect this work will provoke a rethink of the current de facto semantic segmentation model design.
arXiv Detail & Related papers (2022-03-28T21:15:32Z)
- Deformable ProtoPNet: An Interpretable Image Classifier Using Deformable Prototypes [7.8515366468594765]
We present a deformable part network (Deformable ProtoPNet) that integrates the power of deep learning and the interpretability of case-based reasoning.
This model classifies input images by comparing them with prototypes learned during training, yielding explanations in the form of "this looks like that".
arXiv Detail & Related papers (2021-11-29T22:38:13Z)
- Dual Prototypical Contrastive Learning for Few-shot Semantic Segmentation [55.339405417090084]
We propose a dual prototypical contrastive learning approach tailored to the few-shot semantic segmentation (FSS) task.
The main idea is to make the prototypes more discriminative by increasing inter-class distance while reducing intra-class distance in prototype feature space.
We demonstrate that the proposed dual contrastive learning approach outperforms state-of-the-art FSS methods on PASCAL-5i and COCO-20i datasets.
arXiv Detail & Related papers (2021-11-09T08:14:50Z)
- ProtoryNet - Interpretable Text Classification Via Prototype Trajectories [4.768286204382179]
We propose a novel interpretable deep neural network for text classification, called ProtoryNet.
ProtoryNet makes a prediction by finding the most similar prototype for each sentence in a text sequence.
After prototype pruning, the resulting ProtoryNet models need only around 20 prototypes or fewer across all datasets.
arXiv Detail & Related papers (2020-07-03T16:00:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.