Fine-grained Contrastive Learning for Definition Generation
- URL: http://arxiv.org/abs/2210.00543v1
- Date: Sun, 2 Oct 2022 14:55:01 GMT
- Title: Fine-grained Contrastive Learning for Definition Generation
- Authors: Hengyuan Zhang, Dawei Li, Shiping Yang, Yanran Li
- Abstract summary: Previous encoder-decoder models lack the representation learning needed to capture the full semantic components of the given word.
We propose a novel contrastive learning method, encouraging the model to capture more detailed semantic representations from the definition sequence encoding.
- Score: 10.549051541793544
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, pre-trained transformer-based models have achieved great
success in the task of definition generation (DG). However, previous
encoder-decoder models lack the representation learning needed to capture the
full semantic components of the given word, which leads them to generate
under-specific definitions. To address this problem, we propose a novel
contrastive learning method that encourages the model to capture more detailed
semantic representations from the definition sequence encoding. Under both
automatic and manual evaluation, experimental results on three mainstream
benchmarks demonstrate that the proposed method generates more specific and
higher-quality definitions than several state-of-the-art models.
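The abstract does not spell out the contrastive objective. A common instantiation for sequence encodings is an InfoNCE loss over in-batch positives and negatives; the sketch below is a minimal generic version, assuming mean-pooled encoder states as representations and a fixed temperature (both are assumptions, not details from the paper).

```python
import torch
import torch.nn.functional as F

def info_nce_loss(word_repr: torch.Tensor,
                  defn_repr: torch.Tensor,
                  temperature: float = 0.07) -> torch.Tensor:
    """Generic InfoNCE contrastive loss over a batch.

    word_repr: (B, d) encodings of the given words (anchors).
    defn_repr: (B, d) encodings of their definitions; row i is the positive
               for anchor i, and the other rows act as in-batch negatives.
    """
    anchors = F.normalize(word_repr, dim=-1)
    positives = F.normalize(defn_repr, dim=-1)
    logits = anchors @ positives.t() / temperature        # (B, B) cosine sims
    targets = torch.arange(logits.size(0), device=logits.device)
    return F.cross_entropy(logits, targets)               # diagonal = matches
```

In a DG setting this term would typically be added to the usual generation cross-entropy with a small weight, with defn_repr obtained by mean-pooling the hidden states of the definition sequence.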
Related papers
- Vector-Quantized Prompt Learning for Paraphrase Generation [18.40940464497253]
This paper proposes to generate diverse and high-quality paraphrases by exploiting pre-trained models with instance-dependent prompts.
Extensive experiments demonstrate that the proposed method achieves new state-of-the-art results on three benchmark datasets.
arXiv Detail & Related papers (2023-11-25T07:13:06Z)
- Precision-Recall Divergence Optimization for Generative Modeling with GANs and Normalizing Flows [54.050498411883495]
We develop a novel training method for generative models, such as Generative Adversarial Networks and Normalizing Flows.
We show that achieving a specified precision-recall trade-off corresponds to minimizing a unique $f$-divergence from a family we call the PR-divergences.
Our approach improves the performance of existing state-of-the-art models like BigGAN in terms of either precision or recall when tested on datasets such as ImageNet.
arXiv Detail & Related papers (2023-05-30T10:07:17Z)
- Sequence-to-Sequence Pre-training with Unified Modality Masking for Visual Document Understanding [3.185382039518151]
GenDoc is a sequence-to-sequence document understanding model pre-trained with unified masking across three modalities.
The proposed model utilizes an encoder-decoder architecture, which allows for increased adaptability to a wide range of downstream tasks.
arXiv Detail & Related papers (2023-05-16T15:25:19Z)
- Vector Quantized Wasserstein Auto-Encoder [57.29764749855623]
We study learning deep discrete representations from the generative viewpoint.
We impose discrete distributions over sequences of codewords and learn a deterministic decoder that transports this distribution to the data distribution.
We develop further theory connecting it with the clustering viewpoint of the Wasserstein (WS) distance, allowing a better and more controllable clustering solution.
arXiv Detail & Related papers (2023-02-12T13:51:36Z)
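The entry above rests on vector quantization: each continuous encoding is snapped to its nearest codeword. Below is a minimal generic sketch with the usual straight-through gradient; the codebook size and the straight-through trick are standard VQ-VAE choices, not necessarily the Wasserstein formulation this paper develops.

```python
import torch
import torch.nn as nn

class VectorQuantizer(nn.Module):
    """Nearest-codeword quantization with a straight-through gradient."""

    def __init__(self, num_codes: int = 512, dim: int = 64):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z: (B, dim). Squared distances to every codeword: (B, num_codes).
        d = (z.pow(2).sum(-1, keepdim=True)
             - 2 * z @ self.codebook.weight.t()
             + self.codebook.weight.pow(2).sum(-1))
        codes = d.argmin(dim=-1)                  # index of nearest codeword
        z_q = self.codebook(codes)                # quantized vectors
        # Straight-through: forward pass uses z_q, backward copies grads to z.
        return z + (z_q - z).detach()
```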
- Learning Disentangled Representations for Natural Language Definitions [0.0]
We argue that recurrent syntactic and semantic regularities in textual data can be used to provide the models with both structural biases and generative factors.
We leverage the semantic structures present in a representative and semantically dense category of sentence types, definitional sentences, for training a Variational Autoencoder to learn disentangled representations.
arXiv Detail & Related papers (2022-09-22T14:31:55Z)
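The blurb above does not state the training objective. A common generic recipe for encouraging disentanglement in a VAE is to up-weight the KL term, as in beta-VAE; the beta re-weighting below is an illustrative assumption, not necessarily what the authors use.

```python
import torch

def beta_vae_loss(recon_logprob: torch.Tensor,
                  mu: torch.Tensor,
                  logvar: torch.Tensor,
                  beta: float = 4.0) -> torch.Tensor:
    """Negative ELBO with a beta-weighted KL term.

    recon_logprob: (B,) log-likelihood of each reconstructed sentence.
    mu, logvar:    (B, z_dim) Gaussian posterior parameters.
    beta > 1 pushes the posterior toward the factorized prior, the usual
    generic lever for encouraging disentangled latent factors.
    """
    kl = 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1).sum(dim=-1)
    return (beta * kl - recon_logprob).mean()
```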
- Semantic Image Synthesis via Diffusion Models [159.4285444680301]
Denoising Diffusion Probabilistic Models (DDPMs) have achieved remarkable success in various image generation tasks.
Recent work on semantic image synthesis mainly follows the de facto standard of Generative Adversarial Nets (GANs).
arXiv Detail & Related papers (2022-06-30T18:31:51Z)
- Hierarchical Sketch Induction for Paraphrase Generation [79.87892048285819]
We introduce Hierarchical Refinement Quantized Variational Autoencoders (HRQ-VAE), a method for learning decompositions of dense encodings.
We use HRQ-VAE to encode the syntactic form of an input sentence as a path through the hierarchy, allowing us to more easily predict syntactic sketches at test time.
arXiv Detail & Related papers (2022-03-07T15:28:36Z)
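HRQ-VAE encodes syntax as a path through a hierarchy of codebooks. The sketch below shows the generic residual-quantization idea behind such hierarchies, where each level quantizes what the levels above left unexplained; the level count, codebook sizes, and residual scheme are illustrative assumptions rather than the exact HRQ-VAE formulation.

```python
import torch
import torch.nn as nn

class ResidualQuantizer(nn.Module):
    """Quantize a vector level by level; each level encodes the residual
    left by the levels above, so the indices form a coarse-to-fine path."""

    def __init__(self, levels: int = 3, num_codes: int = 256, dim: int = 64):
        super().__init__()
        self.books = nn.ModuleList(nn.Embedding(num_codes, dim)
                                   for _ in range(levels))

    def forward(self, z: torch.Tensor):
        residual, path, z_q = z, [], torch.zeros_like(z)
        for book in self.books:
            d = torch.cdist(residual, book.weight)   # (B, num_codes) distances
            idx = d.argmin(dim=-1)                   # nearest codeword per level
            step = book(idx)
            z_q = z_q + step                         # accumulate reconstruction
            residual = residual - step               # pass remainder downward
            path.append(idx)
        return z_q, torch.stack(path, dim=-1)        # quantized vec + code path
```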
- Entropy optimized semi-supervised decomposed vector-quantized variational autoencoder model based on transfer learning for multiclass text classification and generation [3.9318191265352196]
We propose a semi-supervised discrete latent variable model for multi-class text classification and text generation.
The proposed model employs the concept of transfer learning for training a quantized transformer model.
Experimental results indicate that the proposed model surpasses state-of-the-art models by a clear margin.
arXiv Detail & Related papers (2021-11-10T07:07:54Z)
- Improving Non-autoregressive Generation with Mixup Training [51.61038444990301]
We present a non-autoregressive generation model based on pre-trained transformer models.
We propose a simple and effective iterative training method called MIx Source and pseudo Target (MIST).
Our experiments on three generation benchmarks, including question generation, summarization, and paraphrase generation, show that the proposed framework achieves new state-of-the-art results.
arXiv Detail & Related papers (2021-10-21T13:04:21Z)
- Generating Diverse Descriptions from Semantic Graphs [38.28044884015192]
We present a graph-to-text model that incorporates a latent variable in an encoder-decoder model, and its use in an ensemble.
We evaluate the models on WebNLG datasets in English and Russian, and show that an ensemble of models produces diverse sets of generated sentences while retaining quality similar to state-of-the-art models.
arXiv Detail & Related papers (2021-08-12T11:00:09Z)
- Document Ranking with a Pretrained Sequence-to-Sequence Model [56.44269917346376]
We show how a sequence-to-sequence model can be trained to generate relevance labels as "target words".
Our approach significantly outperforms an encoder-only model in a data-poor regime.
arXiv Detail & Related papers (2020-03-14T22:29:50Z)
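The last entry is the T5-based reranking recipe: the model reads a query-document pair and the score is the probability it assigns to the relevance word "true". A minimal sketch with Hugging Face transformers follows; the "Query: ... Document: ... Relevant:" template and the "true"/"false" target words are from the paper, while the vanilla t5-base checkpoint stands in for the fine-tuned model.

```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Vanilla checkpoint as a stand-in; the paper fine-tunes T5 on MS MARCO.
tok = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base").eval()

def relevance_score(query: str, document: str) -> float:
    """Score a query-document pair as P('true') vs P('false') at the
    first decoding step, i.e. the relevance label as a target word."""
    prompt = f"Query: {query} Document: {document} Relevant:"
    enc = tok(prompt, return_tensors="pt", truncation=True)
    labels = tok("true", return_tensors="pt").input_ids    # teacher-forced target
    with torch.no_grad():
        logits = model(**enc, labels=labels).logits        # (1, T, vocab)
    first_step = logits[0, 0]                              # logits for step 1
    true_id, false_id = tok("true").input_ids[0], tok("false").input_ids[0]
    probs = torch.softmax(first_step[[true_id, false_id]], dim=-1)
    return probs[0].item()                                 # P(true | {true,false})
```

Documents are then reranked by this score, which, per the summary above, outperforms an encoder-only model in a data-poor regime.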
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.