CDM: Combining Extraction and Generation for Definition Modeling
- URL: http://arxiv.org/abs/2111.07267v1
- Date: Sun, 14 Nov 2021 08:03:18 GMT
- Title: CDM: Combining Extraction and Generation for Definition Modeling
- Authors: Jie Huang, Hanyin Shao, Kevin Chen-Chuan Chang
- Abstract summary: We propose to combine extraction and generation for definition modeling.
We first extract self- and correlative definitional information of target terms from the Web,
and then generate the final definitions by incorporating the extracted definitional information.
- Score: 8.487707405248242
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Definitions are essential for term understanding. Recently, there has been
increasing interest in extracting and generating definitions of terms
automatically. However, existing approaches for this task are either extractive
or abstractive: definitions are either extracted from a corpus or generated by
a language generation model. In this paper, we propose to combine extraction
and generation for definition modeling: first extract self- and correlative
definitional information of target terms from the Web and then generate the
final definitions by incorporating the extracted definitional information.
Experiments demonstrate that our framework can generate high-quality definitions
for technical terms and significantly outperform state-of-the-art models for
definition modeling.
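As a rough illustration of this extract-then-generate pipeline, the sketch below pattern-matches definitional sentences from a handful of in-memory "web snippets" and packs them into an input string for a seq2seq definition generator. The snippets, the "is-a" pattern, and the input format are all assumptions for illustration, not the paper's actual extraction or generation components.

```python
import re

# Hypothetical in-memory "web snippets"; in the paper's setting these would be
# retrieved search results for the target term.
SNIPPETS = [
    "A transformer is a deep learning model that relies on self-attention.",
    "Transformers were introduced in 2017.",
    "An autoencoder is a neural network that learns to copy its input.",
]

# A simple "is-a" pattern that often signals definitional sentences.
DEF_PATTERNS = [r"\b{t}s? (is|are) (a|an|the)\b"]

def extract_definitional(term: str, snippets):
    """Keep sentences that match a definitional pattern for `term`."""
    hits = []
    for s in snippets:
        for p in DEF_PATTERNS:
            if re.search(p.format(t=re.escape(term)), s, flags=re.I):
                hits.append(s)
    return hits

def build_generator_input(term: str, evidence):
    """Concatenate term and extracted evidence as input to a seq2seq definer."""
    return f"define: {term} context: " + " ".join(evidence)

evidence = extract_definitional("transformer", SNIPPETS)
print(build_generator_input("transformer", evidence))
# The string above would be fed to a fine-tuned encoder-decoder model
# (e.g., BART/T5) that generates the final definition.
```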
Related papers
- Domain Embeddings for Generating Complex Descriptions of Concepts in Italian Language [65.268245109828]
We propose a Distributional Semantic resource enriched with linguistic and lexical information extracted from electronic dictionaries.
The resource comprises 21 domain-specific matrices, one comprehensive matrix, and a Graphical User Interface.
Our model facilitates the generation of reasoned semantic descriptions of concepts by selecting matrices directly associated with concrete conceptual knowledge.
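A minimal sketch of how one of these domain-specific distributional matrices might be queried for concept description: the vocabulary, matrix values, and single "physics" domain below are invented stand-ins for the resource's 21 matrices.

```python
import numpy as np

# Toy stand-ins for domain-specific distributional matrices:
# rows are words, columns are context dimensions (values are invented).
vocab = ["atomo", "elettrone", "molecola"]
domain_matrices = {
    "physics": np.array([[0.9, 0.1, 0.0],
                         [0.8, 0.2, 0.1],
                         [0.4, 0.5, 0.2]]),
}

def neighbors(word, domain, k=2):
    """Rank vocabulary by cosine similarity to `word` inside one domain matrix."""
    M = domain_matrices[domain]
    v = M[vocab.index(word)]
    sims = M @ v / (np.linalg.norm(M, axis=1) * np.linalg.norm(v) + 1e-9)
    order = np.argsort(-sims)
    return [(vocab[i], float(sims[i])) for i in order[:k]]

# Selecting the matrix associated with the relevant domain grounds the
# generated description in that domain's conceptual knowledge.
print(neighbors("atomo", "physics"))
```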
arXiv Detail & Related papers (2024-02-26T15:04:35Z)
- Exploiting Contextual Target Attributes for Target Sentiment Classification [53.30511968323911]
Existing PTLM-based models for TSC can be categorized into two groups: 1) fine-tuning-based models that adopt PTLM as the context encoder; 2) prompting-based models that transfer the classification task to the text/word generation task.
We present a new perspective on leveraging PTLMs for TSC: simultaneously exploiting the merits of both language modeling and explicit target-context interactions via contextual target attributes.
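To make the prompting-based framing concrete, here is a hedged sketch of how target attributes could be folded into a cloze-style prompt; the template and the attribute list are assumptions, not the paper's actual design.

```python
# Hypothetical prompt template; the paper's exact template may differ.
def build_tsc_prompt(sentence, target, attributes):
    attr = ", ".join(attributes)
    return (f"Sentence: {sentence}\n"
            f"Target: {target} (attributes: {attr})\n"
            f"The sentiment toward {target} is [MASK].")

print(build_tsc_prompt(
    "The battery life is great but the screen scratches easily.",
    "battery life",
    ["durability", "power"],   # contextual target attributes (invented here)
))
# A masked LM would then score candidate fillers such as
# "positive"/"negative"/"neutral" for the [MASK] slot.
```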
arXiv Detail & Related papers (2023-12-21T11:45:28Z)
- "Definition Modeling: To model definitions." Generating Definitions With Little to No Semantics [0.4061135251278187]
We present evidence that the task may not involve as much semantics as one might expect.
We show that an earlier model from the literature is rather insensitive to semantic aspects such as explicit polysemy.
arXiv Detail & Related papers (2023-06-14T11:08:38Z)
- Exploiting Correlations Between Contexts and Definitions with Multiple Definition Modeling [13.608157331662026]
Single Definition Modeling (SDM) does not adequately model the correlations and patterns among different contexts and definitions of words.
In this paper, we design a new task called Multiple Definition Modeling (MDM) that pools together all contexts and definitions of target words.
We demonstrate and analyze the benefits of MDM, including improved SDM performance when MDM is used as a pretraining task, and comparable performance in the zero-shot setting.
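A sketch of the data difference between the two setups: the record schema and separator below are assumptions, but they show how MDM pools all of a word's contexts into one encoder input where SDM would use a single context-definition pair.

```python
# Hypothetical record formats: SDM pairs one context with one definition,
# while MDM pools all contexts and definitions of a target word.
sdm_example = {"word": "bank",
               "context": "She sat on the bank of the river.",
               "definition": "the land alongside a river or lake"}

mdm_example = {
    "word": "bank",
    "contexts": ["She sat on the bank of the river.",
                 "He deposited cash at the bank."],
    "definitions": ["the land alongside a river or lake",
                    "a financial institution that accepts deposits"],
}

def mdm_input(ex, sep=" </s> "):
    """Serialize pooled contexts into a single encoder input string."""
    return ex["word"] + sep + sep.join(ex["contexts"])

print(mdm_input(mdm_example))
```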
arXiv Detail & Related papers (2023-05-24T04:38:29Z)
- Model Criticism for Long-Form Text Generation [113.13900836015122]
We apply a statistical tool, model criticism in latent space, to evaluate the high-level structure of generated text.
We perform experiments on three representative aspects of high-level discourse -- coherence, coreference, and topicality.
We find that transformer-based language models are able to capture topical structures but have a harder time maintaining structural coherence or modeling coreference.
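As a minimal sketch of criticism in latent space, the snippet below compares reference and generated documents via a simple linear-kernel MMD statistic on their latent encodings; the random vectors stand in for real encodings, and the choice of statistic is an assumption rather than the paper's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for latent encodings of reference vs model-generated documents;
# in practice these would come from encoding texts into a learned latent space.
z_ref = rng.normal(0.0, 1.0, size=(200, 16))
z_gen = rng.normal(0.3, 1.0, size=(200, 16))   # shifted: a structural mismatch

def mmd_linear(x, y):
    """Linear-kernel MMD^2 between two latent samples (= ||mean_x - mean_y||^2)."""
    mx, my = x.mean(0), y.mean(0)
    return float(mx @ mx + my @ my - 2 * mx @ my)

# Larger values suggest the generator's high-level structure deviates
# from the reference distribution in latent space.
print(f"MMD^2 ~= {mmd_linear(z_ref, z_gen):.4f}")
```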
arXiv Detail & Related papers (2022-10-16T04:35:58Z)
- Fine-grained Contrastive Learning for Definition Generation [10.549051541793544]
Previous encoder-decoder models lack effective representation learning to capture the full semantic components of the given word.
We propose a novel contrastive learning method, encouraging the model to capture more detailed semantic representations from the definition sequence encoding.
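For orientation, here is a generic InfoNCE-style contrastive loss over word and definition encodings; the paper's fine-grained objective may differ, and the batch tensors here are random stand-ins.

```python
import torch
import torch.nn.functional as F

def info_nce(word_emb, def_emb, temperature=0.07):
    """InfoNCE: each word embedding should match its own definition encoding
    (diagonal) against all other definitions in the batch (off-diagonal)."""
    w = F.normalize(word_emb, dim=-1)
    d = F.normalize(def_emb, dim=-1)
    logits = w @ d.T / temperature        # (B, B) similarity matrix
    targets = torch.arange(w.size(0))     # positives on the diagonal
    return F.cross_entropy(logits, targets)

# Toy batch: 4 words, 32-dim encodings (random stand-ins).
loss = info_nce(torch.randn(4, 32), torch.randn(4, 32))
print(loss.item())
```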
arXiv Detail & Related papers (2022-10-02T14:55:01Z)
- COMPILING: A Benchmark Dataset for Chinese Complexity Controllable Definition Generation [2.935516292500541]
This paper proposes a novel task of generating definitions for a word with controllable complexity levels.
We introduce COMPILING, a dataset providing detailed information about Chinese definitions, in which each definition is labeled with its complexity level.
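One common way to condition generation on a complexity level is a control token prefixed to the input; the record schema, complexity scale, and token format below are assumptions, not the dataset's actual layout.

```python
# Hypothetical serialization of a COMPILING-style record.
record = {"word": "光合作用", "complexity": 3,
          "definition": "绿色植物利用光能把二氧化碳和水合成有机物的过程"}

def to_controlled_input(rec):
    """Prefix the target word with a complexity control token so a
    seq2seq model can condition generation on the desired level."""
    return f"<complexity={rec['complexity']}> define: {rec['word']}"

print(to_controlled_input(record))
# Training pairs would then be: (controlled input, rec["definition"])
```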
arXiv Detail & Related papers (2022-09-29T08:17:53Z)
- Automatic Concept Extraction for Concept Bottleneck-based Video Classification [58.11884357803544]
We present an automatic Concept Discovery and Extraction module that rigorously composes a necessary and sufficient set of concept abstractions for concept-based video classification.
Our method elicits inherent complex concept abstractions in natural language to generalize concept-bottleneck methods to complex tasks.
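For context, a concept bottleneck routes predictions through an interpretable concept layer; the minimal head below illustrates that structure, while the concept discovery step (the paper's contribution) is not reproduced, and all dimensions are arbitrary.

```python
import torch
import torch.nn as nn

class ConceptBottleneck(nn.Module):
    """Minimal concept-bottleneck head: video features -> concept scores ->
    class logits. Which concepts to use is decided by the discovery module
    in the paper and is not modeled here."""
    def __init__(self, feat_dim=512, n_concepts=20, n_classes=5):
        super().__init__()
        self.to_concepts = nn.Linear(feat_dim, n_concepts)  # concept predictor
        self.to_classes = nn.Linear(n_concepts, n_classes)  # label from concepts

    def forward(self, feats):
        concepts = torch.sigmoid(self.to_concepts(feats))   # interpretable layer
        return self.to_classes(concepts), concepts

logits, concepts = ConceptBottleneck()(torch.randn(2, 512))
print(logits.shape, concepts.shape)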
arXiv Detail & Related papers (2022-06-21T06:22:35Z)
- Compositional Visual Generation with Composable Diffusion Models [80.75258849913574]
We propose an alternative structured approach for compositional generation using diffusion models.
An image is generated by composing a set of diffusion models, with each of them modeling a certain component of the image.
The proposed method can generate scenes at test time that are substantially more complex than those seen in training.
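A sketch of the core composition step: per-component noise estimates are combined in a classifier-free-guidance style conjunction at each diffusion step. The tensors below are random stand-ins for denoiser outputs, and the guidance weights are illustrative.

```python
import torch

def composed_noise(eps_uncond, eps_conds, weights):
    """Compose component models by adding weighted differences between each
    conditional noise estimate and the unconditional one (an AND-style
    conjunction of concepts)."""
    eps = eps_uncond.clone()
    for eps_c, w in zip(eps_conds, weights):
        eps = eps + w * (eps_c - eps_uncond)
    return eps

# Toy tensors standing in for denoiser outputs at one diffusion step.
e0 = torch.randn(1, 3, 8, 8)
e1, e2 = torch.randn_like(e0), torch.randn_like(e0)  # e.g. "a boat", "at night"
print(composed_noise(e0, [e1, e2], weights=[1.5, 1.5]).shape)
```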
arXiv Detail & Related papers (2022-06-03T17:47:04Z)
- VCDM: Leveraging Variational Bi-encoding and Deep Contextualized Word Representations for Improved Definition Modeling [24.775371434410328]
We tackle the task of definition modeling, where the goal is to learn to generate definitions of words and phrases.
Existing approaches for this task are discriminative, combining distributional and lexical semantics in an implicit rather than direct way.
We propose a generative model for the task, introducing a continuous latent variable to explicitly model the underlying relationship between a phrase used within a context and its definition.
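The continuous latent variable is the standard VAE ingredient; a minimal sketch of the reparameterization trick and the two ELBO terms follows, with toy shapes standing in for the model's actual encoders and decoder.

```python
import torch
import torch.nn.functional as F

def reparameterize(mu, logvar):
    """Sample z ~ N(mu, sigma^2) with the reparameterization trick."""
    return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)

def neg_elbo(recon_logits, targets, mu, logvar):
    """Negative ELBO = reconstruction loss + KL(q(z|x) || N(0, I))."""
    rec = F.cross_entropy(recon_logits, targets)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kl

# Toy shapes: 4 examples, 16-dim latent, vocabulary of 100 tokens.
mu, logvar = torch.randn(4, 16), torch.randn(4, 16)
z = reparameterize(mu, logvar)   # latent linking context and definition
loss = neg_elbo(torch.randn(4, 100), torch.randint(0, 100, (4,)), mu, logvar)
print(z.shape, loss.item())
```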
arXiv Detail & Related papers (2020-10-07T02:48:44Z)
- Closed-Form Factorization of Latent Semantics in GANs [65.42778970898534]
A rich set of interpretable dimensions has been shown to emerge in the latent space of Generative Adversarial Networks (GANs) trained for synthesizing images.
In this work, we examine the internal representation learned by GANs to reveal the underlying variation factors in an unsupervised manner.
We propose a closed-form factorization algorithm for latent semantic discovery by directly decomposing the pre-trained weights.
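The closed-form idea reduces to an eigen-decomposition of the first projection weight: semantic directions are the top eigenvectors of A^T A. The sketch below uses a random matrix in place of a real pre-trained GAN weight.

```python
import numpy as np

# Closed-form factorization sketch: interpretable latent directions are
# the top eigenvectors of A^T A, where A is the pre-trained weight that
# first projects the latent code. A random A stands in for a real GAN here.
rng = np.random.default_rng(0)
A = rng.normal(size=(512, 128))            # output_dim x latent_dim

eigvals, eigvecs = np.linalg.eigh(A.T @ A)
directions = eigvecs[:, ::-1]              # sort: largest eigenvalue first

def edit(z, k, alpha):
    """Move a latent code along the k-th discovered semantic direction."""
    return z + alpha * directions[:, k]

z = rng.normal(size=128)
print(edit(z, k=0, alpha=3.0).shape)
```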
arXiv Detail & Related papers (2020-07-13T18:05:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.