Annotating FrameNet via Structure-Conditioned Language Generation
- URL: http://arxiv.org/abs/2406.04834v2
- Date: Tue, 25 Jun 2024 02:54:19 GMT
- Title: Annotating FrameNet via Structure-Conditioned Language Generation
- Authors: Xinyue Cui, Swabha Swayamdipta,
- Abstract summary: We propose a framework to produce novel frame-semantically annotated sentences following an overgenerate-and-filter approach.
Our results show that conditioning on rich, explicit semantic information tends to produce generations with high human acceptance.
- Score: 15.877232416259805
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite the remarkable generative capabilities of language models in producing naturalistic language, their effectiveness on explicit manipulation and generation of linguistic structures remain understudied. In this paper, we investigate the task of generating new sentences preserving a given semantic structure, following the FrameNet formalism. We propose a framework to produce novel frame-semantically annotated sentences following an overgenerate-and-filter approach. Our results show that conditioning on rich, explicit semantic information tends to produce generations with high human acceptance, under both prompting and finetuning. Our generated frame-semantic structured annotations are effective at training data augmentation for frame-semantic role labeling in low-resource settings; however, we do not see benefits under higher resource settings. Our study concludes that while generating high-quality, semantically rich data might be within reach, the downstream utility of such generations remains to be seen, highlighting the outstanding challenges with automating linguistic annotation tasks.
Related papers
- FS-RAG: A Frame Semantics Based Approach for Improved Factual Accuracy in Large Language Models [2.1484130681985047]
We present a novel extension to Retrieval Augmented Generation with the goal of mitigating factual inaccuracies in the output of large language models.
Our method draws on the cognitive linguistic theory of frame semantics for the indexing and retrieval of factual information relevant to helping large language models answer queries.
arXiv Detail & Related papers (2024-06-23T17:18:19Z) - Punctuation Restoration Improves Structure Understanding without
Supervision [6.4736137270915215]
We show that punctuation restoration as a learning objective improves in- and out-of-distribution performance on structure-related tasks.
Punctuation restoration is an effective learning objective that can improve structure understanding and yield a more robust structure-aware representations of natural language.
arXiv Detail & Related papers (2024-02-13T11:22:52Z) - Instruct-SCTG: Guiding Sequential Controlled Text Generation through
Instructions [42.67608830386934]
Instruct-SCTG is a sequential framework that harnesses instruction-tuned language models to generate structurally coherent text.
Our framework generates articles in a section-by-section manner, aligned with the desired human structure using natural language instructions.
arXiv Detail & Related papers (2023-12-19T16:20:49Z) - Sentence Representation Learning with Generative Objective rather than
Contrastive Objective [86.01683892956144]
We propose a novel generative self-supervised learning objective based on phrase reconstruction.
Our generative learning achieves powerful enough performance improvement and outperforms the current state-of-the-art contrastive methods.
arXiv Detail & Related papers (2022-10-16T07:47:46Z) - Model Criticism for Long-Form Text Generation [113.13900836015122]
We apply a statistical tool, model criticism in latent space, to evaluate the high-level structure of generated text.
We perform experiments on three representative aspects of high-level discourse -- coherence, coreference, and topicality.
We find that transformer-based language models are able to capture topical structures but have a harder time maintaining structural coherence or modeling coreference.
arXiv Detail & Related papers (2022-10-16T04:35:58Z) - Controllable Text Generation for Open-Domain Creativity and Fairness [36.744208990024575]
I introduce our recent works on controllable text generation to enhance the creativity and fairness of language generation models.
We explore hierarchical generation and constrained decoding, with applications to creative language generation including story, poetry, and figurative languages.
arXiv Detail & Related papers (2022-09-24T22:40:01Z) - Transferring Semantic Knowledge Into Language Encoders [6.85316573653194]
We introduce semantic form mid-tuning, an approach for transferring semantic knowledge from semantic meaning representations into language encoders.
We show that this alignment can be learned implicitly via classification or directly via triplet loss.
Our method yields language encoders that demonstrate improved predictive performance across inference, reading comprehension, textual similarity, and other semantic tasks.
arXiv Detail & Related papers (2021-10-14T14:11:12Z) - Long Text Generation by Modeling Sentence-Level and Discourse-Level
Coherence [59.51720326054546]
We propose a long text generation model, which can represent the prefix sentences at sentence level and discourse level in the decoding process.
Our model can generate more coherent texts than state-of-the-art baselines.
arXiv Detail & Related papers (2021-05-19T07:29:08Z) - Infusing Finetuning with Semantic Dependencies [62.37697048781823]
We show that, unlike syntax, semantics is not brought to the surface by today's pretrained models.
We then use convolutional graph encoders to explicitly incorporate semantic parses into task-specific finetuning.
arXiv Detail & Related papers (2020-12-10T01:27:24Z) - SLM: Learning a Discourse Language Representation with Sentence
Unshuffling [53.42814722621715]
We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation.
We show that this feature of our model improves the performance of the original BERT by large margins.
arXiv Detail & Related papers (2020-10-30T13:33:41Z) - Improving Adversarial Text Generation by Modeling the Distant Future [155.83051741029732]
We consider a text planning scheme and present a model-based imitation-learning approach to alleviate the aforementioned issues.
We propose a novel guider network to focus on the generative process over a longer horizon, which can assist next-word prediction and provide intermediate rewards for generator optimization.
arXiv Detail & Related papers (2020-05-04T05:45:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.