Punctuation Restoration Improves Structure Understanding without
Supervision
- URL: http://arxiv.org/abs/2402.08382v2
- Date: Wed, 21 Feb 2024 08:35:57 GMT
- Title: Punctuation Restoration Improves Structure Understanding without
Supervision
- Authors: Junghyun Min, Minho Lee, Woochul Lee, Yeonsoo Lee
- Abstract summary: We show that punctuation restoration as a learning objective improves in- and out-of-distribution performance on structure-related tasks.
Punctuation restoration is an effective learning objective that can improve structure understanding and yield more robust, structure-aware representations of natural language.
- Score: 6.4736137270915215
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised learning objectives like language modeling and de-noising
constitute a significant part of producing pre-trained models that support
various downstream applications, from natural language understanding to
conversational tasks. However, despite the impressive generative capabilities of
recent large language models, their abilities to capture syntactic or semantic
structure within text lag behind. We hypothesize that the mismatch between
linguistic performance and competence in machines is attributable to
insufficient transfer of linguistic structure knowledge to computational
systems with currently popular pre-training objectives. We show that
punctuation restoration as a learning objective improves in- and
out-of-distribution performance on structure-related tasks like named entity
recognition, open information extraction, chunking, and part-of-speech tagging.
Punctuation restoration is an effective learning objective that can improve
structure understanding and yield more robust, structure-aware representations
of natural language.
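To make the objective concrete, the following is a minimal illustrative sketch of how punctuation-restoration training pairs might be constructed: punctuation is stripped from a sentence and each remaining token is labeled with the mark that originally followed it. The label scheme, punctuation inventory, and function name (make_example) are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch (assumption, not the paper's code): build punctuation
# restoration training pairs by stripping sentence punctuation and labeling
# each remaining token with the punctuation mark that originally followed it.

PUNCT = {",", ".", "?", "!", ";", ":"}  # hypothetical punctuation inventory

def make_example(sentence: str):
    """Turn a punctuated sentence into (stripped tokens, per-token labels)."""
    tokens, labels = [], []
    for tok in sentence.split():
        # Peel one trailing punctuation mark off the word, if present.
        if tok and tok[-1] in PUNCT:
            word, punct = tok[:-1], tok[-1]
        else:
            word, punct = tok, "O"  # "O" = no punctuation follows this token
        if word:  # skip stand-alone punctuation tokens
            tokens.append(word)
            labels.append(punct)
    return tokens, labels

if __name__ == "__main__":
    tokens, labels = make_example("However, the model lags behind; structure matters.")
    print(tokens)  # ['However', 'the', 'model', 'lags', 'behind', 'structure', 'matters']
    print(labels)  # [',', 'O', 'O', 'O', ';', 'O', '.']
```

In practice, such pairs could supervise a token-classification head or a text-to-text objective on top of a pre-trained encoder; the snippet only illustrates the data side of the objective.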
Related papers
- Annotating FrameNet via Structure-Conditioned Language Generation [15.877232416259805]
We propose a framework to produce novel frame-semantically annotated sentences following an overgenerate-and-filter approach.
Our results show that conditioning on rich, explicit semantic information tends to produce generations with high human acceptance.
arXiv Detail & Related papers (2024-06-07T11:01:15Z)
- Prompting Language Models for Linguistic Structure [73.11488464916668]
We present a structured prompting approach for linguistic structured prediction tasks.
We evaluate this approach on part-of-speech tagging, named entity recognition, and sentence chunking.
We find that while PLMs contain significant prior knowledge of task labels due to task leakage into the pretraining corpus, structured prompting can also retrieve linguistic structure with arbitrary labels.
arXiv Detail & Related papers (2022-11-15T01:13:39Z)
- Emergent Linguistic Structures in Neural Networks are Fragile [20.692540987792732]
Large Language Models (LLMs) have been reported to have strong performance on natural language processing tasks.
We propose a framework to assess the consistency and robustness of linguistic representations.
arXiv Detail & Related papers (2022-10-31T15:43:57Z)
- An Empirical Revisiting of Linguistic Knowledge Fusion in Language Understanding Tasks [33.765874588342285]
Infusing language models with syntactic or semantic knowledge from structural linguistic priors has shown improvements on many language understanding tasks.
We conduct an empirical study of replacing parsed graphs or trees with trivial ones for tasks in the GLUE benchmark.
It reveals that the gains may be attributable not to explicit linguistic priors but rather to the additional feature interactions introduced by the fusion layers.
arXiv Detail & Related papers (2022-10-24T07:47:32Z)
- Sentence Representation Learning with Generative Objective rather than Contrastive Objective [86.01683892956144]
We propose a novel generative self-supervised learning objective based on phrase reconstruction.
Our generative objective achieves strong performance improvements and outperforms current state-of-the-art contrastive methods.
arXiv Detail & Related papers (2022-10-16T07:47:46Z)
- DeepStruct: Pretraining of Language Models for Structure Prediction [64.84144849119554]
We pretrain language models on a collection of task-agnostic corpora to generate structures from text.
Our structure pretraining enables zero-shot transfer of the learned knowledge that models have about the structure tasks.
We show that a 10B parameter language model transfers non-trivially to most tasks and obtains state-of-the-art performance on 21 of 28 datasets.
arXiv Detail & Related papers (2022-05-21T00:58:22Z)
- ERICA: Improving Entity and Relation Understanding for Pre-trained Language Models via Contrastive Learning [97.10875695679499]
We propose a novel contrastive learning framework named ERICA, used in the pre-training phase to obtain a deeper understanding of entities and their relations in text.
Experimental results demonstrate that our proposed ERICA framework achieves consistent improvements on several document-level language understanding tasks.
arXiv Detail & Related papers (2020-12-30T03:35:22Z)
- Retrofitting Structure-aware Transformer Language Model for End Tasks [34.74181162627023]
We consider retrofitting a structure-aware Transformer language model to facilitate end tasks.
A middle-layer structural learning strategy is leveraged for structure integration.
Experimental results show that the retrofitted structure-aware Transformer language model achieves improved perplexity.
arXiv Detail & Related papers (2020-09-16T01:07:07Z)
- Semantics-Aware Inferential Network for Natural Language Understanding [79.70497178043368]
We propose a Semantics-Aware Inferential Network (SAIN) for natural language understanding.
Taking explicit contextualized semantics as a complementary input, the inferential module of SAIN enables a series of reasoning steps over semantic clues.
Our model achieves significant improvement on 11 tasks including machine reading comprehension and natural language inference.
arXiv Detail & Related papers (2020-04-28T07:24:43Z)
- Probing Linguistic Features of Sentence-Level Representations in Neural Relation Extraction [80.38130122127882]
We introduce 14 probing tasks targeting linguistic properties relevant to neural relation extraction (RE).
We use them to study representations learned by more than 40 different combinations of encoder architectures and linguistic features, trained on two datasets.
We find that the bias induced by the architecture and the inclusion of linguistic features are clearly expressed in the probing task performance.
arXiv Detail & Related papers (2020-04-17T09:17:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.