Unsupervised Chunking with Hierarchical RNN
- URL: http://arxiv.org/abs/2309.04919v1
- Date: Sun, 10 Sep 2023 02:55:12 GMT
- Title: Unsupervised Chunking with Hierarchical RNN
- Authors: Zijun Wu, Anup Anand Deshmukh, Yongkang Wu, Jimmy Lin, Lili Mou
- Abstract summary: This paper introduces an unsupervised approach to chunking, a syntactic task that involves grouping words in a non-hierarchical manner.
We present a two-layer Hierarchical Recurrent Neural Network (HRNN) designed to model word-to-chunk and chunk-to-sentence compositions.
Experiments on the CoNLL-2000 dataset reveal a notable improvement over existing unsupervised methods, enhancing phrase F1 score by up to 6 percentage points.
- Score: 62.15060807493364
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: In Natural Language Processing (NLP), predicting linguistic structures, such
as parsing and chunking, has mostly relied on manual annotations of syntactic
structures. This paper introduces an unsupervised approach to chunking, a
syntactic task that involves grouping words in a non-hierarchical manner. We
present a two-layer Hierarchical Recurrent Neural Network (HRNN) designed to
model word-to-chunk and chunk-to-sentence compositions. Our approach involves a
two-stage training process: pretraining with an unsupervised parser and
finetuning on downstream NLP tasks. Experiments on the CoNLL-2000 dataset
reveal a notable improvement over existing unsupervised methods, enhancing
phrase F1 score by up to 6 percentage points. Further, finetuning with
downstream tasks results in an additional performance improvement.
Interestingly, we observe that the emergence of the chunking structure is
transient during the neural model's downstream-task training. This study
contributes to the advancement of unsupervised syntactic structure discovery
and opens avenues for further research in linguistic theory.
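The abstract only names the two compositions (word-to-chunk and chunk-to-sentence) and the two-stage training; the paper's actual architecture is not spelled out here. Below is a minimal sketch, assuming a soft chunk-boundary gate between a word-level and a chunk-level GRU; the class name HRNNSketch, the use of GRU cells, the gating scheme, and all dimensions are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of a two-layer hierarchical RNN for chunking.
# Assumption: soft boundary gating between the two layers; the paper's
# exact formulation may differ.
import torch
import torch.nn as nn

class HRNNSketch(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Lower layer: composes words into a chunk representation.
        self.word_rnn = nn.GRUCell(emb_dim, hidden_dim)
        # Upper layer: composes chunk representations into a sentence state.
        self.chunk_rnn = nn.GRUCell(hidden_dim, hidden_dim)
        # Predicts the probability that a chunk boundary follows each word.
        self.boundary = nn.Linear(hidden_dim, 1)

    def forward(self, token_ids):
        # token_ids: (seq_len,) for a single sentence, kept unbatched for clarity.
        hidden_dim = self.word_rnn.hidden_size
        word_h = torch.zeros(1, hidden_dim)
        chunk_h = torch.zeros(1, hidden_dim)
        boundary_probs = []
        for emb in self.embed(token_ids):
            word_h = self.word_rnn(emb.unsqueeze(0), word_h)
            p = torch.sigmoid(self.boundary(word_h))  # P(boundary after this word)
            boundary_probs.append(p)
            # Soft update: on a boundary, pass the chunk summary upward and
            # reset the word-level state; otherwise keep accumulating words.
            chunk_h = p * self.chunk_rnn(word_h, chunk_h) + (1 - p) * chunk_h
            word_h = (1 - p) * word_h
        return torch.cat(boundary_probs).squeeze(-1), chunk_h

# Usage (hypothetical token ids): per-word boundary probabilities plus a
# sentence-level state.
model = HRNNSketch(vocab_size=10000)
probs, sentence_state = model(torch.tensor([3, 17, 42, 8]))
```

Under these assumptions, the boundary probabilities define the induced chunking; the paper's two-stage training (pretraining with an unsupervised parser, then finetuning on downstream tasks) would shape these boundaries rather than supervise them with gold labels.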
Related papers
- Linguistic Structure Induction from Language Models [1.8130068086063336]
This thesis focuses on producing constituency and dependency structures from Language Models (LMs) in an unsupervised setting.
I present a detailed study on StructFormer (SF), which retrofits a transformer architecture with an encoder network to produce constituency and dependency structures.
I present six experiments to analyze and address this field's challenges.
arXiv Detail & Related papers (2024-03-11T16:54:49Z) - Topic-driven Distant Supervision Framework for Macro-level Discourse
Parsing [72.14449502499535]
The task of analyzing the internal rhetorical structure of texts is a challenging problem in natural language processing.
Despite the recent advances in neural models, the lack of large-scale, high-quality corpora for training remains a major obstacle.
Recent studies have attempted to overcome this limitation by using distant supervision.
arXiv Detail & Related papers (2023-05-23T07:13:51Z) - Autoregressive Structured Prediction with Language Models [73.11519625765301]
We describe an approach to model structures as sequences of actions in an autoregressive manner with PLMs.
Our approach achieves the new state-of-the-art on all the structured prediction tasks we looked at.
arXiv Detail & Related papers (2022-10-26T13:27:26Z) - Word Sense Induction with Hierarchical Clustering and Mutual Information
Maximization [14.997937028599255]
Word sense induction is a difficult problem in natural language processing.
We propose a novel unsupervised method based on hierarchical clustering and invariant information clustering.
We empirically demonstrate that, in certain cases, our approach outperforms prior WSI state-of-the-art methods.
arXiv Detail & Related papers (2022-10-11T13:04:06Z) - Co-training an Unsupervised Constituency Parser with Weak Supervision [33.63314110665062]
We introduce a method for unsupervised parsing that relies on bootstrapping classifiers to identify if a node dominates a specific span in a sentence.
We show that the interplay between them helps improve the accuracy of both and, as a result, yields effective parses.
arXiv Detail & Related papers (2021-10-05T18:45:06Z) - Randomized Deep Structured Prediction for Discourse-Level Processing [45.725437752821655]
Expressive text encoders have been at the center of NLP models in recent work.
We show that we can efficiently leverage deep structured prediction and expressive neural encoders for a set of tasks involving complicated argumentative structures.
arXiv Detail & Related papers (2021-01-25T21:49:32Z) - Infusing Finetuning with Semantic Dependencies [62.37697048781823]
We show that, unlike syntax, semantics is not brought to the surface by today's pretrained models.
We then use convolutional graph encoders to explicitly incorporate semantic parses into task-specific finetuning.
arXiv Detail & Related papers (2020-12-10T01:27:24Z) - SLM: Learning a Discourse Language Representation with Sentence
Unshuffling [53.42814722621715]
We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation.
We show that this feature of our model improves the performance of the original BERT by large margins.
arXiv Detail & Related papers (2020-10-30T13:33:41Z) - Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)