Improving Unsupervised Constituency Parsing via Maximizing Semantic Information
- URL: http://arxiv.org/abs/2410.02558v2
- Date: Fri, 31 Jan 2025 11:47:07 GMT
- Title: Improving Unsupervised Constituency Parsing via Maximizing Semantic Information
- Authors: Junjie Chen, Xiangheng He, Yusuke Miyao, Danushka Bollegala
- Abstract summary: Unsupervised constituency parsers organize phrases within a sentence into a tree-shaped syntactic constituent structure.
The traditional objective of maximizing sentence log-likelihood (LL) does not explicitly account for the close relationship between the constituent structure and the semantics.
We introduce a novel objective for training unsupervised parsers: maximizing the information between constituent structures and sentence semantics (SemInfo).
- Score: 35.63321102040579
- Abstract: Unsupervised constituency parsers organize phrases within a sentence into a tree-shaped syntactic constituent structure that reflects the organization of sentence semantics. However, the traditional objective of maximizing sentence log-likelihood (LL) does not explicitly account for the close relationship between the constituent structure and the semantics, resulting in a weak correlation between LL values and parsing accuracy. In this paper, we introduce a novel objective for training unsupervised parsers: maximizing the information between constituent structures and sentence semantics (SemInfo). We introduce a bag-of-substrings model to represent the semantics and apply the probability-weighted information metric to estimate the SemInfo. Additionally, we develop a Tree Conditional Random Field (TreeCRF)-based model to apply the SemInfo maximization objective to Probabilistic Context-Free Grammar (PCFG) induction, the state-of-the-art method for unsupervised constituency parsing. Experiments demonstrate that SemInfo correlates more strongly with parsing accuracy than LL. Our algorithm significantly enhances parsing accuracy by an average of 7.85 points across five PCFG variants and in four languages, achieving new state-of-the-art results in three of the four languages.
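The abstract names three ingredients: a bag-of-substrings representation of semantics, a probability-weighted information estimate, and a TreeCRF that applies the objective to PCFG induction. The following is a minimal sketch of the first two ingredients only; the word n-gram bag, the self-information weighting, and the helper names are illustrative assumptions, not the paper's exact estimator.

```python
from collections import Counter
from math import log

def bag_of_substrings(sentence, max_len=4):
    """Toy bag-of-substrings semantics: all word n-grams up to max_len."""
    words = sentence.split()
    return Counter(
        " ".join(words[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(words) - n + 1)
    )

def seminfo_score(tree_spans, words, substring_prob):
    """Score a candidate tree by summing probability-weighted information
    over the substrings covered by its constituent spans. substring_prob
    maps a substring to its estimated probability of expressing the
    sentence's semantics (e.g. its frequency among paraphrases)."""
    score = 0.0
    for i, j in tree_spans:          # constituent span over words[i:j]
        s = " ".join(words[i:j])
        p = substring_prob.get(s, 0.0)
        if p > 0:
            score += p * -log(p)     # weight information content by probability
    return score
```

In the paper's setup, training would then prefer (via a TreeCRF over PCFG parse candidates) the trees that maximize such a score rather than sentence log-likelihood alone.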
Related papers
- Structural Entropy Guided Probabilistic Coding [52.01765333755793]
We propose a novel structural entropy-guided probabilistic coding model, named SEPC.
We incorporate the relationship between latent variables into the optimization by proposing a structural entropy regularization loss.
Experimental results across 12 natural language understanding tasks, including both classification and regression tasks, demonstrate the superior performance of SEPC.
arXiv Detail & Related papers (2024-12-12T00:37:53Z)
- Measuring Grammatical Diversity from Small Corpora: Derivational Entropy Rates, Mean Length of Utterances, and Annotation Invariance [0.0]
I show that a grammar's derivational entropy and the mean length of the utterances it generates are fundamentally linked.
I demonstrate that MLU is not a mere proxy, but a fundamental measure of syntactic diversity.
The derivational entropy rate indexes the rate at which different grammatical annotation frameworks determine the grammatical complexity of treebanks.
arXiv Detail & Related papers (2024-12-08T22:54:57Z)
- Towards a theory of how the structure of language is acquired by deep neural networks [6.363756171493383]
We use a tree-like generative model that captures many of the hierarchical structures found in natural languages.
We show that token-token correlations can be used to build a representation of the grammar's hidden variables.
We conjecture that the relationship between training set size and effective range of correlations holds beyond our synthetic datasets.
arXiv Detail & Related papers (2024-05-28T17:01:22Z)
- Empirical Sufficiency Lower Bounds for Language Modeling with Locally-Bootstrapped Semantic Structures [4.29295838853865]
We design a concise binary vector representation of semantic structure at the lexical level.
We evaluate in-depth how good an incremental tagger needs to be in order to achieve better-than-baseline performance.
arXiv Detail & Related papers (2023-05-30T10:09:48Z)
- CPTAM: Constituency Parse Tree Aggregation Method [6.011216641982612]
This paper adopts the truth discovery idea to aggregate constituency parse trees from different parsers.
We formulate the constituency parse tree aggregation problem in two steps, structure aggregation and constituent label aggregation.
Experiments are conducted on benchmark datasets in different languages and domains.
arXiv Detail & Related papers (2022-01-19T23:05:37Z)
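A minimal sketch of the structure-aggregation step described in the entry above, assuming unweighted majority voting over constituent spans with a non-crossing constraint; CPTAM's truth-discovery formulation additionally estimates per-parser reliability, and constituent label aggregation is omitted here.

```python
from collections import Counter

def spans(tree, start=0, out=None):
    """Collect constituent spans (i, j) of a nested-list tree; leaves are strings."""
    if out is None:
        out = []
    if isinstance(tree, str):
        return start + 1, out
    end = start
    for child in tree:
        end, out = spans(child, end, out)
    out.append((start, end))
    return end, out

def crosses(a, b):
    """Two spans cross if they overlap without one containing the other."""
    return a[0] < b[0] < a[1] < b[1] or b[0] < a[0] < b[1] < a[1]

def aggregate_spans(trees):
    """Structure aggregation by majority vote: keep the most-voted spans
    that do not cross any span already kept (uniform parser reliability)."""
    votes = Counter(s for t in trees for s in spans(t)[1])
    kept = []
    for span, _ in votes.most_common():
        if all(not crosses(span, k) for k in kept):
            kept.append(span)
    return sorted(kept)

trees = [["the", ["old", "man"]], ["the", ["old", "man"]], [["the", "old"], "man"]]
print(aggregate_spans(trees))   # [(0, 3), (1, 3)]
```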
- Contextualized Semantic Distance between Highly Overlapped Texts [85.1541170468617]
Overlapping frequently occurs in paired texts in natural language processing tasks like text editing and semantic similarity evaluation.
This paper aims to address the issue with a mask-and-predict strategy.
We take the words in the longest common subsequence as neighboring words and use masked language modeling (MLM) to predict the distributions on their positions.
Experiments on Semantic Textual Similarity show NDD (neighboring distribution divergence) to be more sensitive to various semantic differences, especially on highly overlapped paired texts.
arXiv Detail & Related papers (2021-10-04T03:59:15Z)
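A hedged sketch of the mask-and-predict idea in the entry above, using an off-the-shelf masked LM; the model choice (bert-base-uncased), the word-level masking, and the KL direction are assumptions rather than the paper's exact NDD definition.

```python
import torch
from difflib import SequenceMatcher
from transformers import AutoModelForMaskedLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

def masked_distributions(words, positions):
    """Mask each given word position in turn and return the MLM's
    predicted distribution at the masked slot."""
    dists = []
    for p in positions:
        masked = words[:p] + [tok.mask_token] + words[p + 1:]
        inputs = tok(" ".join(masked), return_tensors="pt")
        mask_idx = (inputs.input_ids[0] == tok.mask_token_id).nonzero(as_tuple=True)[0][0]
        with torch.no_grad():
            logits = mlm(**inputs).logits[0, mask_idx]
        dists.append(torch.softmax(logits, dim=-1))
    return dists

def ndd(text_a, text_b):
    """Toy neighboring-distribution divergence: KL between the MLM's
    predictions at aligned positions of the longest common subsequence."""
    a, b = text_a.split(), text_b.split()
    blocks = SequenceMatcher(a=a, b=b).get_matching_blocks()
    pos_a = [m.a + k for m in blocks for k in range(m.size)]
    pos_b = [m.b + k for m in blocks for k in range(m.size)]
    div = 0.0
    for da, db in zip(masked_distributions(a, pos_a), masked_distributions(b, pos_b)):
        div += torch.sum(da * (torch.log(da) - torch.log(db))).item()
    return div
```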
- Discrete representations in neural models of spoken language [56.29049879393466]
We compare the merits of four commonly used metrics in the context of weakly supervised models of spoken language.
We find that the different evaluation metrics can give inconsistent results.
arXiv Detail & Related papers (2021-05-12T11:02:02Z)
- Introducing Syntactic Structures into Target Opinion Word Extraction with Deep Learning [89.64620296557177]
We propose to incorporate the syntactic structures of the sentences into the deep learning models for targeted opinion word extraction.
We also introduce a novel regularization technique to improve the performance of the deep learning models.
The proposed model is extensively analyzed and achieves the state-of-the-art performance on four benchmark datasets.
arXiv Detail & Related papers (2020-10-26T07:13:17Z)
- Exploiting Syntactic Structure for Better Language Modeling: A Syntactic Distance Approach [78.77265671634454]
We make use of a multi-task objective: the models simultaneously predict words as well as ground-truth parse trees in a form called "syntactic distances".
Experimental results on the Penn Treebank and Chinese Treebank datasets show that when ground truth parse trees are provided as additional training signals, the model is able to achieve lower perplexity and induce trees with better quality.
arXiv Detail & Related papers (2020-05-12T15:35:00Z)
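A toy rendering of such a multi-task objective, assuming a pairwise hinge ranking loss on adjacent-word syntactic distances (a common formulation for syntactic distances; the paper's exact loss may differ).

```python
import torch
import torch.nn.functional as F

def multitask_loss(word_logits, targets, pred_dist, gold_dist, alpha=1.0):
    """Joint objective: next-word cross-entropy plus a pairwise ranking loss
    pulling predicted syntactic distances toward the gold ordering.
    word_logits: (seq, vocab); targets: (seq,); pred_dist/gold_dist: (seq-1,)
    distances between adjacent words, derived from the gold parse tree."""
    lm_loss = F.cross_entropy(word_logits, targets)
    gi, gj = gold_dist.unsqueeze(1), gold_dist.unsqueeze(0)
    pi, pj = pred_dist.unsqueeze(1), pred_dist.unsqueeze(0)
    sign = torch.sign(gi - gj)                  # gold ordering of each pair
    hinge = torch.relu(1.0 - sign * (pi - pj))  # penalize violated orderings
    rank_loss = hinge[sign != 0].mean()
    return lm_loss + alpha * rank_loss
```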
- Discrete Optimization for Unsupervised Sentence Summarization with Word-Level Extraction [31.648764677078837]
Automatic sentence summarization produces a shorter version of a sentence, while preserving its most important information.
We model these two aspects in an unsupervised objective function, consisting of language modeling and semantic similarity metrics.
Our proposed method achieves a new state of the art for unsupervised sentence summarization according to ROUGE scores.
arXiv Detail & Related papers (2020-05-04T19:01:55Z)
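A minimal sketch of the objective in the entry above, paired with hill-climbing word-level extraction; lm_logprob, embed, and cos are hypothetical helpers standing in for any language model and sentence encoder, and the search schedule is an assumption.

```python
def score(candidate, source, lm_logprob, embed, cos, alpha=0.5):
    """Unsupervised objective: fluency (length-normalized LM log-probability)
    plus semantic similarity to the source sentence."""
    fluency = lm_logprob(candidate) / max(len(candidate.split()), 1)
    return alpha * fluency + (1 - alpha) * cos(embed(candidate), embed(source))

def hill_climb_summary(source, lm_logprob, embed, cos, sweeps=5):
    """Word-level extraction as discrete search: toggle one word at a time,
    keeping a toggle only if it improves the joint objective."""
    words = source.split()
    keep = [True] * len(words)
    best = score(source, source, lm_logprob, embed, cos)
    for _ in range(sweeps):
        for i in range(len(words)):
            keep[i] = not keep[i]
            cand = " ".join(w for w, k in zip(words, keep) if k)
            s = score(cand, source, lm_logprob, embed, cos) if cand else float("-inf")
            if s > best:
                best = s
            else:
                keep[i] = not keep[i]   # revert: no improvement
    return " ".join(w for w, k in zip(words, keep) if k)
```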