PCFGs Can Do Better: Inducing Probabilistic Context-Free Grammars with
Many Symbols
- URL: http://arxiv.org/abs/2104.13727v1
- Date: Wed, 28 Apr 2021 12:25:27 GMT
- Title: PCFGs Can Do Better: Inducing Probabilistic Context-Free Grammars with
Many Symbols
- Authors: Songlin Yang, Yanpeng Zhao, Kewei Tu
- Abstract summary: We present a new parameterization form of PCFGs based on tensor decomposition.
We use neural parameterization for the new form to improve unsupervised parsing performance.
We evaluate our model across ten languages and empirically demonstrate the effectiveness of using more symbols.
- Score: 22.728124473130876
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Probabilistic context-free grammars (PCFGs) with neural parameterization have
been shown to be effective in unsupervised phrase-structure grammar induction.
However, due to the cubic computational complexity of PCFG representation and
parsing, previous approaches cannot scale up to a relatively large number of
(nonterminal and preterminal) symbols. In this work, we present a new
parameterization form of PCFGs based on tensor decomposition, which has at most
quadratic computational complexity in the symbol number and therefore allows us
to use a much larger number of symbols. We further use neural parameterization
for the new form to improve unsupervised parsing performance. We evaluate our
model across ten languages and empirically demonstrate the effectiveness of
using more symbols. Our code: https://github.com/sustcsonglin/TN-PCFG
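The complexity claim in the abstract can be illustrated with a small NumPy sketch. It assumes a rank-`r` CP (canonical polyadic) decomposition of the binary-rule tensor, `T[A, B, C] = sum_r U[A, r] * V[B, r] * W[C, r]`, and compares one inside-algorithm update computed from the full tensor with the same update computed from the factor matrices alone. The names `random_cp_factors`, `inside_step_full`, and `inside_step_factored` are hypothetical, a single symbol set is used for simplicity, and no neural parameterization is involved; this is not the code from the linked repository.

```python
import numpy as np

def random_cp_factors(num_symbols, rank, seed=0):
    """Random factors U, V, W whose product is a valid rule-probability tensor (illustrative only)."""
    rng = np.random.default_rng(seed)
    U = rng.random((num_symbols, rank))
    V = rng.random((num_symbols, rank))
    W = rng.random((num_symbols, rank))
    U /= U.sum(axis=1, keepdims=True)  # each parent: distribution over rank components
    V /= V.sum(axis=0, keepdims=True)  # each component: distribution over left children
    W /= W.sum(axis=0, keepdims=True)  # each component: distribution over right children
    return U, V, W

def inside_step_full(T, left, right):
    """One inside update with the explicit tensor: cubic in the number of symbols."""
    return np.einsum("abc,b,c->a", T, left, right)

def inside_step_factored(U, V, W, left, right):
    """The same update from the factors only: O(num_symbols * rank), T is never built."""
    return U @ ((V.T @ left) * (W.T @ right))

# Assumed sizes, chosen only so the full tensor still fits in memory for the check.
num_symbols, rank = 50, 20
U, V, W = random_cp_factors(num_symbols, rank)
T = np.einsum("ar,br,cr->abc", U, V, W)  # built here only to verify equivalence

rng = np.random.default_rng(1)
left = rng.random(num_symbols)   # stand-in inside scores of a left child span
right = rng.random(num_symbols)  # stand-in inside scores of a right child span

full = inside_step_full(T, left, right)
factored = inside_step_factored(U, V, W, left, right)
print(np.allclose(full, factored))
```

With the normalization chosen here, `T[A, :, :]` sums to 1 for every parent `A`, so the factors define a proper distribution over binary rules. Because the factored update never materializes `T`, its cost grows with `num_symbols * rank` rather than `num_symbols ** 3`, which is what makes a larger symbol set affordable in this sketch.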
Related papers
- Training Neural Networks as Recognizers of Formal Languages [87.06906286950438]
Formal language theory pertains specifically to recognizers.
It is common to instead use proxy tasks that are similar in only an informal sense.
We correct this mismatch by training and evaluating neural networks directly as binary classifiers of strings.
arXiv Detail & Related papers (2024-11-11T16:33:25Z)
- On the Representational Capacity of Neural Language Models with Chain-of-Thought Reasoning [87.73401758641089]
Chain-of-thought (CoT) reasoning has improved the performance of modern language models (LMs).
We show that LMs can represent the same family of distributions over strings as probabilistic Turing machines.
arXiv Detail & Related papers (2024-06-20T10:59:02Z)
- Simple Hardware-Efficient PCFGs with Independent Left and Right Productions [77.12660133995362]
This work introduces SimplePCFG, a simple PCFG formalism with independent left and right productions.
As an unsupervised algorithm, our simple PCFG obtains an average F1 of 65.1 on the English PTB, and as a language model, it obtains a perplexity of 119.0, outperforming similarly-sized low-rank PCFGs.
arXiv Detail & Related papers (2023-10-23T14:48:51Z)
- Unsupervised Discontinuous Constituency Parsing with Mildly Context-Sensitive Grammars [14.256041558454786]
We study grammar induction with mildly context-sensitive grammars for unsupervised discontinuous parsing.
Our approach fixes the rule structure in advance and focuses on parameter learning with maximum likelihood.
Experiments on German and Dutch show that our approach is able to induce linguistically meaningful trees with continuous and discontinuous structures.
arXiv Detail & Related papers (2022-12-18T18:10:45Z)
- A Neural Model for Regular Grammar Induction [8.873449722727026]
We treat grammars as a model of computation and propose a novel neural approach to induction of regular grammars from positive and negative examples.
Our model is fully explainable, its intermediate results are directly interpretable as partial parses, and it can be used to learn arbitrary regular grammars when provided with sufficient data.
arXiv Detail & Related papers (2022-09-23T14:53:23Z)
- Statistically Meaningful Approximation: a Case Study on Approximating Turing Machines with Transformers [50.85524803885483]
This work proposes a formal definition of statistically meaningful (SM) approximation which requires the approximating network to exhibit good statistical learnability.
We study SM approximation for two function classes: circuits and Turing machines.
arXiv Detail & Related papers (2021-07-28T04:28:55Z)
- The Limitations of Limited Context for Constituency Parsing [27.271792317099045]
The Parsing-Reading-Predict architecture of Shen et al. (2018a) was the first to perform unsupervised syntactic parsing.
What kind of syntactic structure can current neural approaches to syntax represent?
We ground this question in the sandbox of probabilistic context-free grammars (PCFGs).
We identify a key aspect of the representational power of these approaches: the amount and directionality of context that the predictor has access to.
arXiv Detail & Related papers (2021-06-03T03:58:35Z)
- CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation [12.005340904206697]
CANINE is a neural encoder that operates directly on character sequences without explicit tokenization or vocabulary.
CANINE outperforms a comparable mBERT model by >= 1 F1 on TyDi QA, a challenging multilingual benchmark.
arXiv Detail & Related papers (2021-03-11T18:57:44Z)
- The Return of Lexical Dependencies: Neural Lexicalized PCFGs [103.41187595153652]
We present novel neural models of lexicalized PCFGs which allow us to overcome sparsity problems.
Experiments demonstrate that this unified framework yields stronger results on both representations than those achieved when modeling either formalism alone.
arXiv Detail & Related papers (2020-07-29T22:12:49Z)
- Bootstrapping Techniques for Polysynthetic Morphological Analysis [9.655349059913888]
We offer linguistically-informed approaches for bootstrapping a neural morphological analyzer.
We generate data from a finite state transducer to train an encoder-decoder model.
We improve the model by "hallucinating" missing linguistic structure into the training data, and by resampling from a Zipf distribution to simulate a more natural distribution of morphemes.
arXiv Detail & Related papers (2020-05-03T00:35:19Z)
- Multi-Step Inference for Reasoning Over Paragraphs [95.91527524872832]
Complex reasoning over text requires understanding and chaining together free-form predicates and logical connectives.
We present a compositional model reminiscent of neural module networks that can perform chained logical reasoning.
arXiv Detail & Related papers (2020-04-06T21:12:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.