The Limitations of Limited Context for Constituency Parsing
- URL: http://arxiv.org/abs/2106.01580v1
- Date: Thu, 3 Jun 2021 03:58:35 GMT
- Title: The Limitations of Limited Context for Constituency Parsing
- Authors: Yuchen Li, Andrej Risteski
- Abstract summary: The Parsing-Reading-Predict architecture of (Shen et al., 2018a) was the first to successfully perform unsupervised syntactic parsing.
What kind of syntactic structure can current neural approaches to syntax represent?
We ground this question in the sandbox of probabilistic context-free grammars (PCFGs).
We identify a key aspect of the representational power of these approaches: the amount and directionality of context that the predictor has access to.
- Score: 27.271792317099045
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Incorporating syntax into neural approaches in NLP has a multitude of
practical and scientific benefits. For instance, a language model that is
syntax-aware is likely to be able to produce better samples; even a
discriminative model like BERT with a syntax module could be used for core NLP
tasks like unsupervised syntactic parsing. Rapid progress in recent years was
arguably spurred on by the empirical success of the Parsing-Reading-Predict
architecture of (Shen et al., 2018a), later simplified by the Ordered Neurons LSTM
of (Shen et al., 2019). Most notably, this is the first time neural approaches
were able to successfully perform unsupervised syntactic parsing (evaluated by
various metrics like F1 score).
However, even heuristic (much less fully mathematical) understanding of why
and when these architectures work is lagging severely behind. In this work, we
answer representational questions raised by the architectures in (Shen et al.,
2018a, 2019), as well as some transition-based syntax-aware language models
(Dyer et al., 2016): what kind of syntactic structure can current neural
approaches to syntax represent? Concretely, we ground this question in the
sandbox of probabilistic context-free grammars (PCFGs), and identify a key
aspect of the representational power of these approaches: the amount and
directionality of context that the predictor has access to when forced to make
parsing decisions. We show that with limited context (either bounded or
unidirectional), there are PCFGs for which these approaches cannot represent
the max-likelihood parse; conversely, if the context is unlimited, they can
represent the max-likelihood parse of any PCFG.
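To make the object of study concrete, below is a minimal sketch of the PCFG sandbox and of the max-likelihood (Viterbi) parse that the results concern. The toy grammar, sentence, and CKY implementation are illustrative assumptions, not a construction from the paper.

```python
# Minimal sketch (not the paper's construction): a toy PCFG in Chomsky normal
# form and a Viterbi CKY pass that recovers its max-likelihood parse, i.e. the
# object a syntax-aware predictor must be able to represent.
from collections import defaultdict

# Binary rules: (left child, right child) -> list of (parent, rule probability)
binary_rules = {
    ("NP", "VP"): [("S", 1.0)],
    ("Det", "N"): [("NP", 0.7)],
    ("V", "NP"): [("VP", 1.0)],
}
# Lexical rules: word -> list of (parent, rule probability)
lexical_rules = {
    "the": [("Det", 1.0)],
    "dog": [("N", 0.5), ("NP", 0.3)],
    "saw": [("V", 1.0)],
    "cat": [("N", 0.5)],
}

def viterbi_cky(words):
    """Return (probability, backpointer) of the best S spanning the sentence."""
    n = len(words)
    # chart[i][j] maps a nonterminal to (best probability, backpointer)
    chart = [[defaultdict(lambda: (0.0, None)) for _ in range(n + 1)]
             for _ in range(n)]
    for i, w in enumerate(words):
        for nt, p in lexical_rules.get(w, []):
            chart[i][i + 1][nt] = (p, w)
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for left, (lp, _) in chart[i][k].items():
                    for right, (rp, _) in chart[k][j].items():
                        for parent, rule_p in binary_rules.get((left, right), []):
                            p = lp * rp * rule_p
                            if p > chart[i][j][parent][0]:
                                chart[i][j][parent] = (p, (k, left, right))
    return chart[0][n]["S"]

print(viterbi_cky("the dog saw the cat".split()))
# -> (0.1225, (2, 'NP', 'VP')): the max-likelihood parse splits after "the dog".
```

With unlimited bidirectional context, a predictor can in principle recover exactly the split points and labels this chart computation selects; the paper's negative results exhibit PCFGs where bounded or unidirectional context makes some of these decisions impossible to represent.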
Related papers
- Training Neural Networks as Recognizers of Formal Languages [87.06906286950438]
Formal language theory characterizes languages in terms of recognizers.
It is common, however, to train and evaluate neural networks on proxy tasks that are similar only in an informal sense.
We correct this mismatch by training and evaluating neural networks directly as binary classifiers of strings.
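As a rough illustration of that setup, here is a minimal sketch assuming PyTorch; the toy language a^n b^n, the sampler, and the small LSTM are hypothetical choices for illustration, not the benchmark or models from the cited paper.

```python
# Minimal sketch (assumes PyTorch is available; the toy language a^n b^n, the
# sampler, and the small LSTM are illustrative choices, not the benchmark or
# models from the cited paper): a network trained directly as a binary
# recognizer that labels each string as a member or non-member of the language.
import random
import torch
import torch.nn as nn

VOCAB = {"a": 0, "b": 1}

def sample(max_n=8):
    """Draw a (string, label) pair; label is 1.0 iff the string is in a^n b^n."""
    n = random.randint(1, max_n)
    if random.random() < 0.5:
        return "a" * n + "b" * n, 1.0                       # positive example
    s = "".join(random.choice("ab") for _ in range(2 * n))  # random negative
    return (s, 1.0 if s == "a" * n + "b" * n else 0.0)

class Recognizer(nn.Module):
    def __init__(self, dim=16):
        super().__init__()
        self.emb = nn.Embedding(len(VOCAB), dim)
        self.rnn = nn.LSTM(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, 1)

    def forward(self, ids):            # ids: (batch=1, length)
        h, _ = self.rnn(self.emb(ids))
        return self.out(h[:, -1])      # membership score from the final state

model = Recognizer()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()
for step in range(2000):
    s, y = sample()
    ids = torch.tensor([[VOCAB[c] for c in s]])
    loss = loss_fn(model(ids).squeeze(), torch.tensor(y))
    opt.zero_grad()
    loss.backward()
    opt.step()
```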
arXiv Detail & Related papers (2024-11-11T16:33:25Z) - A Truly Joint Neural Architecture for Segmentation and Parsing [15.866519123942457]
Parsing performance on Morphologically Rich Languages (MRLs) is lower than on other languages.
Due to high morphological complexity and ambiguity of the space-delimited input tokens, the linguistic units that act as nodes in the tree are not known in advance.
We introduce a joint neural architecture where a lattice-based representation preserving all morphological ambiguity of the input is provided to an arc-factored model, which then solves the morphological and syntactic parsing tasks at once.
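For intuition, the sketch below shows one way a morphological lattice can be represented as a data structure; the token, analyses, and enumeration code are hypothetical and simplified, not the representation or model from the cited paper.

```python
# Minimal sketch (hypothetical token and analyses, not the cited architecture):
# a morphological lattice over one space-delimited token, stored as edges
# between character positions. Every candidate segmentation is kept, so a
# downstream arc-factored parser can pick the path that best fits the tree.
lattice = {
    # (start, end): candidate (surface form, part of speech) analyses of that span
    (0, 3): [("bcl", "NOUN")],   # the whole token as a single noun
    (0, 1): [("b", "PREP")],     # a prepositional prefix
    (1, 3): [("cl", "NOUN")],    # the remaining stem as a noun
}

def segmentations(lattice, start, end, path=()):
    """Enumerate every path of lattice edges from start to end."""
    if start == end:
        yield path
        return
    for (i, j), analyses in lattice.items():
        if i == start:
            for analysis in analyses:
                yield from segmentations(lattice, j, end, path + (analysis,))

for seg in segmentations(lattice, 0, 3):
    print(seg)
# -> (('bcl', 'NOUN'),) and (('b', 'PREP'), ('cl', 'NOUN'))
# A joint model defers the choice between such paths until parsing time.
```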
arXiv Detail & Related papers (2024-02-04T16:56:08Z) - Probabilistic Transformer: A Probabilistic Dependency Model for
Contextual Word Representation [52.270712965271656]
We propose a new model of contextual word representation, not from a neural perspective, but from a purely syntactic and probabilistic perspective.
We find that the graph of our model resembles transformers, with correspondences between dependencies and self-attention.
Experiments show that our model performs competitively to transformers on small to medium sized datasets.
arXiv Detail & Related papers (2023-11-26T06:56:02Z) - LINC: A Neurosymbolic Approach for Logical Reasoning by Combining
Language Models with First-Order Logic Provers [60.009969929857704]
Logical reasoning is an important task for artificial intelligence with potential impacts on science, mathematics, and society.
In this work, we reformulate such tasks as modular neurosymbolic programming, which we call LINC.
We observe significant performance gains on FOLIO and a balanced subset of ProofWriter for three different models in nearly all experimental conditions we evaluate.
arXiv Detail & Related papers (2023-10-23T17:58:40Z) - Unsupervised Chunking with Hierarchical RNN [62.15060807493364]
This paper introduces an unsupervised approach to chunking, a syntactic task that involves grouping words in a non-hierarchical manner.
We present a two-layer Hierarchical Recurrent Neural Network (HRNN) designed to model word-to-chunk and chunk-to-sentence compositions.
Experiments on the CoNLL-2000 dataset reveal a notable improvement over existing unsupervised methods, enhancing phrase F1 score by up to 6 percentage points.
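For readers unfamiliar with the task, the following minimal sketch shows what a flat chunking looks like when encoded with the BIO tags used by CoNLL-2000; the sentence and chunk spans are hypothetical, and the snippet is not the cited HRNN.

```python
# Minimal sketch (hypothetical sentence and chunks, not the HRNN from the cited
# paper): chunking groups words into flat, non-overlapping phrases. Here the
# grouping is encoded with BIO tags, the format used by CoNLL-2000.
sentence = ["the", "quick", "fox", "jumped", "over", "the", "lazy", "dog"]
chunks = [(0, 3, "NP"), (3, 4, "VP"), (4, 5, "PP"), (5, 8, "NP")]  # (start, end, label)

def to_bio(words, chunks):
    tags = ["O"] * len(words)
    for start, end, label in chunks:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return list(zip(words, tags))

for word, tag in to_bio(sentence, chunks):
    print(f"{word}\t{tag}")
# Unlike a constituency tree, the output is a single flat layer of phrases;
# the cited HRNN learns this word-to-chunk grouping without supervision.
```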
arXiv Detail & Related papers (2023-09-10T02:55:12Z) - Laziness Is a Virtue When It Comes to Compositionality in Neural
Semantic Parsing [20.856601758389544]
We introduce a neural semantic parsing generation method that constructs logical forms from the bottom up, beginning from the logical form's leaves.
We show that our novel bottom-up semantic parsing technique outperforms general-purpose semantic parsers while also being competitive with comparable neural parsers.
arXiv Detail & Related papers (2023-05-07T17:53:08Z) - Structural generalization is hard for sequence-to-sequence models [85.0087839979613]
Sequence-to-sequence (seq2seq) models have been successful across many NLP tasks.
Recent work on compositional generalization has shown that seq2seq models achieve very low accuracy in generalizing to linguistic structures that were not seen in training.
arXiv Detail & Related papers (2022-10-24T09:03:03Z) - Infusing Finetuning with Semantic Dependencies [62.37697048781823]
We show that, unlike syntax, semantics is not brought to the surface by today's pretrained models.
We then use convolutional graph encoders to explicitly incorporate semantic parses into task-specific finetuning.
arXiv Detail & Related papers (2020-12-10T01:27:24Z) - Generating Synthetic Data for Task-Oriented Semantic Parsing with
Hierarchical Representations [0.8203855808943658]
In this work, we explore the possibility of generating synthetic data for neural semantic parsing.
Specifically, we first extract masked templates from the existing labeled utterances, and then fine-tune BART to generate synthetic utterances conditioned on these templates.
We show the potential of our approach when evaluating on the Facebook TOP dataset for the navigation domain.
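As a rough illustration of the template-extraction step, here is a minimal sketch that assumes a simplified, flat slot notation; the utterance, slot format, and masking function are hypothetical and do not reproduce the cited pipeline.

```python
# Minimal sketch (hypothetical utterance and a simplified, flat slot notation;
# not the cited pipeline or the exact TOP annotation scheme): extracting a
# masked template from a labeled utterance, which a seq2seq model such as BART
# could then be fine-tuned to re-fill with synthetic slot values.
import re

labeled = "drive to [SL:DESTINATION the airport] avoiding [SL:PATH tolls]"

def mask_template(utterance):
    # Replace the surface words inside each slot with a mask token,
    # keeping the slot label so generation stays structure-aware.
    return re.sub(r"\[SL:(\w+) [^\]]+\]", r"[SL:\1 <mask>]", utterance)

print(mask_template(labeled))
# -> "drive to [SL:DESTINATION <mask>] avoiding [SL:PATH <mask>]"
```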
arXiv Detail & Related papers (2020-11-03T22:55:40Z) - Discontinuous Constituent Parsing with Pointer Networks [0.34376560669160383]
Discontinuous constituent trees are crucial for representing all grammatical phenomena of languages such as German.
Recent advances in dependency parsing have shown that Pointer Networks excel in efficiently parsing syntactic relations between words in a sentence.
We propose a novel neural network architecture that is able to generate the most accurate discontinuous constituent representations.
arXiv Detail & Related papers (2020-02-05T15:12:03Z)