Related papers: A Conditional Splitting Framework for Efficient Constituency Parsing

A Conditional Splitting Framework for Efficient Constituency Parsing

URL: http://arxiv.org/abs/2106.15760v1
Date: Wed, 30 Jun 2021 00:36:34 GMT
Title: A Conditional Splitting Framework for Efficient Constituency Parsing
Authors: Thanh-Tung Nguyen, Xuan-Phi Nguyen, Shafiq Joty, Xiaoli Li
Abstract summary: We introduce a generic seq2seq parsing framework that casts constituency parsing problems (syntactic and discourse parsing) into a series of conditional splitting decisions. Our parsing model estimates the conditional probability distribution of possible splitting points in a given text span and supports efficient top-down decoding. For discourse analysis we show that in our formulation, discourse segmentation can be framed as a special case of parsing.
Score: 14.548146390081778
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We introduce a generic seq2seq parsing framework that casts constituency parsing problems (syntactic and discourse parsing) into a series of conditional splitting decisions. Our parsing model estimates the conditional probability distribution of possible splitting points in a given text span and supports efficient top-down decoding, which is linear in number of nodes. The conditional splitting formulation together with efficient beam search inference facilitate structural consistency without relying on expensive structured inference. Crucially, for discourse analysis we show that in our formulation, discourse segmentation can be framed as a special case of parsing which allows us to perform discourse parsing without requiring segmentation as a pre-requisite. Experiments show that our model achieves good results on the standard syntactic parsing tasks under settings with/without pre-trained representations and rivals state-of-the-art (SoTA) methods that are more computationally expensive than ours. In discourse parsing, our method outperforms SoTA by a good margin.

Related papers

Structured Tree Alignment for Evaluation of (Speech) Constituency Parsing [43.758912958903494]
We present the structured average intersection-over-union ratio (STRUCT-IOU), a similarity metric between constituency parse trees motivated by the problem of evaluating speechs. To compute the metric, we project the ground-truth parse tree to the speech domain by forced alignment, align the projected ground-truth constituents with the predicted ones under certain structured constraints, and calculate the average IOU score across all aligned constituent pairs.
arXiv Detail & Related papers (2024-02-21T00:01:17Z)
Linear-Time Modeling of Linguistic Structure: An Order-Theoretic Perspective [97.57162770792182]
Tasks that model the relation between pairs of tokens in a string are a vital part of understanding natural language. We show that these exhaustive comparisons can be avoided, and, moreover, the complexity can be reduced to linear by casting the relation between tokens as a partial order over the string. Our method predicts real numbers for each token in a string in parallel and sorts the tokens accordingly, resulting in total orders of the tokens in the string.
arXiv Detail & Related papers (2023-05-24T11:47:35Z)
Retrieve-and-Fill for Scenario-based Task-Oriented Semantic Parsing [110.4684789199555]
We introduce scenario-based semantic parsing: a variant of the original task which first requires disambiguating an utterance's "scenario" This formulation enables us to isolate coarse-grained and fine-grained aspects of the task, each of which we solve with off-the-shelf neural modules. Our model is modular, differentiable, interpretable, and allows us to garner extra supervision from scenarios.
arXiv Detail & Related papers (2022-02-02T08:00:21Z)
CPTAM: Constituency Parse Tree Aggregation Method [6.011216641982612]
This paper adopts the truth discovery idea to aggregate constituency parse trees from different distances. We formulate the constituency parse tree aggregation problem in two steps, structure aggregation and constituent label aggregation. Experiments are conducted on benchmark datasets in different languages and domains.
arXiv Detail & Related papers (2022-01-19T23:05:37Z)
Rethinking End-to-End Evaluation of Decomposable Tasks: A Case Study on Spoken Language Understanding [101.24748444126982]
Decomposable tasks are complex and comprise of a hierarchy of sub-tasks. Existing benchmarks, however, typically hold out examples for only the surface-level sub-task. We propose a framework to construct robust test sets using coordinate ascent over sub-task specific utility functions.
arXiv Detail & Related papers (2021-06-29T02:53:59Z)
Context-Preserving Text Simplification [11.830061911323025]
We present a context-preserving text simplification (TS) approach that splits and rephrases complex English sentences into a semantic hierarchy of simplified sentences. Using a set of linguistically principled transformation patterns, input sentences are converted into a hierarchical representation in the form of core sentences and accompanying contexts that are linked via rhetorical relations. A comparative analysis with the annotations contained in the RST-DT shows that we are able to capture the contextual hierarchy between the split sentences with a precision of 89% and reach an average precision of 69% for the classification of the rhetorical relations that hold between them.
arXiv Detail & Related papers (2021-05-24T09:54:56Z)
RST Parsing from Scratch [14.548146390081778]
We introduce a novel end-to-end formulation of document-level discourse parsing in the Rhetorical Structure Theory (RST) framework. Our framework facilitates discourse parsing from scratch without requiring discourse segmentation as a prerequisite. Our unified parsing model adopts a beam search to decode the best tree structure by searching through a space of high-scoring trees.
arXiv Detail & Related papers (2021-05-23T06:19:38Z)
Span-based Semantic Parsing for Compositional Generalization [53.24255235340056]
SpanBasedSP predicts a span tree over an input utterance, explicitly encoding how partial programs compose over spans in the input. On GeoQuery, SCAN and CLOSURE, SpanBasedSP performs similarly to strong seq2seq baselines on random splits, but dramatically improves performance compared to baselines on splits that require compositional generalization.
arXiv Detail & Related papers (2020-09-13T16:42:18Z)
A Simple Global Neural Discourse Parser [61.728994693410954]
We propose a simple chart-based neural discourse that does not require any manually-crafted features and is based on learned span representations only. We empirically demonstrate that our model achieves the best performance among globals, and comparable performance to state-of-art greedys.
arXiv Detail & Related papers (2020-09-02T19:28:40Z)
Extractive Summarization as Text Matching [123.09816729675838]
This paper creates a paradigm shift with regard to the way we build neural extractive summarization systems. We formulate the extractive summarization task as a semantic text matching problem. We have driven the state-of-the-art extractive result on CNN/DailyMail to a new level (44.41 in ROUGE-1)
arXiv Detail & Related papers (2020-04-19T08:27:57Z)

This list is automatically generated from the titles and abstracts of the papers in this site.