Prosodic features improve sentence segmentation and parsing
- URL: http://arxiv.org/abs/2302.12165v1
- Date: Thu, 23 Feb 2023 17:03:36 GMT
- Title: Prosodic features improve sentence segmentation and parsing
- Authors: Elizabeth Nielsen, Sharon Goldwater, Mark Steedman
- Abstract summary: We show the effect of prosody on parsing speech that isn't segmented into sentences.
We find prosody helps our model both with parsing and accurately identifying sentence boundaries.
- Score: 28.41406899452548
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Parsing spoken dialogue presents challenges that parsing text does not,
including a lack of clear sentence boundaries. We know from previous work that
prosody helps in parsing single sentences (Tran et al. 2018), but we want to
show the effect of prosody on parsing speech that isn't segmented into
sentences. In experiments on the English Switchboard corpus, we find prosody
helps our model both with parsing and with accurately identifying sentence
boundaries. However, we find that the best-performing parser is not necessarily
the parser that produces the best sentence segmentation performance. We suggest
that the best parses instead come from modelling sentence boundaries jointly
with other constituent boundaries.
Related papers
- What's Hard in English RST Parsing? Predictive Models for Error Analysis [16.927386793787463]
In this paper, we examine and model some of the factors associated with parsing difficulties in Rhetorical Structure Theory.
Our results show that as in shallow discourse parsing, the explicit/implicit distinction plays a role, but that long-distance dependencies are the main challenge.
Our final model is able to predict where errors will occur with an accuracy of 76.3% for the bottom-up parser and 76.6% for the top-down parser.
arXiv Detail & Related papers (2023-09-10T06:10:03Z)
- Where's the Point? Self-Supervised Multilingual Punctuation-Agnostic Sentence Segmentation [65.6736056006381]
We present a multilingual punctuation-agnostic sentence segmentation method covering 85 languages.
Our method outperforms all the prior best sentence-segmentation tools by an average of 6.1% F1 points.
By using our method to match sentence segmentation to the segmentation used during training of MT models, we achieve an average improvement of 2.3 BLEU points.
arXiv Detail & Related papers (2023-05-30T09:49:42Z)
- Cascading and Direct Approaches to Unsupervised Constituency Parsing on Spoken Sentences [67.37544997614646]
We present the first study on unsupervised spoken constituency parsing.
The goal is to determine the spoken sentences' hierarchical syntactic structure in the form of constituency parse trees.
We show that accurate segmentation alone may be sufficient to parse spoken sentences accurately.
arXiv Detail & Related papers (2023-03-15T17:57:22Z)
- Clustering and Network Analysis for the Embedding Spaces of Sentences and Sub-Sentences [69.3939291118954]
This paper reports research on a set of comprehensive clustering and network analyses targeting sentence and sub-sentence embedding spaces.
Results show that one method generates the most clusterable embeddings.
In general, the embeddings of span sub-sentences have better clustering properties than the original sentences.
arXiv Detail & Related papers (2021-10-02T00:47:35Z)
- A Conditional Splitting Framework for Efficient Constituency Parsing [14.548146390081778]
We introduce a generic seq2seq parsing framework that casts constituency parsing problems (syntactic and discourse parsing) into a series of conditional splitting decisions.
Our parsing model estimates the conditional probability distribution of possible splitting points in a given text span and supports efficient top-down decoding.
For discourse analysis we show that in our formulation, discourse segmentation can be framed as a special case of parsing.
arXiv Detail & Related papers (2021-06-30T00:36:34Z)
- Prosodic segmentation for parsing spoken dialogue [29.68201160277817]
Parsing spoken dialogue poses unique difficulties, including disfluencies and unmarked boundaries.
Previous work has shown that prosody can help with parsing disfluent speech.
We show that prosody can effectively replace gold standard SU boundaries.
arXiv Detail & Related papers (2021-05-26T16:30:16Z)
- Narrative Incoherence Detection [76.43894977558811]
We propose the task of narrative incoherence detection as a new arena for inter-sentential semantic understanding.
Given a multi-sentence narrative, decide whether there exist any semantic discrepancies in the narrative flow.
arXiv Detail & Related papers (2020-12-21T07:18:08Z)
- A Simple Global Neural Discourse Parser [61.728994693410954]
We propose a simple chart-based neural discourse parser that does not require any manually-crafted features and is based on learned span representations only.
We empirically demonstrate that our model achieves the best performance among global parsers, and comparable performance to state-of-the-art greedy parsers.
arXiv Detail & Related papers (2020-09-02T19:28:40Z)
- Toward Better Storylines with Sentence-Level Language Models [54.91921545103256]
We propose a sentence-level language model which selects the next sentence in a story from a finite set of fluent alternatives.
We demonstrate the effectiveness of our approach with state-of-the-art accuracy on the unsupervised Story Cloze task.
arXiv Detail & Related papers (2020-05-11T16:54:19Z)