Related papers: Llamipa: An Incremental Discourse Parser

URL: http://arxiv.org/abs/2406.18256v3
Date: Thu, 03 Oct 2024 15:48:31 GMT
Title: Llamipa: An Incremental Discourse Parser
Authors: Kate Thompson, Akshay Chaturvedi, Julie Hunter, Nicholas Asher
Abstract summary: This paper provides the first discourse parsing experiments with a large language model finetuned on corpora in the style of SDRT. The resulting parser can process discourse data incrementally, which is essential for the eventual use of discourse information in downstream tasks.
Score: 6.9534924995446055
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This paper provides the first discourse parsing experiments with a large language model (LLM) finetuned on corpora annotated in the style of SDRT (Segmented Discourse Representation Theory; Asher, 1993; Asher and Lascarides, 2003). The result is a discourse parser, Llamipa (Llama Incremental Parser), that leverages discourse context, leading to substantial performance gains over approaches that use encoder-only models to provide local, context-sensitive representations of discourse units. Furthermore, it can process discourse data incrementally, which is essential for the eventual use of discourse information in downstream tasks.

Related papers

STAB: Speech Tokenizer Assessment Benchmark [57.45234921100835]
Representing speech as discrete tokens provides a framework for transforming speech into a format that closely resembles text. We present STAB (Speech Tokenizer Assessment Benchmark), a systematic evaluation framework designed to assess speech tokenizers comprehensively. We evaluate the STAB metrics and correlate them with downstream task performance across a range of speech tasks and tokenizer choices.
arXiv Detail & Related papers (2024-09-04T02:20:59Z)
DeSTA: Enhancing Speech Language Models through Descriptive Speech-Text Alignment [82.86363991170546]
We propose a Descriptive Speech-Text Alignment approach that leverages speech captioning to bridge the gap between speech and text modalities. Our model demonstrates superior performance on the Dynamic-SUPERB benchmark, particularly in generalizing to unseen tasks. These findings highlight the potential to reshape instruction-following SLMs by incorporating rich, descriptive speech captions.
arXiv Detail & Related papers (2024-06-27T03:52:35Z)
Growing Trees on Sounds: Assessing Strategies for End-to-End Dependency Parsing of Speech [8.550564152063522]
We report on a set of experiments assessing the performance of two parsing paradigms on speech parsing. We perform this evaluation on a large treebank of spoken French, featuring realistic spontaneous conversations. Our findings show that (i) the graph-based approach obtains better results across the board, and (ii) parsing directly from speech outperforms a pipeline approach, despite having 30% fewer parameters.
arXiv Detail & Related papers (2024-06-18T13:46:10Z)
Generative Context-aware Fine-tuning of Self-supervised Speech Models [54.389711404209415]
We study the use of context information generated by generative large language models (LLMs). We propose an approach to distill the generated information during fine-tuning of self-supervised speech models. We evaluate the proposed approach using the SLUE and Libri-light benchmarks for several downstream tasks: automatic speech recognition, named entity recognition, and sentiment analysis.
arXiv Detail & Related papers (2023-12-15T15:46:02Z)
Improving Speaker Diarization using Semantic Information: Joint Pairwise Constraints Propagation [53.01238689626378]
We propose a novel approach to leverage semantic information in speaker diarization systems. We introduce spoken language understanding modules to extract speaker-related semantic information. We present a novel framework to integrate these constraints into the speaker diarization pipeline.
arXiv Detail & Related papers (2023-09-19T09:13:30Z)
Understanding Shared Speech-Text Representations [34.45772613231558]
Maestro has developed approaches to train speech models by incorporating text into end-to-end models. We find that a corpus-specific duration model for speech-text alignment is the most important component for learning a shared speech-text representation. We find that the shared encoder learns a more compact and overlapping speech-text representation than the uni-modal encoders.
arXiv Detail & Related papers (2023-04-27T20:05:36Z)
Towards Domain-Independent Supervised Discourse Parsing Through Gradient Boosting [30.615883375573432]
We present a new, supervised paradigm directly tackling the domain adaptation issue in discourse parsing. Specifically, we introduce the first fully supervised discourse framework designed to alleviate the domain dependency through a staged model of weak gradient classifiers.
arXiv Detail & Related papers (2022-10-18T03:44:27Z)
VQMIVC: Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-shot Voice Conversion [54.29557210925752]
One-shot voice conversion can be effectively achieved by speech representation disentanglement. We employ vector quantization (VQ) for content encoding and introduce mutual information (MI) as the correlation metric during training. Experimental results reflect the superiority of the proposed method in learning effective disentangled speech representations.
arXiv Detail & Related papers (2021-06-18T13:50:38Z)
Unleashing the Power of Neural Discourse Parsers -- A Context and Structure Aware Approach Using Large Scale Pretraining [26.517219486173598]
RST-based discourse parsing is an important NLP task with numerous downstream applications, such as summarization, machine translation and opinion mining. In this paper, we demonstrate a simple, yet highly accurate discourse parser, incorporating recent contextual language models. Our parser establishes the new state-of-the-art (SOTA) performance for predicting structure and nuclearity on two key RST datasets, RST-DT and Instr-DT.
arXiv Detail & Related papers (2020-11-06T06:11:26Z)
Bridging the Modality Gap for Speech-to-Text Translation [57.47099674461832]
End-to-end speech translation aims to translate speech in one language into text in another language in an end-to-end manner. Most existing methods employ an encoder-decoder structure with a single encoder to learn acoustic representation and semantic information simultaneously. We propose a Speech-to-Text Adaptation for Speech Translation model which aims to improve the end-to-end model performance by bridging the modality gap between speech and text.
arXiv Detail & Related papers (2020-10-28T12:33:04Z)
An Effective Contextual Language Modeling Framework for Speech Summarization with Augmented Features [13.97006782398121]
The Bidirectional Encoder Representations from Transformers (BERT) model has achieved record-breaking success on many natural language processing tasks. We explore the incorporation of confidence scores into sentence representations to see if such an attempt could help alleviate the negative effects caused by imperfect automatic speech recognition. We validate the effectiveness of our proposed method on a benchmark dataset.
arXiv Detail & Related papers (2020-06-01T18:27:48Z)

This list is automatically generated from the titles and abstracts of the papers on this site.