Unifying Discourse Resources with Dependency Framework
- URL: http://arxiv.org/abs/2101.00167v2
- Date: Tue, 19 Jan 2021 02:00:39 GMT
- Title: Unifying Discourse Resources with Dependency Framework
- Authors: Yi Cheng, Sujian Li, Yueyuan Li
- Abstract summary: We unify Chinese discourse corpora annotated under different schemes with a discourse dependency framework.
We implement several benchmark dependency parsers and investigate how they can leverage the unified data to improve performance.
- Score: 18.498060350460463
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: For text-level discourse analysis, there are various discourse schemes but
relatively few labeled data, because discourse research is still immature and
it is labor-intensive to annotate the inner logic of a text. In this paper, we
attempt to unify multiple Chinese discourse corpora under different annotation
schemes with a discourse dependency framework by designing semi-automatic methods
to convert them into dependency structures. We also implement several benchmark
dependency parsers and investigate how they can leverage the unified data to
improve performance.
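As a hedged illustration of one common way to derive discourse dependencies from constituency-style annotations, the sketch below converts an RST-style tree into (head, dependent, relation) triples by percolating a head EDU up each subtree, with the leftmost nucleus child supplying the head. The node fields and percolation rule are illustrative assumptions, not the paper's exact semi-automatic procedure.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    nuclearity: str = "Nucleus"   # "Nucleus" or "Satellite"
    relation: str = "span"        # relation this node bears to its parent
    edu_id: int = -1              # set on leaf nodes (EDUs) only
    children: list = field(default_factory=list)

def to_dependencies(node, deps=None):
    """Return the head EDU id of `node`, appending
    (head, dependent, relation) triples to `deps`."""
    if deps is None:
        deps = []
    if not node.children:              # leaf: an elementary discourse unit
        return node.edu_id, deps
    heads = [to_dependencies(child, deps)[0] for child in node.children]
    nucleus_ids = [i for i, c in enumerate(node.children)
                   if c.nuclearity == "Nucleus"]
    head = heads[nucleus_ids[0]] if nucleus_ids else heads[0]
    for i, child in enumerate(node.children):
        if heads[i] != head:
            deps.append((head, heads[i], child.relation))
    return head, deps

# Toy tree: EDU 1 elaborated by EDU 2; that pair contrasted with EDU 3.
tree = Node(children=[
    Node(children=[Node(edu_id=1),
                   Node(nuclearity="Satellite", relation="Elaboration",
                        edu_id=2)]),
    Node(nuclearity="Satellite", relation="Contrast", edu_id=3),
])
print(to_dependencies(tree)[1])  # [(1, 2, 'Elaboration'), (1, 3, 'Contrast')]
```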
Related papers
- A Novel Dependency Framework for Enhancing Discourse Data Analysis [27.152245569974678]
This study focuses primarily on converting PDTB annotations into dependency structures.
It employs refined BERT-based discourse parsers to test the validity of the dependency data derived from the PDTB-style corpora in English, Chinese, and several other languages.
The results show that the PDTB dependency data is valid and that there is a strong correlation between the two types of dependency distance.
arXiv Detail & Related papers (2024-07-17T10:55:00Z)
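The sketch below is a toy illustration (all data invented) of the kind of correlation check reported above: compute a per-document mean dependency distance under two annotation schemes and correlate the two series. The paper's actual measures and corpora are not reproduced here.

```python
from scipy.stats import pearsonr

def mean_dependency_distance(heads):
    """Mean absolute surface distance between each unit and its head;
    `heads` maps a unit's position to its head's position."""
    return sum(abs(d - h) for d, h in heads.items()) / len(heads)

# One head map per document, under two annotation styles (toy values).
scheme_a = [{1: 0, 2: 1, 3: 1}, {1: 0, 2: 0, 3: 2}, {1: 0, 2: 1, 3: 2},
            {1: 0, 2: 0, 3: 0}, {1: 0, 2: 1, 3: 1, 4: 3}]
scheme_b = [{1: 0, 2: 0, 3: 1}, {1: 0, 2: 0, 3: 2}, {1: 0, 2: 1, 3: 2},
            {1: 0, 2: 0, 3: 1}, {1: 0, 2: 1, 3: 2, 4: 3}]

mdd_a = [mean_dependency_distance(doc) for doc in scheme_a]
mdd_b = [mean_dependency_distance(doc) for doc in scheme_b]
r, p = pearsonr(mdd_a, mdd_b)
print(f"Pearson r = {r:.3f}, p = {p:.3f}")
```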
- Cross-domain Chinese Sentence Pattern Parsing [67.1381983012038]
Sentence Pattern Structure (SPS) parsing is a syntactic analysis method primarily employed in language teaching.
Existing SPS parsers rely heavily on textbook corpora for training, lacking cross-domain capability.
This paper proposes an innovative approach leveraging large language models (LLMs) within a self-training framework.
arXiv Detail & Related papers (2024-02-26T05:30:48Z)
- Disco-Bench: A Discourse-Aware Evaluation Benchmark for Language Modelling [70.23876429382969]
We propose a benchmark that can evaluate intra-sentence discourse properties across a diverse set of NLP tasks.
Disco-Bench consists of 9 document-level testsets in the literature domain, which contain rich discourse phenomena.
For linguistic analysis, we also design a diagnostic test suite that can examine whether the target models learn discourse knowledge.
arXiv Detail & Related papers (2023-07-16T15:18:25Z)
- Revisiting Conversation Discourse for Dialogue Disentanglement [88.3386821205896]
We propose enhancing dialogue disentanglement by taking full advantage of the dialogue discourse characteristics.
We develop a structure-aware framework to integrate the rich structural features and better model the conversational semantic context.
Our work has great potential to facilitate broader multi-party multi-thread dialogue applications.
arXiv Detail & Related papers (2023-06-06T19:17:47Z)
- Pre-training Multi-party Dialogue Models with Latent Discourse Inference [85.9683181507206]
We pre-train a model that understands the discourse structure of multi-party dialogues, namely, to whom each utterance is replying.
To fully utilize the unlabeled data, we propose to treat the discourse structures as latent variables, then jointly infer them and pre-train the discourse-aware model.
arXiv Detail & Related papers (2023-05-24T14:06:27Z)
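As a loose sketch of the latent-structure idea in the entry above (not the paper's actual joint inference and pre-training procedure), the module below scores, for every utterance, which earlier utterance it replies to and returns a posterior over these latent parents; such soft links could then weight a discourse-aware pre-training loss. The bilinear scorer and all names are assumptions.

```python
import torch
import torch.nn as nn

class LatentReplyToInference(nn.Module):
    """Posterior over which earlier utterance each utterance replies to."""
    def __init__(self, d_model: int):
        super().__init__()
        self.score = nn.Bilinear(d_model, d_model, 1)

    def forward(self, utt_reprs: torch.Tensor) -> torch.Tensor:
        # utt_reprs: (n_utts, d_model), one pooled vector per utterance.
        n = utt_reprs.size(0)
        logits = torch.full((n, n), float("-inf"))
        for i in range(1, n):        # utterance 0 opens the dialogue
            for j in range(i):       # candidate parents must precede i
                s = self.score(utt_reprs[i:i + 1], utt_reprs[j:j + 1])
                logits[i, j] = s.squeeze()
        # Rows 1..n-1: a distribution over each utterance's latent parent.
        return torch.softmax(logits[1:], dim=-1)
```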
- A Pilot Study on Dialogue-Level Dependency Parsing for Chinese [21.698966896156087]
We develop a high-quality human-annotated corpus, which contains 850 dialogues and 199,803 dependencies.
Considering that such tasks suffer from high annotation costs, we investigate zero-shot and few-shot scenarios.
Based on an existing syntactic treebank, we adopt a signal-based method to transform seen syntactic dependencies into unseen ones.
arXiv Detail & Related papers (2023-05-21T12:20:13Z)
- Generic Dependency Modeling for Multi-Party Conversation [32.25605889407403]
We present an approach to encoding the dependencies in the form of relative dependency encoding (ReDE).
We show how to implement it in Transformers by modifying the computation of self-attention.
arXiv Detail & Related papers (2023-02-21T13:58:19Z)
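The abstract does not spell out ReDE's formulation, so the following is only a plausible sketch of dependency-biased self-attention: a learned scalar bias, indexed by a bucketed dependency relation between positions, is added to the attention logits. The bucketing scheme, module names, and hyperparameters are assumptions.

```python
import math
import torch
import torch.nn as nn

class DependencyBiasedSelfAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int, n_dep_buckets: int = 8):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads, self.d_head = n_heads, d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)
        # One learned scalar bias per (dependency bucket, head).
        self.dep_bias = nn.Embedding(n_dep_buckets, n_heads)

    def forward(self, x: torch.Tensor, dep_ids: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model); dep_ids: (batch, seq, seq) long tensor
        # of bucketed dependency relations between positions i and j.
        b, t, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q = q.view(b, t, self.n_heads, self.d_head).transpose(1, 2)
        k = k.view(b, t, self.n_heads, self.d_head).transpose(1, 2)
        v = v.view(b, t, self.n_heads, self.d_head).transpose(1, 2)
        logits = q @ k.transpose(-2, -1) / math.sqrt(self.d_head)
        bias = self.dep_bias(dep_ids).permute(0, 3, 1, 2)  # (b, heads, t, t)
        attn = torch.softmax(logits + bias, dim=-1)
        y = (attn @ v).transpose(1, 2).reshape(b, t, -1)
        return self.out(y)
```

Biasing the logits rather than the values keeps this a drop-in change to standard multi-head attention, in line with other relative-encoding schemes.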
- Improve Discourse Dependency Parsing with Contextualized Representations [28.916249926065273]
We propose to take advantage of transformers to encode contextualized representations of units of different levels.
Motivated by the observation of writing patterns commonly shared across articles, we propose a novel method that treats discourse relation identification as a sequence labelling task.
arXiv Detail & Related papers (2022-05-04T14:35:38Z)
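Read as a sequence-labelling task, discourse relation identification assigns one relation tag per discourse unit (e.g., the relation linking it to its head). The sketch below is a minimal tagger over precomputed unit vectors; the paper's transformer encoders over units of different levels are richer than this BiLSTM stand-in, and all names here are assumptions.

```python
import torch
import torch.nn as nn

class RelationSequenceLabeller(nn.Module):
    def __init__(self, d_unit: int, n_relations: int, d_hidden: int = 256):
        super().__init__()
        self.encoder = nn.LSTM(d_unit, d_hidden, batch_first=True,
                               bidirectional=True)
        self.classifier = nn.Linear(2 * d_hidden, n_relations)

    def forward(self, unit_reprs: torch.Tensor) -> torch.Tensor:
        # unit_reprs: (batch, n_units, d_unit) contextualised vectors,
        # e.g. pooled transformer outputs per elementary discourse unit.
        h, _ = self.encoder(unit_reprs)
        return self.classifier(h)   # (batch, n_units, n_relations) logits
```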
- Improving Multi-Party Dialogue Discourse Parsing via Domain Integration [25.805553277418813]
Multi-party conversations are implicitly organized by semantic-level correlations across the interactive turns.
Dialogue discourse analysis can be applied to predict the dependency structure and relations between the elementary discourse units.
Existing corpora with dialogue discourse annotation are collected from specific domains with limited sample sizes.
arXiv Detail & Related papers (2021-10-09T09:36:22Z)
- Dependency Induction Through the Lens of Visual Perception [81.91502968815746]
We propose an unsupervised grammar induction model that leverages word concreteness and a structural vision-based heuristic to jointly learn constituency-structure and dependency-structure grammars.
Our experiments show that the proposed extension outperforms the current state-of-the-art visually grounded models in constituency parsing even with a smaller grammar size.
arXiv Detail & Related papers (2021-09-20T18:40:37Z)
- Reasoning in Dialog: Improving Response Generation by Context Reading Comprehension [49.92173751203827]
In multi-turn dialog, utterances do not always take the full form of sentences.
We propose to improve the response generation performance by examining the model's ability to answer a reading comprehension question.
arXiv Detail & Related papers (2020-12-14T10:58:01Z)