A Pilot Study on Dialogue-Level Dependency Parsing for Chinese
- URL: http://arxiv.org/abs/2305.12441v2
- Date: Thu, 1 Jun 2023 03:03:43 GMT
- Title: A Pilot Study on Dialogue-Level Dependency Parsing for Chinese
- Authors: Gongyao Jiang, Shuang Liu, Meishan Zhang, Min Zhang
- Abstract summary: We develop a high-quality human-annotated corpus, which contains 850 dialogues and 199,803 dependencies.
Considering that such tasks suffer from high annotation costs, we investigate zero-shot and few-shot scenarios.
Based on an existing syntactic treebank, we adopt a signal-based method to transform seen syntactic dependencies into unseen ones.
- Score: 21.698966896156087
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dialogue-level dependency parsing has received insufficient attention,
especially for Chinese. To this end, we draw on ideas from syntactic dependency
and rhetorical structure theory (RST), developing a high-quality
human-annotated corpus, which contains 850 dialogues and 199,803 dependencies.
Considering that such tasks suffer from high annotation costs, we investigate
zero-shot and few-shot scenarios. Based on an existing syntactic treebank, we
adopt a signal-based method to transform seen syntactic dependencies into
unseen ones between elementary discourse units (EDUs), where the signals are
detected by masked language modeling. Besides, we apply single-view and
multi-view data selection to obtain reliable pseudo-labeled instances.
Experimental results show the effectiveness of these baselines. Moreover, we
discuss several crucial points about our dataset and approach.
Related papers
- A Hybrid Approach To Aspect Based Sentiment Analysis Using Transfer Learning [3.30307212568497]
We propose a hybrid approach for Aspect Based Sentiment Analysis using transfer learning.
The approach focuses on generating weakly-supervised annotations by exploiting the strengths of both large language models (LLMs) and traditional syntactic dependencies.
arXiv Detail & Related papers (2024-03-25T23:02:33Z)
- Learning Disentangled Speech Representations [0.412484724941528]
SynSpeech is a novel large-scale synthetic speech dataset designed to enable research on disentangled speech representations.
We present a framework to evaluate disentangled representation learning techniques, applying both linear probing and established supervised disentanglement metrics.
We find that SynSpeech facilitates benchmarking across a range of factors, achieving promising disentanglement of simpler features like gender and speaking style, while highlighting challenges in isolating complex attributes like speaker identity.
arXiv Detail & Related papers (2023-11-04T04:54:17Z)
- Disco-Bench: A Discourse-Aware Evaluation Benchmark for Language Modelling [70.23876429382969]
We propose a benchmark that can evaluate inter-sentence discourse properties across a diverse set of NLP tasks.
Disco-Bench consists of 9 document-level testsets in the literature domain, which contain rich discourse phenomena.
For linguistic analysis, we also design a diagnostic test suite that can examine whether the target models learn discourse knowledge.
arXiv Detail & Related papers (2023-07-16T15:18:25Z)
- Pre-training Multi-party Dialogue Models with Latent Discourse Inference [85.9683181507206]
We pre-train a model that understands the discourse structure of multi-party dialogues, namely, to whom each utterance is replying.
To fully utilize the unlabeled data, we propose to treat the discourse structures as latent variables, then jointly infer them and pre-train the discourse-aware model.
arXiv Detail & Related papers (2023-05-24T14:06:27Z)
- Unsupervised Dialogue Topic Segmentation with Topic-aware Utterance Representation [51.22712675266523]
Dialogue Topic Segmentation (DTS) plays an essential role in a variety of dialogue modeling tasks.
We propose a novel unsupervised DTS framework, which learns topic-aware utterance representations from unlabeled dialogue data.
arXiv Detail & Related papers (2023-05-04T11:35:23Z)
- Towards Transparent Interactive Semantic Parsing via Step-by-Step Correction [17.000283696243564]
We investigate an interactive semantic parsing framework that explains the predicted logical form step by step in natural language.
We focus on question answering over knowledge bases (KBQA) as an instantiation of our framework.
Our experiments show that the interactive framework with human feedback has the potential to greatly improve overall parse accuracy.
arXiv Detail & Related papers (2021-10-15T20:11:22Z)
- Unifying Discourse Resources with Dependency Framework [18.498060350460463]
We unify Chinese discourse corpora annotated under different schemes within a discourse dependency framework.
We implement several benchmark dependency parsers and investigate how they can leverage the unified data to improve performance.
arXiv Detail & Related papers (2021-01-01T05:23:29Z)
- I like fish, especially dolphins: Addressing Contradictions in Dialogue Modeling [104.09033240889106]
We introduce the DialoguE COntradiction DEtection task (DECODE) and a new conversational dataset containing both human-human and human-bot contradictory dialogues.
We then compare a structured utterance-based approach of using pre-trained Transformer models for contradiction detection with the typical unstructured approach.
arXiv Detail & Related papers (2020-12-24T18:47:49Z)
- Multilingual Irony Detection with Dependency Syntax and Neural Models [61.32653485523036]
It focuses on the contribution from syntactic knowledge, exploiting linguistic resources where syntax is annotated according to the Universal Dependencies scheme.
The results suggest that fine-grained dependency-based syntactic information is informative for the detection of irony.
arXiv Detail & Related papers (2020-11-11T11:22:05Z)
- Probing Task-Oriented Dialogue Representation from Language Models [106.02947285212132]
This paper investigates pre-trained language models to find out which model intrinsically carries the most informative representation for task-oriented dialogue tasks.
We fine-tune a feed-forward layer as the classifier probe on top of a fixed pre-trained language model with annotated labels in a supervised way.
arXiv Detail & Related papers (2020-10-26T21:34:39Z)