Topic-driven Distant Supervision Framework for Macro-level Discourse
Parsing
- URL: http://arxiv.org/abs/2305.13755v1
- Date: Tue, 23 May 2023 07:13:51 GMT
- Title: Topic-driven Distant Supervision Framework for Macro-level Discourse
Parsing
- Authors: Feng Jiang, Longwang He, Peifeng Li, Qiaoming Zhu, Haizhou Li
- Abstract summary: Discourse parsing, the task of analyzing the internal rhetorical structure of texts, is a challenging problem in natural language processing.
Despite the recent advances in neural models, the lack of large-scale, high-quality corpora for training remains a major obstacle.
Recent studies have attempted to overcome this limitation by using distant supervision.
- Score: 72.14449502499535
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Discourse parsing, the task of analyzing the internal rhetorical structure of
texts, is a challenging problem in natural language processing. Despite the
recent advances in neural models, the lack of large-scale, high-quality corpora
for training remains a major obstacle. Recent studies have attempted to
overcome this limitation by using distant supervision, which utilizes results
from other NLP tasks (e.g., sentiment polarity, attention matrix, and
segmentation probability) to parse discourse trees. However, these methods do
not take into account the differences between in-domain and out-of-domain
tasks, resulting in lower performance and inability to leverage the
high-quality in-domain data for further improvement. To address these issues,
we propose a distant supervision framework that leverages the relations between
topic structure and rhetorical structure. Specifically, we propose two
distantly supervised methods, based on transfer learning and the
teacher-student model, that narrow the gap between in-domain and out-of-domain
tasks through label mapping and oracle annotation. Experimental results on the
MCDTB and RST-DT datasets show that our methods achieve the best performance in
both distant-supervised and supervised scenarios.
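To make the label-mapping and teacher-student ideas above concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation: an out-of-domain topic segmenter acts as the teacher, its boundary labels are mapped into coarse macro-level rhetorical splits, and the resulting oracle annotations would then supervise a student discourse parser. All identifiers (Span, map_topic_to_rhetorical, teacher_predict, oracle_annotate) are hypothetical placeholders, and the teacher here is a trivial rule-based stand-in.

```python
# Minimal sketch (not the authors' code) of the two ideas in the abstract:
# (1) label mapping: project topic-boundary labels onto coarse rhetorical-split
#     labels so an out-of-domain topic segmenter can supervise a discourse parser;
# (2) teacher-student: a teacher produces "oracle" pseudo-annotations for
#     unlabeled in-domain text that a student parser is trained to imitate.
# All names below are hypothetical placeholders, not identifiers from the paper
# or from the MCDTB/RST-DT toolchains.

from dataclasses import dataclass
from typing import List


@dataclass
class Span:
    """A contiguous block of discourse units, given as unit indices."""
    start: int  # inclusive
    end: int    # exclusive


def map_topic_to_rhetorical(topic_boundaries: List[int], n_units: int) -> List[Span]:
    """Label mapping: treat each topic boundary as a high-level rhetorical split,
    yielding the top-level spans of a (flat) macro discourse structure."""
    cuts = sorted(b for b in topic_boundaries if 0 < b < n_units)
    edges = [0] + cuts + [n_units]
    return [Span(s, e) for s, e in zip(edges, edges[1:])]


def teacher_predict(units: List[str]) -> List[int]:
    """Stand-in teacher: an out-of-domain topic segmenter.
    Here it naively opens a new topic before any unit starting with a shift cue."""
    return [i for i, u in enumerate(units)
            if u.lower().startswith(("however", "meanwhile"))]


def oracle_annotate(units: List[str]) -> List[Span]:
    """Produce oracle (pseudo) macro-level annotations for unlabeled in-domain text
    by running the teacher and mapping its labels into the parser's label space."""
    boundaries = teacher_predict(units)
    return map_topic_to_rhetorical(boundaries, len(units))


if __name__ == "__main__":
    paragraphs = [
        "The company reported record revenue this quarter.",
        "Profits grew in every region.",
        "However, regulators opened a new investigation.",
        "Meanwhile, a rival announced a competing product.",
    ]
    pseudo_spans = oracle_annotate(paragraphs)
    # A student discourse parser would be trained on (paragraphs, pseudo_spans)
    # and later fine-tuned on the small amount of gold in-domain data.
    for span in pseudo_spans:
        print(span.start, span.end, paragraphs[span.start:span.end])
```

In the actual framework the teacher would be a trained topic-structure model and the student a full macro-level discourse parser fine-tuned on the gold MCDTB or RST-DT data; the sketch only shows where the mapped labels and oracle annotations would enter such a pipeline.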
Related papers
- Sequential Visual and Semantic Consistency for Semi-supervised Text
Recognition [56.968108142307976]
Scene text recognition (STR) is a challenging task that requires large-scale annotated data for training.
Most existing STR methods resort to synthetic data, which may introduce domain discrepancy and degrade the performance of STR models.
This paper proposes a novel semi-supervised learning method for STR that incorporates word-level consistency regularization from both visual and semantic aspects.
arXiv Detail & Related papers (2024-02-24T13:00:54Z)
- Unsupervised Chunking with Hierarchical RNN [62.15060807493364]
This paper introduces an unsupervised approach to chunking, a syntactic task that involves grouping words in a non-hierarchical manner.
We present a two-layer Hierarchical Recurrent Neural Network (HRNN) designed to model word-to-chunk and chunk-to-sentence compositions.
Experiments on the CoNLL-2000 dataset reveal a notable improvement over existing unsupervised methods, enhancing phrase F1 score by up to 6 percentage points.
arXiv Detail & Related papers (2023-09-10T02:55:12Z)
- Syntax-Guided Domain Adaptation for Aspect-based Sentiment Analysis [23.883810236153757]
Domain adaptation is a popular solution to alleviate the data deficiency issue in new domains by transferring common knowledge across domains.
We propose a novel Syntax-guided Domain Adaptation Model, named SDAM, for more effective cross-domain ABSA.
Our model consistently outperforms the state-of-the-art baselines with respect to Micro-F1 metric for the cross-domain End2End ABSA task.
arXiv Detail & Related papers (2022-11-10T10:09:33Z)
- Towards Domain-Independent Supervised Discourse Parsing Through Gradient
Boosting [30.615883375573432]
We present a new, supervised paradigm directly tackling the domain adaptation issue in discourse parsing.
Specifically, we introduce the first fully supervised discourse framework designed to alleviate the domain dependency through a staged model of weak gradient classifiers.
arXiv Detail & Related papers (2022-10-18T03:44:27Z)
- Feature Representation Learning for Unsupervised Cross-domain Image
Retrieval [73.3152060987961]
Current supervised cross-domain image retrieval methods can achieve excellent performance.
However, the cost of data collection and labeling imposes an intractable barrier to practical deployment in real applications.
We introduce a new cluster-wise contrastive learning mechanism to help extract class semantic-aware features.
arXiv Detail & Related papers (2022-07-20T07:52:14Z)
- Predicting Above-Sentence Discourse Structure using Distant Supervision
from Topic Segmentation [8.688675709130289]
RST-style discourse parsing plays a vital role in many NLP tasks.
Despite its importance, one of the most prevalent limitations in modern-day discourse parsing is the lack of large-scale datasets.
arXiv Detail & Related papers (2021-12-12T10:16:45Z)
- Semi-supervised Domain Adaptive Structure Learning [72.01544419893628]
Semi-supervised domain adaptation (SSDA) is a challenging problem requiring methods to overcome both 1) overfitting towards poorly annotated data and 2) distribution shift across domains.
We introduce an adaptive structure learning method to regularize the cooperation of SSL and DA.
arXiv Detail & Related papers (2021-12-12T06:11:16Z)
- Cross-domain Imitation from Observations [50.669343548588294]
Imitation learning seeks to circumvent the difficulty in designing proper reward functions for training agents by utilizing expert behavior.
In this paper, we study the problem of how to imitate tasks when there are discrepancies between the expert and agent MDPs.
We present a novel framework to learn correspondences across such domains.
arXiv Detail & Related papers (2021-05-20T21:08:25Z)
- Learning a Domain-Agnostic Visual Representation for Autonomous Driving
via Contrastive Loss [25.798361683744684]
Domain-Agnostic Contrastive Learning (DACL) is a two-stage unsupervised domain adaptation framework with cyclic adversarial training and contrastive loss.
Our proposed approach achieves better performance in the monocular depth estimation task compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-10T07:06:03Z)
- Coupling Distant Annotation and Adversarial Training for Cross-Domain
Chinese Word Segmentation [40.27961925319402]
This paper proposes to couple distant annotation and adversarial training for cross-domain Chinese word segmentation.
For distant annotation, we design an automatic distant annotation mechanism that does not need any supervision or pre-defined dictionaries from the target domain.
For adversarial training, we develop a sentence-level training procedure to perform noise reduction and maximum utilization of the source domain information.
arXiv Detail & Related papers (2020-07-16T08:54:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.