Combining Self-Training and Self-Supervised Learning for Unsupervised
Disfluency Detection
- URL: http://arxiv.org/abs/2010.15360v1
- Date: Thu, 29 Oct 2020 05:29:26 GMT
- Title: Combining Self-Training and Self-Supervised Learning for Unsupervised
Disfluency Detection
- Authors: Shaolei Wang, Zhongyuan Wang, Wanxiang Che, Ting Liu
- Abstract summary: In this work, we explore the unsupervised learning paradigm which can potentially work with unlabeled text corpora.
Our model builds upon the recent work on Noisy Student Training, a semi-supervised learning approach that extends the idea of self-training.
- Score: 80.68446022994492
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most existing approaches to disfluency detection heavily rely on
human-annotated corpora, which are expensive to obtain in practice. There have
been several proposals to alleviate this issue with, for instance,
self-supervised learning techniques, but they still require human-annotated
corpora. In this work, we explore the unsupervised learning paradigm which can
potentially work with unlabeled text corpora that are cheaper and easier to
obtain. Our model builds upon the recent work on Noisy Student Training, a
semi-supervised learning approach that extends the idea of self-training.
Experimental results on the commonly used English Switchboard test set show
that our approach achieves competitive performance compared to the previous
state-of-the-art supervised systems using contextualized word embeddings (e.g.
BERT and ELECTRA).
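The abstract describes an iterative teacher-student loop but gives no code. Below is a minimal, hypothetical Python sketch of Noisy Student Training applied to disfluency detection framed as token-level sequence labeling; the callables (`train`, `predict`, `perturb`), the confidence filter, and the self-supervised seed set are illustrative assumptions, not details taken from the paper.

```python
from typing import Callable, List, Tuple

Sentence = List[str]  # tokenized sentence
Labels = List[int]    # per-token tags: 1 = disfluent, 0 = fluent

def noisy_student(
    train: Callable[[List[Tuple[Sentence, Labels]]], object],
    predict: Callable[[object, Sentence], Tuple[Labels, float]],
    perturb: Callable[[Sentence], Sentence],   # input noise, e.g. token dropout
    seed_data: List[Tuple[Sentence, Labels]],  # e.g. synthetic self-supervised pairs
    unlabeled: List[Sentence],
    rounds: int = 3,
    min_conf: float = 0.9,
) -> object:
    """Noisy Student loop: a teacher pseudo-labels unlabeled text, a student
    is retrained on noised inputs, and the student becomes the next teacher."""
    teacher = train(seed_data)  # bootstrap the first teacher
    for _ in range(rounds):
        pseudo = []
        for sent in unlabeled:
            labels, conf = predict(teacher, sent)       # teacher pseudo-labels
            if conf >= min_conf:                        # keep confident sentences
                pseudo.append((perturb(sent), labels))  # noise the input side
        teacher = train(seed_data + pseudo)  # student is promoted to teacher
    return teacher
```

In Noisy Student Training the student is typically at least as large as the teacher and is further noised via dropout and data augmentation during training; `perturb` stands in for that noise here.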
Related papers
- Mean BERTs make erratic language teachers: the effectiveness of latent
bootstrapping in low-resource settings [5.121744234312891]
Latent bootstrapping is an alternative self-supervision technique for pretraining language models.
We conduct experiments to assess how effective this approach is for acquiring linguistic knowledge from limited resources.
arXiv Detail & Related papers (2023-10-30T10:31:32Z)
- Semi-supervised learning made simple with self-supervised clustering [65.98152950607707]
Self-supervised learning models have been shown to learn rich visual representations without requiring human annotations.
We propose a conceptually simple yet empirically powerful approach to turn clustering-based self-supervised methods into semi-supervised learners.
arXiv Detail & Related papers (2023-06-13T01:09:18Z)
- Weakly-supervised HOI Detection via Prior-guided Bi-level Representation Learning [66.00600682711995]
Human object interaction (HOI) detection plays a crucial role in human-centric scene understanding and serves as a fundamental building-block for many vision tasks.
One generalizable and scalable strategy for HOI detection is to use weak supervision, learning from image-level annotations only.
This is inherently challenging due to ambiguous human-object associations, large search space of detecting HOIs and highly noisy training signal.
We develop a CLIP-guided HOI representation capable of incorporating the prior knowledge at both image level and HOI instance level, and adopt a self-taught mechanism to prune incorrect human-object associations.
arXiv Detail & Related papers (2023-03-02T14:41:31Z)
- Generative or Contrastive? Phrase Reconstruction for Better Sentence Representation Learning [86.01683892956144]
We propose a novel generative self-supervised learning objective based on phrase reconstruction.
Our generative learning may yield powerful enough sentence representation and achieve performance in Sentence Textual Similarity tasks on par with contrastive learning.
arXiv Detail & Related papers (2022-04-20T10:00:46Z)
- Adversarial Contrastive Self-Supervised Learning [13.534367890379853]
We present a novel self-supervised deep learning paradigm based on online hard negative pair mining.
We derive a new triplet-like loss considering both positive sample pairs and mined hard negative sample pairs (a minimal sketch follows this entry).
arXiv Detail & Related papers (2022-02-26T05:57:45Z)
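The entry above only names the loss; as a rough illustration, here is a minimal PyTorch sketch of a triplet-like objective with online hard negative mining over a batch of paired embeddings. The function name, the cosine normalization, and the margin value are assumptions for illustration and may differ from the paper's actual formulation.

```python
import torch
import torch.nn.functional as F

def triplet_like_loss(anchors: torch.Tensor,
                      positives: torch.Tensor,
                      margin: float = 0.5) -> torch.Tensor:
    """anchors/positives: (B, D) embeddings of two views of the same batch.
    Row i of `positives` is the positive for row i of `anchors`; every
    other row is a candidate negative."""
    a = F.normalize(anchors, dim=1)
    p = F.normalize(positives, dim=1)
    sim = a @ p.t()                    # (B, B) cosine similarity matrix
    pos = sim.diag()                   # similarity of each positive pair
    # Online hard negative mining: for each anchor, take the most similar
    # non-matching sample in the batch as its hardest negative.
    eye = torch.eye(sim.size(0), dtype=torch.bool, device=sim.device)
    hardest_neg = sim.masked_fill(eye, float('-inf')).max(dim=1).values
    # Triplet-style hinge: positives should beat hardest negatives by `margin`.
    return F.relu(hardest_neg - pos + margin).mean()
```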
- Co$^2$L: Contrastive Continual Learning [69.46643497220586]
Recent breakthroughs in self-supervised learning show that such algorithms learn visual representations that can be transferred better to unseen tasks.
We propose a rehearsal-based continual learning algorithm that focuses on continually learning and maintaining transferable representations.
arXiv Detail & Related papers (2021-06-28T06:14:38Z)
- Understand and Improve Contrastive Learning Methods for Visual Representation: A Review [1.4650545418986058]
A promising alternative, self-supervised learning, has gained popularity because of its potential to learn effective data representations without manual labeling.
This literature review aims to provide an up-to-date analysis of the efforts of researchers to understand the key components and the limitations of self-supervised learning.
arXiv Detail & Related papers (2021-06-06T21:59:49Z)
- Crowdsourcing Learning as Domain Adaptation: A Case Study on Named Entity Recognition [19.379850806513232]
We take a different point in this work, regarding all crowdsourced annotations as gold-standard with respect to the individual annotators.
We find that crowdsourcing could be highly similar to domain adaptation, and then the recent advances of cross-domain methods can be almost directly applied to crowdsourcing.
We investigate both unsupervised and supervised crowdsourcing learning, assuming that no or only small-scale expert annotations are available.
arXiv Detail & Related papers (2021-05-31T14:11:08Z)
- Can Semantic Labels Assist Self-Supervised Visual Representation Learning? [194.1681088693248]
We present a new algorithm named Supervised Contrastive Adjustment in Neighborhood (SCAN).
In a series of downstream tasks, SCAN achieves superior performance compared to previous fully-supervised and self-supervised methods.
Our study reveals that semantic labels are useful in assisting self-supervised methods, opening a new direction for the community.
arXiv Detail & Related papers (2020-11-17T13:25:00Z)
- Detecting Human-Object Interaction with Mixed Supervision [0.0]
Human object interaction (HOI) detection is an important task in image understanding and reasoning.
We propose a mixed-supervised HOI detection pipeline, enabled by a specific design of momentum-independent learning.
Our method is evaluated on the challenging HICO-DET dataset.
arXiv Detail & Related papers (2020-11-10T08:42:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.