TSDAE: Using Transformer-based Sequential Denoising Auto-Encoder for
Unsupervised Sentence Embedding Learning
- URL: http://arxiv.org/abs/2104.06979v1
- Date: Wed, 14 Apr 2021 17:02:18 GMT
- Title: TSDAE: Using Transformer-based Sequential Denoising Auto-Encoder for
Unsupervised Sentence Embedding Learning
- Authors: Kexin Wang, Nils Reimers, Iryna Gurevych
- Abstract summary: We present a new state-of-the-art unsupervised method based on pre-trained Transformers and a Sequential Denoising Auto-Encoder (TSDAE).
It can achieve up to 93.1% of the performance of in-domain supervised approaches.
- Score: 53.32740707197856
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Learning sentence embeddings often requires a large amount of labeled data.
However, for most tasks and domains, labeled data is seldom available and
creating it is expensive. In this work, we present a new state-of-the-art
unsupervised method based on pre-trained Transformers and Sequential Denoising
Auto-Encoder (TSDAE) which outperforms previous approaches by up to 6.4 points.
It can achieve up to 93.1% of the performance of in-domain supervised
approaches. Further, we show that TSDAE is a strong pre-training method for
learning sentence embeddings, significantly outperforming other approaches like
Masked Language Model.
A crucial shortcoming of previous studies is the narrow evaluation: Most work
mainly evaluates on the single task of Semantic Textual Similarity (STS), which
does not require any domain knowledge. It is unclear if these proposed methods
generalize to other domains and tasks. We fill this gap and evaluate TSDAE and
other recent approaches on four different datasets from heterogeneous domains.
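The following is a minimal sketch of how a TSDAE-style denoising objective can be set up with the sentence-transformers library: sentences are corrupted by token deletion, encoded into a fixed-size embedding, and a tied decoder is trained to reconstruct the original sentence. The checkpoint name, toy corpus, and hyperparameters below are illustrative assumptions rather than values taken from the paper.

```python
# Minimal TSDAE-style training sketch using the sentence-transformers library.
# The checkpoint, corpus, and hyperparameters are illustrative assumptions.
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, models, losses
from sentence_transformers.datasets import DenoisingAutoEncoderDataset

model_name = "bert-base-uncased"

# Encoder: pre-trained Transformer with CLS pooling as the sentence embedding.
word_embedding_model = models.Transformer(model_name)
pooling_model = models.Pooling(
    word_embedding_model.get_word_embedding_dimension(), pooling_mode="cls"
)
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

# Unlabeled, in-domain sentences; the dataset adds deletion noise to each
# sentence and keeps the original text as the reconstruction target.
train_sentences = ["A first unlabeled sentence.", "Another unlabeled sentence."]
train_dataset = DenoisingAutoEncoderDataset(train_sentences)
train_dataloader = DataLoader(train_dataset, batch_size=8, shuffle=True)

# Decoder reconstructs the original sentence from the fixed-size embedding;
# tying encoder and decoder weights keeps the model compact.
train_loss = losses.DenoisingAutoEncoderLoss(
    model, decoder_name_or_path=model_name, tie_encoder_decoder=True
)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=1,
    weight_decay=0,
    scheduler="constantlr",
    optimizer_params={"lr": 3e-5},
    show_progress_bar=True,
)

model.save("output/tsdae-model")
```

After training, model.encode() yields the sentence embeddings that would be evaluated on the downstream task.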
Related papers
- LE-UDA: Label-efficient unsupervised domain adaptation for medical image
segmentation [24.655779957716558]
We propose a novel and generic framework called "Label-Efficient Unsupervised Domain Adaptation" (LE-UDA).
In LE-UDA, we construct self-ensembling consistency for knowledge transfer between both domains, as well as a self-ensembling adversarial learning module to achieve better feature alignment for UDA.
Experimental results demonstrate that the proposed LE-UDA can efficiently leverage limited source labels to improve cross-domain segmentation performance, outperforming state-of-the-art UDA approaches in the literature.
arXiv Detail & Related papers (2022-12-05T07:47:35Z)
- Cluster-level pseudo-labelling for source-free cross-domain facial
expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER).
Our method exploits self-supervised pretraining to learn good feature representations from the target data.
We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z)
- Bi-level Alignment for Cross-Domain Crowd Counting [113.78303285148041]
Current methods rely on external data for training an auxiliary task or apply an expensive coarse-to-fine estimation.
We develop a new adversarial learning based method, which is simple and efficient to apply.
We evaluate our approach on five real-world crowd counting benchmarks, where we outperform existing approaches by a large margin.
arXiv Detail & Related papers (2022-05-12T02:23:25Z)
- On Generalizing Beyond Domains in Cross-Domain Continual Learning [91.56748415975683]
Deep neural networks often suffer from catastrophic forgetting of previously learned knowledge after learning a new task.
Our proposed approach learns new tasks under domain shift with accuracy boosts up to 10% on challenging datasets such as DomainNet and OfficeHome.
arXiv Detail & Related papers (2022-03-08T09:57:48Z)
- Data-efficient Weakly-supervised Learning for On-line Object Detection
under Domain Shift in Robotics [24.878465999976594]
Several object detection methods have been proposed in the literature, the vast majority based on Deep Convolutional Neural Networks (DCNNs).
These methods have important limitations for robotics: learning solely on off-line data may introduce biases and prevent adaptation to novel tasks.
In this work, we investigate how weakly-supervised learning can cope with these problems.
arXiv Detail & Related papers (2020-12-28T16:36:11Z)
- Flexible deep transfer learning by separate feature embeddings and
manifold alignment [0.0]
Object recognition is a key enabler across industry and defense.
Unfortunately, algorithms trained on existing labeled datasets do not directly generalize to new data because the data distributions do not match.
We propose a novel deep learning framework that overcomes this limitation by learning separate feature extractions for each domain.
arXiv Detail & Related papers (2020-12-22T19:24:44Z)
- Knowledge Distillation for BERT Unsupervised Domain Adaptation [2.969705152497174]
A pre-trained language model, BERT, has brought significant performance improvements across a range of natural language processing tasks.
We propose a simple but effective unsupervised domain adaptation method, adversarial adaptation with distillation (AAD).
We evaluate our approach in the task of cross-domain sentiment classification on 30 domain pairs.
arXiv Detail & Related papers (2020-10-22T06:51:24Z)
- Self-training Improves Pre-training for Natural Language Understanding [63.78927366363178]
We study self-training as another way to leverage unlabeled data through semi-supervised learning.
We introduce SentAugment, a data augmentation method which computes task-specific query embeddings from labeled data.
Our approach leads to scalable and effective self-training with improvements of up to 2.6% on standard text classification benchmarks.
arXiv Detail & Related papers (2020-10-05T17:52:25Z)
- A Review of Single-Source Deep Unsupervised Visual Domain Adaptation [81.07994783143533]
Large-scale labeled training datasets have enabled deep neural networks to excel across a wide range of benchmark vision tasks.
In many applications, it is prohibitively expensive and time-consuming to obtain large quantities of labeled data.
To cope with limited labeled training data, many have attempted to directly apply models trained on a large-scale labeled source domain to another sparsely labeled or unlabeled target domain.
arXiv Detail & Related papers (2020-09-01T00:06:50Z)