Weak Adaptation Learning -- Addressing Cross-domain Data Insufficiency
with Weak Annotator
- URL: http://arxiv.org/abs/2102.07358v1
- Date: Mon, 15 Feb 2021 06:19:25 GMT
- Title: Weak Adaptation Learning -- Addressing Cross-domain Data Insufficiency
with Weak Annotator
- Authors: Shichao Xu, Lixu Wang, Yixuan Wang, Qi Zhu
- Abstract summary: In some target problem domains, there are not many data samples available, which could hinder the learning process.
We propose a weak adaptation learning (WAL) approach that leverages unlabeled data from a similar source domain.
Our experiments demonstrate the effectiveness of our approach in learning an accurate classifier with limited labeled data in the target domain.
- Score: 2.8672054847109134
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data quantity and quality are crucial factors for data-driven learning
methods. In some target problem domains, there are not many data samples
available, which could significantly hinder the learning process. While data
from similar domains may be leveraged to help through domain adaptation,
obtaining high-quality labeled data for those source domains themselves could
be difficult or costly. To address such challenges of data insufficiency for
classification problems in a target domain, we propose a weak adaptation
learning (WAL) approach that leverages unlabeled data from a similar source
domain, a low-cost weak annotator that produces labels based on task-specific
heuristics, labeling rules, or other methods (albeit with inaccuracy), and a
small amount of labeled data in the target domain. Our approach first conducts
a theoretical analysis of the error bound of the trained classifier with
respect to the data quantity and the performance of the weak annotator, and
then introduces a multi-stage weak adaptation learning method to learn an
accurate classifier by lowering the error bound. Our experiments demonstrate
the effectiveness of our approach in learning an accurate classifier with
limited labeled data in the target domain and unlabeled data in the source
domain.
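
As a rough, concrete illustration of the setting described in the abstract, the sketch below stages training in two passes: first on source-domain samples labeled by a cheap, noisy weak annotator, then on the small labeled target set. The synthetic data, the threshold-based weak annotator, and the two-pass schedule are all illustrative assumptions; this is not the paper's WAL algorithm, which additionally uses the derived error bound to guide its multi-stage procedure.

```python
# Minimal two-stage sketch of the data regime: many weakly-labeled source samples,
# few labeled target samples. Illustrative only; not the WAL method from the paper.
# Requires scikit-learn >= 1.1 for loss="log_loss".
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)

# Synthetic source domain (unlabeled) and a shifted target domain with few labels.
X_source = rng.normal(loc=0.0, scale=1.0, size=(5000, 2))
X_target = rng.normal(loc=0.5, scale=1.2, size=(60, 2))
y_target = (X_target[:, 0] + X_target[:, 1] > 1.0).astype(int)   # true target rule

def weak_annotator(X):
    """Hypothetical cheap labeler: thresholds one feature and flips ~20% of labels."""
    labels = (X[:, 0] > 0.0).astype(int)
    flip = rng.random(len(X)) < 0.2
    return np.where(flip, 1 - labels, labels)

clf = SGDClassifier(loss="log_loss", random_state=0)

# Stage 1: pretrain on source data labeled by the weak annotator.
y_weak = weak_annotator(X_source)
clf.partial_fit(X_source, y_weak, classes=np.array([0, 1]))

# Stage 2: adapt to the target domain with the small labeled set.
for _ in range(50):                     # several passes over the tiny target set
    clf.partial_fit(X_target, y_target)

# Evaluate on a held-out target sample drawn from the same shifted distribution.
X_test = rng.normal(loc=0.5, scale=1.2, size=(1000, 2))
y_test = (X_test[:, 0] + X_test[:, 1] > 1.0).astype(int)
print("held-out target accuracy:", clf.score(X_test, y_test))
```

Here `partial_fit` is used only so the same estimator can be warm-started across both stages; any model that can be pretrained and then fine-tuned would serve the same illustrative purpose.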
Related papers
- Data Valuation with Gradient Similarity [1.997283751398032]
Data Valuation algorithms quantify the value of each sample in a dataset based on its contribution or importance to a given predictive task.
We present a simple alternative to existing methods, termed Data Valuation with Gradient Similarity (DVGS).
Our approach can rapidly and accurately identify low-quality data, which can reduce the need for expert knowledge and manual intervention in data cleaning tasks (a toy illustration of the gradient-similarity idea appears after this list).
arXiv Detail & Related papers (2024-05-13T22:10:00Z)
- Calibrated Adaptive Teacher for Domain Adaptive Intelligent Fault Diagnosis [7.88657961743755]
Unsupervised domain adaptation (UDA) deals with the scenario where labeled data are available in a source domain, and only unlabeled data are available in a target domain.
We propose a novel UDA method called Calibrated Adaptive Teacher (CAT), which calibrates the predictions of the teacher network throughout the self-training process.
arXiv Detail & Related papers (2023-12-05T15:19:29Z)
- Building Manufacturing Deep Learning Models with Minimal and Imbalanced Training Data Using Domain Adaptation and Data Augmentation [15.333573151694576]
We propose a novel domain adaptation (DA) approach to address the problem of labeled training data scarcity for a target learning task.
Our approach works for scenarios where the source dataset and the dataset available for the target learning task have the same or different feature spaces.
We evaluate our combined approach using image data for wafer defect prediction.
arXiv Detail & Related papers (2023-05-31T21:45:34Z)
- Cross-domain Transfer of defect features in technical domains based on partial target data [0.0]
In many technical domains, it is only the defect or worn reject classes that are insufficiently represented.
The proposed classification approach addresses such conditions and is based on a CNN encoder.
It is benchmarked in a technical and a non-technical domain and shows convincing classification results.
arXiv Detail & Related papers (2022-11-24T15:23:58Z)
- Domain Adaptation Principal Component Analysis: base linear method for learning with out-of-distribution data [55.41644538483948]
Domain adaptation is a popular paradigm in modern machine learning.
We present a method called Domain Adaptation Principal Component Analysis (DAPCA).
DAPCA finds a linear reduced data representation useful for solving the domain adaptation task.
arXiv Detail & Related papers (2022-08-28T21:10:56Z)
- Source-Free Domain Adaptation via Distribution Estimation [106.48277721860036]
Domain Adaptation aims to transfer the knowledge learned from a labeled source domain to an unlabeled target domain whose data distributions are different.
Recently, Source-Free Domain Adaptation (SFDA) has drawn much attention; it tries to tackle the domain adaptation problem without using source data.
In this work, we propose a novel framework called SFDA-DE to address SFDA task via source Distribution Estimation.
arXiv Detail & Related papers (2022-04-24T12:22:19Z)
- Domain Adaptive Semantic Segmentation without Source Data [50.18389578589789]
We investigate domain adaptive semantic segmentation without source data, which assumes that the model is pre-trained on the source domain.
We propose an effective framework for this challenging problem with two components: positive learning and negative learning.
Our framework can be easily implemented and incorporated with other methods to further enhance the performance.
arXiv Detail & Related papers (2021-10-13T04:12:27Z)
- Self-Supervised Noisy Label Learning for Source-Free Unsupervised Domain Adaptation [87.60688582088194]
We propose a novel Self-Supervised Noisy Label Learning method.
Our method can easily achieve state-of-the-art results and surpass other methods by a very large margin.
arXiv Detail & Related papers (2021-02-23T10:51:45Z)
- Selective Pseudo-Labeling with Reinforcement Learning for Semi-Supervised Domain Adaptation [116.48885692054724]
We propose a reinforcement learning based selective pseudo-labeling method for semi-supervised domain adaptation.
We develop a deep Q-learning model to select both accurate and representative pseudo-labeled instances.
Our proposed method is evaluated on several benchmark datasets for SSDA, and demonstrates superior performance to all the comparison methods.
arXiv Detail & Related papers (2020-12-07T03:37:38Z)
- A Review of Single-Source Deep Unsupervised Visual Domain Adaptation [81.07994783143533]
Large-scale labeled training datasets have enabled deep neural networks to excel across a wide range of benchmark vision tasks.
In many applications, it is prohibitively expensive and time-consuming to obtain large quantities of labeled data.
To cope with limited labeled training data, many have attempted to directly apply models trained on a large-scale labeled source domain to another sparsely labeled or unlabeled target domain.
arXiv Detail & Related papers (2020-09-01T00:06:50Z)
- Improving Adversarial Robustness via Unlabeled Out-of-Domain Data [30.58040078862511]
We investigate how adversarial robustness can be enhanced by leveraging out-of-domain unlabeled data.
We show settings where we achieve better adversarial robustness when the unlabeled data come from a shifted domain rather than the same domain as the labeled data.
arXiv Detail & Related papers (2020-06-15T15:25:56Z)
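As a side note on the gradient-similarity idea mentioned in the DVGS entry above, the toy sketch below scores each training sample by the cosine similarity between its individual logistic-regression gradient and the average gradient on a trusted validation set; mislabeled points tend to score lower. The model, the synthetic data, and the single-checkpoint scoring are illustrative assumptions and do not reproduce the authors' actual DVGS procedure, which operates along a training trajectory.

```python
# Toy gradient-similarity scoring for data valuation (illustrative; not the DVGS code).
import numpy as np

rng = np.random.default_rng(1)

def per_sample_gradients(w, X, y):
    """Per-sample logistic-regression gradient of the log loss, shape (n, d)."""
    p = 1.0 / (1.0 + np.exp(-X @ w))      # predicted probability of class 1
    return (p - y)[:, None] * X           # d loss / d w for each sample

# Synthetic data: a training set with ~15% flipped labels and a clean validation set.
d = 5
w_true = rng.normal(size=d)
X_train = rng.normal(size=(200, d))
y_train = (X_train @ w_true > 0).astype(float)
noisy = rng.random(200) < 0.15
y_train[noisy] = 1.0 - y_train[noisy]

X_val = rng.normal(size=(100, d))
y_val = (X_val @ w_true > 0).astype(float)

# Reference parameters at which gradients are compared (DVGS instead aggregates
# similarities over checkpoints of an actual training run).
w = rng.normal(size=d) * 0.01

g_train = per_sample_gradients(w, X_train, y_train)
g_val = per_sample_gradients(w, X_val, y_val).mean(axis=0)

# Cosine similarity of each training gradient to the validation gradient = value score.
scores = (g_train @ g_val) / (
    np.linalg.norm(g_train, axis=1) * np.linalg.norm(g_val) + 1e-12
)

print("mean score, clean labels  :", scores[~noisy].mean())
print("mean score, flipped labels:", scores[noisy].mean())
```

Samples whose gradients disagree with the validation gradient (typically the mislabeled ones) receive low scores and would be the first candidates for removal or review in a data-cleaning pass.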
This list is automatically generated from the titles and abstracts of the papers on this site.