Confidence-Guided Data Augmentation for Deep Semi-Supervised Training
- URL: http://arxiv.org/abs/2209.08174v1
- Date: Fri, 16 Sep 2022 21:23:19 GMT
- Title: Confidence-Guided Data Augmentation for Deep Semi-Supervised Training
- Authors: Fadoua Khmaissia and Hichem Frigui
- Abstract summary: We propose a new data augmentation technique for semi-supervised learning settings that emphasizes learning from the most challenging regions of the feature space.
We perform experiments on two benchmark RGB datasets: CIFAR-100 and STL-10, and show that the proposed scheme improves classification performance in terms of accuracy and robustness.
- Score: 0.9968241071319184
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We propose a new data augmentation technique for semi-supervised learning
settings that emphasizes learning from the most challenging regions of the
feature space. Starting with a fully supervised reference model, we first
identify low confidence predictions. These samples are then used to train a
Variational AutoEncoder (VAE) that can generate an infinite number of
additional images with similar distribution. Finally, using the originally
labeled data and the synthetically generated labeled and unlabeled data, we
retrain a new model in a semi-supervised fashion. We perform experiments on two
benchmark RGB datasets: CIFAR-100 and STL-10, and show that the proposed scheme
improves classification performance in terms of accuracy and robustness, while
yielding comparable or superior results with respect to existing fully
supervised approaches.
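The first step of the pipeline described above, selecting the low-confidence predictions of the reference model, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the softmax-confidence criterion and the `threshold` value are assumptions for the sake of the example, and `logits` stands in for the reference model's outputs.

```python
import numpy as np

def select_low_confidence(logits, threshold=0.6):
    """Return indices of samples whose top softmax probability falls
    below `threshold`; in the paper's scheme, these samples would be
    used to train the VAE that generates additional data."""
    # subtract the row max for numerical stability before exponentiating
    z = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    confidence = probs.max(axis=1)
    return np.where(confidence < threshold)[0]

# toy logits for 3 samples over 4 classes
logits = np.array([
    [4.0, 0.1, 0.1, 0.1],   # peaked -> confident
    [1.0, 0.9, 0.8, 0.7],   # nearly flat -> low confidence
    [0.2, 0.2, 3.5, 0.1],   # peaked -> confident
])
idx = select_low_confidence(logits, threshold=0.6)
# only the second sample falls below the confidence threshold
```

The selected subset would then feed the VAE training stage, and the synthetic images it generates would be mixed back with the original labels for semi-supervised retraining.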
Related papers
- Exploring Beyond Logits: Hierarchical Dynamic Labeling Based on Embeddings for Semi-Supervised Classification [49.09505771145326]
We propose a Hierarchical Dynamic Labeling (HDL) algorithm that does not depend on model predictions and utilizes image embeddings to generate sample labels.
Our approach has the potential to change the paradigm of pseudo-label generation in semi-supervised learning.
arXiv Detail & Related papers (2024-04-26T06:00:27Z)
- Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data.
One key challenge in federated learning is to handle non-identically distributed data across the clients.
We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling the data issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z)
- Spanning Training Progress: Temporal Dual-Depth Scoring (TDDS) for Enhanced Dataset Pruning [50.809769498312434]
We propose a novel dataset pruning method termed Temporal Dual-Depth Scoring (TDDS).
Our method achieves 54.51% accuracy with only 10% training data, surpassing random selection by 7.83% and other comparison methods by at least 12.69%.
arXiv Detail & Related papers (2023-11-22T03:45:30Z)
- Diverse Data Augmentation with Diffusions for Effective Test-time Prompt Tuning [73.75282761503581]
We propose DiffTPT, which leverages pre-trained diffusion models to generate diverse and informative new data.
Our experiments on test datasets with distribution shifts and unseen categories demonstrate that DiffTPT improves the zero-shot accuracy by an average of 5.13%.
arXiv Detail & Related papers (2023-08-11T09:36:31Z)
- Convolutional Neural Networks for the classification of glitches in gravitational-wave data streams [52.77024349608834]
We classify transient noise signals (i.e. glitches) and gravitational waves in data from the Advanced LIGO detectors.
We use models with a supervised learning approach, trained from scratch using the Gravity Spy dataset.
We also explore a self-supervised approach, pre-training models with automatically generated pseudo-labels.
arXiv Detail & Related papers (2023-03-24T11:12:37Z)
- Self-Supervised Pre-Training for Transformer-Based Person Re-Identification [54.55281692768765]
Transformer-based supervised pre-training achieves great performance in person re-identification (ReID).
Due to the domain gap between ImageNet and ReID datasets, it usually needs a larger pre-training dataset to boost the performance.
This work aims to mitigate the gap between the pre-training and ReID datasets from the perspective of data and model structure.
arXiv Detail & Related papers (2021-11-23T18:59:08Z)
- DATE: Detecting Anomalies in Text via Self-Supervision of Transformers [5.105840060102528]
Recent deep methods for anomaly detection in images learn better features of normality in an end-to-end self-supervised setting.
We use this approach for Anomaly Detection in text, by introducing a novel pretext task on text sequences.
We show strong quantitative and qualitative results on the 20Newsgroups and AG News datasets.
arXiv Detail & Related papers (2021-04-12T16:08:05Z)
- Adaptive Consistency Regularization for Semi-Supervised Transfer Learning [31.66745229673066]
We consider semi-supervised learning and transfer learning jointly, leading to a more practical and competitive paradigm.
To better exploit the value of both pre-trained weights and unlabeled target examples, we introduce adaptive consistency regularization.
Our proposed adaptive consistency regularization outperforms state-of-the-art semi-supervised learning techniques such as Pseudo Label, Mean Teacher, and MixMatch.
arXiv Detail & Related papers (2021-03-03T05:46:39Z)
- Robust Disentanglement of a Few Factors at a Time [5.156484100374058]
We introduce population-based training (PBT) for improving consistency in training variational autoencoders (VAEs).
We then use Unsupervised Disentanglement Ranking (UDR) as an unsupervised score for models in our PBT-VAE training, and show that models trained this way tend to consistently disentangle only a subset of the generative factors.
We show striking improvement in state-of-the-art unsupervised disentanglement performance and robustness across multiple datasets and metrics.
arXiv Detail & Related papers (2020-10-26T12:34:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.