Enabling On-Device Self-Supervised Contrastive Learning With Selective
Data Contrast
- URL: http://arxiv.org/abs/2106.03796v1
- Date: Mon, 7 Jun 2021 17:04:56 GMT
- Title: Enabling On-Device Self-Supervised Contrastive Learning With Selective
Data Contrast
- Authors: Yawen Wu, Zhepeng Wang, Dewen Zeng, Yiyu Shi, Jingtong Hu
- Abstract summary: We propose a framework to automatically select the most representative data from the unlabeled input stream.
Experiments show that accuracy and learning speed are greatly improved.
- Score: 13.563747709789387
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: After a model is deployed on edge devices, it is desirable for these devices
to learn from unlabeled data to continuously improve accuracy. Contrastive
learning has demonstrated its great potential in learning from unlabeled data.
However, the online input data are usually not independent and identically
distributed (non-iid), and the storage of edge devices is usually too limited
to hold enough representative data from different data classes. We propose a
framework to automatically select the most representative data from the
unlabeled input stream, which only requires a small data buffer for dynamic
learning. Experiments show that accuracy and learning speed are greatly
improved.
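As a rough illustration of this setting, the sketch below pairs a small, diversity-based replacement buffer with a standard SimCLR-style contrastive loss in PyTorch; the SelectiveBuffer class, its redundancy-based replacement rule, and the nt_xent helper are assumptions made for the example, not the paper's exact selection criterion.
```python
# Minimal sketch: a small buffer that keeps a diverse subset of a stream,
# plus a SimCLR-style loss to train on the buffered samples. All names and
# the replacement rule are illustrative assumptions.
import torch
import torch.nn.functional as F


class SelectiveBuffer:
    """Keeps at most `capacity` samples, preferring a diverse set in feature space."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.samples = []    # raw inputs
        self.features = []   # encoder features used only for selection

    @torch.no_grad()
    def offer(self, x, encoder):
        """Consider one incoming sample `x` (a single input tensor)."""
        f = F.normalize(encoder(x.unsqueeze(0)).squeeze(0), dim=0)
        if len(self.samples) < self.capacity:
            self.samples.append(x)
            self.features.append(f)
            return
        # Among buffer + newcomer, find the most redundant sample (highest
        # similarity to its nearest neighbour) and drop it.
        feats = torch.stack(self.features + [f])
        sim = feats @ feats.t()
        sim.fill_diagonal_(-1.0)
        most_redundant = int(sim.max(dim=1).values.argmax())
        if most_redundant < self.capacity:      # an old sample is dropped
            self.samples[most_redundant] = x
            self.features[most_redundant] = f


def nt_xent(z1, z2, temperature=0.5):
    """Standard SimCLR (NT-Xent) loss over two augmented views of the buffer."""
    z = F.normalize(torch.cat([z1, z2]), dim=1)
    n = z1.size(0)
    sim = z @ z.t() / temperature
    sim.fill_diagonal_(float("-inf"))
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(n)])
    return F.cross_entropy(sim, targets)
```
In a deployment loop, each incoming sample would be passed to offer(), and the contrastive update would periodically encode two augmented views of the buffered samples and minimize nt_xent between them.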
Related papers
- Enabling On-Device Learning via Experience Replay with Efficient Dataset Condensation [15.915388740468815]
We propose an on-device framework that addresses the issue of identifying the most representative data to avoid significant information loss.
Specifically, to effectively handle unlabeled incoming data, we propose a pseudo-labeling technique designed for unlabeled on-device learning environments.
With a buffer capacity of just one sample per class, our method achieves an accuracy that outperforms the best existing baseline by 58.4%.
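A minimal sketch of confidence-thresholded pseudo-labeling for one unlabeled on-device batch; the threshold value and the helper's name are assumptions rather than the paper's exact design.
```python
# Keep only the samples the current model is confident about and use its
# predictions as labels; 0.95 is an assumed threshold.
import torch
import torch.nn.functional as F


@torch.no_grad()
def pseudo_label(model, batch, threshold=0.95):
    """Return (inputs, labels) for the samples the model is confident about."""
    probs = F.softmax(model(batch), dim=1)
    confidence, labels = probs.max(dim=1)
    keep = confidence >= threshold
    return batch[keep], labels[keep]
```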
arXiv Detail & Related papers (2024-05-25T07:52:36Z)
- Ambiguous Annotations: When is a Pedestrian not a Pedestrian? [6.974741712647656]
It is not always possible to objectively determine whether an assigned label is correct or not.
Our experiments show that excluding highly ambiguous data from the training improves model performance.
In order to safely remove ambiguous instances and ensure the retained representativeness of the training data, an understanding of the properties of the dataset and class under investigation is crucial.
arXiv Detail & Related papers (2024-05-14T17:44:34Z)
- FlatMatch: Bridging Labeled Data and Unlabeled Data with Cross-Sharpness for Semi-Supervised Learning [73.13448439554497]
Semi-Supervised Learning (SSL) has been an effective way to leverage abundant unlabeled data with extremely scarce labeled data.
Most SSL methods are commonly based on instance-wise consistency between different data transformations.
We propose FlatMatch which minimizes a cross-sharpness measure to ensure consistent learning performance between the two datasets.
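One hedged reading of "cross-sharpness" is a SAM-style weight perturbation computed from the labeled loss and then charged to the unlabeled loss; the sketch below follows that reading only as an approximation, and every name and the update order are assumptions rather than FlatMatch itself.
```python
# Simplified cross-sharpness-style update: perturb the weights along the
# labeled-loss gradient (as in SAM), backprop the unlabeled loss at the
# perturbed weights, then restore and step. Names are assumptions.
import torch


def cross_sharpness_update(model, optimizer, labeled_loss_fn, unlabeled_loss_fn, rho=0.05):
    # (1) Gradient direction of the labeled loss.
    optimizer.zero_grad()
    labeled = labeled_loss_fn(model)
    labeled.backward()
    grads = [p.grad.detach().clone() if p.grad is not None else torch.zeros_like(p)
             for p in model.parameters()]
    scale = rho / (torch.norm(torch.stack([g.norm() for g in grads])) + 1e-12)

    # (2) Ascend to the worst-case nearby weights (radius rho).
    with torch.no_grad():
        for p, g in zip(model.parameters(), grads):
            p.add_(scale * g)

    # (3) Unlabeled loss at the perturbed weights; its gradient stands in
    #     for the cross-sharpness term.
    optimizer.zero_grad()
    unlabeled = unlabeled_loss_fn(model)
    unlabeled.backward()

    # (4) Restore the original weights and apply the step.
    with torch.no_grad():
        for p, g in zip(model.parameters(), grads):
            p.sub_(scale * g)
    optimizer.step()
    return labeled.item(), unlabeled.item()
```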
arXiv Detail & Related papers (2023-10-25T06:57:59Z)
- Self-supervised On-device Federated Learning from Unlabeled Streams [15.94978097767473]
We propose a Self-supervised On-device Federated learning framework with coreset selection, which we call SOFed, to automatically select a coreset.
Experiments demonstrate the effectiveness and significance of the proposed method in visual representation learning.
arXiv Detail & Related papers (2022-12-02T07:22:00Z)
- Understanding Memorization from the Perspective of Optimization via Efficient Influence Estimation [54.899751055620904]
We study the phenomenon of memorization with turn-over dropout, an efficient method to estimate influence and memorization, for data with true labels (real data) and data with random labels (random data)
Our main findings are: (i) for both real and random data, the network optimizes easy examples (e.g., real data) and difficult examples (e.g., random data) simultaneously, learning the easy ones faster; (ii) for real data, a correctly labeled difficult example in the training dataset is more informative than an easy one.
arXiv Detail & Related papers (2021-12-16T11:34:23Z)
- Self-Tuning for Data-Efficient Deep Learning [75.34320911480008]
Self-Tuning is a novel approach to enable data-efficient deep learning.
It unifies the exploration of labeled and unlabeled data and the transfer of a pre-trained model.
It outperforms its SSL and TL counterparts on five tasks by sharp margins.
arXiv Detail & Related papers (2021-02-25T14:56:19Z)
- ORDisCo: Effective and Efficient Usage of Incremental Unlabeled Data for Semi-supervised Continual Learning [52.831894583501395]
Continual learning typically assumes the incoming data are fully labeled, which may not hold in real applications.
We propose deep Online Replay with Discriminator Consistency (ORDisCo) to interdependently learn a classifier with a conditional generative adversarial network (GAN).
We show ORDisCo achieves significant performance improvement on various semi-supervised learning benchmark datasets for SSCL.
arXiv Detail & Related papers (2021-01-02T09:04:14Z)
- Federated Self-Supervised Learning of Multi-Sensor Representations for Embedded Intelligence [8.110949636804772]
Smartphones, wearables, and Internet of Things (IoT) devices produce a wealth of data that cannot be accumulated in a centralized repository for learning supervised models.
We propose a self-supervised approach termed scalogram-signal correspondence learning, based on the wavelet transform, to learn useful representations from unlabeled sensor inputs.
We extensively assess the quality of learned features with our multi-view strategy on diverse public datasets, achieving strong performance in all domains.
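A small sketch of the scalogram side of such a correspondence objective, using PyWavelets' continuous wavelet transform; the Morlet wavelet and the scale range are assumptions.
```python
# Build a time-frequency "scalogram" image from a 1-D sensor signal; a
# correspondence task would ask a model to match each signal window to its
# own scalogram. Wavelet and scales are assumptions.
import numpy as np
import pywt


def scalogram(signal_1d, scales=np.arange(1, 65), wavelet="morl"):
    """Return a (num_scales, signal_length) time-frequency image."""
    coefficients, _frequencies = pywt.cwt(signal_1d, scales, wavelet)
    return np.abs(coefficients)
```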
arXiv Detail & Related papers (2020-07-25T21:59:17Z)
- Don't Wait, Just Weight: Improving Unsupervised Representations by Learning Goal-Driven Instance Weights [92.16372657233394]
Self-supervised learning techniques can boost performance by learning useful representations from unlabelled data.
We show that by learning Bayesian instance weights for the unlabelled data, we can improve the downstream classification accuracy.
Our method, BetaDataWeighter, is evaluated using the popular self-supervised rotation prediction task on STL-10 and Visual Decathlon.
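The full method places Beta distributions over the instance weights; the sketch below strips that down to plain learnable per-example weights on a rotation-prediction loss, purely as an illustration, with all names assumed.
```python
# Simplified stand-in for learned instance weights on a self-supervised
# rotation-prediction loss; plain learnable logits replace the Beta
# distributions used by the actual method.
import torch
import torch.nn as nn
import torch.nn.functional as F


class WeightedRotationLoss(nn.Module):
    def __init__(self, num_unlabeled):
        super().__init__()
        # One learnable weight logit per unlabeled example.
        self.weight_logits = nn.Parameter(torch.zeros(num_unlabeled))

    def forward(self, rotation_logits, rotation_targets, example_indices):
        # Per-example cross-entropy on the 4-way rotation prediction task.
        per_example = F.cross_entropy(rotation_logits, rotation_targets,
                                      reduction="none")
        weights = torch.sigmoid(self.weight_logits[example_indices])
        return (weights * per_example).mean()
```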
arXiv Detail & Related papers (2020-06-22T15:59:32Z)
- Adversarial Self-Supervised Contrastive Learning [62.17538130778111]
Existing adversarial learning approaches mostly use class labels to generate adversarial samples that lead to incorrect predictions.
We propose a novel adversarial attack for unlabeled data, which makes the model confuse the instance-level identities of the perturbed data samples.
We present a self-supervised contrastive learning framework to adversarially train a robust neural network without labeled data.
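A hedged sketch of such a label-free, instance-level attack: a PGD-style perturbation that pushes a sample's embedding away from its own augmented view. Step sizes, the cosine objective, and the function name are assumptions.
```python
# PGD-style perturbation that minimizes agreement between a sample and its
# augmented view, confusing the instance-level identity without any labels.
import torch
import torch.nn.functional as F


def instance_adversarial(encoder, x, x_aug, eps=8 / 255, alpha=2 / 255, steps=5):
    """Return an adversarial version of `x` that disagrees with its view `x_aug`."""
    with torch.no_grad():
        target = F.normalize(encoder(x_aug), dim=1)
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        z = F.normalize(encoder(x + delta), dim=1)
        agreement = (z * target).sum(dim=1).mean()     # cosine similarity
        grad = torch.autograd.grad(agreement, delta)[0]
        with torch.no_grad():
            delta -= alpha * grad.sign()               # reduce agreement
            delta.clamp_(-eps, eps)
    return (x + delta).detach()
```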
arXiv Detail & Related papers (2020-06-13T08:24:33Z)
- Federated Visual Classification with Real-World Data Distribution [9.564468846277366]
We characterize the effect real-world data distributions have on distributed learning, using as a benchmark the standard Federated Averaging (FedAvg) algorithm.
We introduce two new large-scale datasets for species and landmark classification, with realistic per-user data splits.
We also develop two new algorithms (FedVC, FedIR) that intelligently resample and reweight over the client pool, bringing large improvements in accuracy and stability in training.
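In the spirit of FedIR, a client could scale each example's loss by the ratio of the global class prior to its local class prior; the sketch below is an illustrative reading, not the paper's exact scheme.
```python
# Importance-reweighted client loss: examples from classes over-represented
# on this client are down-weighted, under-represented ones are up-weighted.
# Names and the weighting details are assumptions.
import torch
import torch.nn.functional as F


def reweighted_client_loss(logits, labels, global_prior, local_prior):
    """Cross-entropy with per-example weights p_global(y) / p_local(y)."""
    per_example = F.cross_entropy(logits, labels, reduction="none")
    weights = global_prior[labels] / local_prior[labels].clamp(min=1e-8)
    return (weights * per_example).mean()
```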
arXiv Detail & Related papers (2020-03-18T07:55:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.