Enabling the Network to Surf the Internet
- URL: http://arxiv.org/abs/2102.12205v1
- Date: Wed, 24 Feb 2021 11:00:29 GMT
- Title: Enabling the Network to Surf the Internet
- Authors: Zhuoling Li, Haohan Wang, Tymoteusz Swistek, Weixin Chen, Yuanzheng
Li, Haoqian Wang
- Abstract summary: We develop a framework that enables the model to surf the Internet.
We observe that the generalization ability of the learned representation is crucial for self-supervised learning.
We demonstrate the superiority of the proposed framework with experiments on miniImageNet, tieredImageNet and Omniglot.
- Score: 13.26679087834881
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-shot learning is challenging due to the limited data and labels. Existing
algorithms usually resolve this problem by pre-training the model with a
considerable amount of annotated data which shares knowledge with the target
domain. Nevertheless, large quantities of homogenous data samples are not
always available. To tackle this issue, we develop a framework that enables the
model to surf the Internet, which implies that the model can collect and
annotate data without manual effort. Since the online data is virtually
limitless and continues to be generated, the model can thus be empowered to
constantly obtain up-to-date knowledge from the Internet. Additionally, we
observe that the generalization ability of the learned representation is
crucial for self-supervised learning. To highlight its importance, a naive yet
efficient normalization strategy is proposed. Consequently, this strategy
boosts the accuracy of the model significantly (by up to 20.46%). We demonstrate
the superiority of the proposed framework with experiments on miniImageNet,
tieredImageNet and Omniglot. The results indicate that our method has surpassed
previous unsupervised counterparts by a large margin (more than 10%) and
obtained performance comparable with the supervised ones.
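The abstract does not spell out the normalization strategy it proposes. As an illustration of the general idea only (a hypothetical sketch, not necessarily the authors' exact method), a common "naive yet efficient" choice in few-shot pipelines is L2-normalizing the learned embeddings onto the unit hypersphere:

```python
import numpy as np

def l2_normalize(features, eps=1e-8):
    # Project each row (one embedding) onto the unit hypersphere.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    return features / (norms + eps)

# Toy batch of three embeddings with very different scales.
batch = np.array([[3.0, 4.0],
                  [0.3, 0.4],
                  [30.0, 40.0]])
normalized = l2_normalize(batch)
# Every row now has (approximately) unit length, so downstream
# distance comparisons are scale-invariant.
```

Because web-crawled data varies wildly in scale and quality, removing feature magnitude in this way is one plausible reason a simple normalization could help generalization.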
Related papers
- Malicious Internet Entity Detection Using Local Graph Inference [0.4893345190925178]
Detection of malicious behavior in a large network is a challenging problem for machine learning in computer security.
Current cybersecurity-tailored approaches are still limited in expressivity, and methods successful in other domains do not scale well to large volumes of data.
This work proposes a new perspective for learning from graph data that is modeling network entity interactions as a large heterogeneous graph.
arXiv Detail & Related papers (2024-08-06T16:35:25Z)
- SeiT++: Masked Token Modeling Improves Storage-efficient Training [36.95646819348317]
Recent advancements in Deep Neural Network (DNN) models have significantly improved performance across computer vision tasks.
However, achieving highly generalizable and high-performing vision models requires expansive datasets, resulting in significant storage requirements.
A recent breakthrough, SeiT, proposed using Vector-Quantized (VQ) feature vectors (i.e., tokens) as network inputs for vision classification.
In this paper, we extend SeiT by integrating Masked Token Modeling (MTM) for self-supervised pre-training.
arXiv Detail & Related papers (2023-12-15T04:11:34Z) - Zero-shot Retrieval: Augmenting Pre-trained Models with Search Engines [83.65380507372483]
Large pre-trained models can dramatically reduce the amount of task-specific data required to solve a problem, but they often fail to capture domain-specific nuances out of the box.
This paper shows how to leverage recent advances in NLP and multi-modal learning to augment a pre-trained model with search engine retrieval.
arXiv Detail & Related papers (2023-11-29T05:33:28Z)
- A Simple and Efficient Baseline for Data Attribution on Images [107.12337511216228]
Current state-of-the-art approaches require a large ensemble of as many as 300,000 models to accurately attribute model predictions.
In this work, we focus on a minimalist baseline, utilizing the feature space of a backbone pretrained via self-supervised learning to perform data attribution.
Our method is model-agnostic and scales easily to large datasets.
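The paper's exact attribution procedure is not given in this summary; a minimal stand-in for the general idea of feature-space data attribution (function name and similarity choice are illustrative assumptions) is ranking training examples by cosine similarity to a query in a frozen backbone's feature space:

```python
import numpy as np

def attribute_by_feature_similarity(query_feat, train_feats, k=3):
    # Rank training examples by cosine similarity to the query in
    # the (frozen) backbone's feature space; the most similar
    # points are treated as the most influential.
    q = query_feat / np.linalg.norm(query_feat)
    T = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    sims = T @ q
    top = np.argsort(-sims)[:k]
    return top, sims[top]

# Toy example: one training feature matches the query exactly.
train = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]])
query = np.array([1.0, 0.0])
top, scores = attribute_by_feature_similarity(query, train)
```

This is model-agnostic in the same sense as the summary: only feature vectors are needed, not the classifier's internals.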
arXiv Detail & Related papers (2023-11-03T17:29:46Z)
- Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
arXiv Detail & Related papers (2023-10-26T17:59:46Z)
- Efficiently Robustify Pre-trained Models [18.392732966487582]
The robustness of large-scale models in real-world settings remains a less-explored topic.
We first benchmark the performance of these models under different perturbations and datasets.
We then discuss how existing robustification schemes based on complete model fine-tuning may not be a scalable option for very large networks.
arXiv Detail & Related papers (2023-09-14T08:07:49Z)
- Task-Agnostic Robust Representation Learning [31.818269301504564]
We study the problem of robust representation learning with unlabeled data in a task-agnostic manner.
We derive an upper bound on the adversarial loss of a prediction model on any downstream task, using its loss on the clean data and a robustness regularizer.
Our method achieves preferable adversarial performance compared to relevant baselines.
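The bound described in this summary can be sketched schematically (the symbols below are illustrative placeholders, not the paper's actual notation):

```latex
\mathcal{L}_{\mathrm{adv}}(f) \;\le\; \mathcal{L}_{\mathrm{clean}}(f) \;+\; \lambda\, \mathcal{R}(f),
\qquad
\mathcal{R}(f) \;=\; \mathbb{E}_{x}\,\max_{\|\delta\|\le\epsilon} \big\| f(x+\delta) - f(x) \big\|^2 .
```

The idea is that controlling the clean loss together with a regularizer penalizing representation movement under bounded perturbations upper-bounds the adversarial loss on any downstream task.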
arXiv Detail & Related papers (2022-03-15T02:05:11Z)
- To be Critical: Self-Calibrated Weakly Supervised Learning for Salient Object Detection [95.21700830273221]
Weakly-supervised salient object detection (WSOD) aims to develop saliency models using image-level annotations.
We propose a self-calibrated training strategy by explicitly establishing a mutual calibration loop between pseudo labels and network predictions.
We prove that even a much smaller dataset with well-matched annotations can facilitate models to achieve better performance as well as generalizability.
arXiv Detail & Related papers (2021-09-04T02:45:22Z)
- Self-Damaging Contrastive Learning [92.34124578823977]
In reality, unlabeled data is commonly imbalanced and shows a long-tail distribution.
This paper proposes a principled framework called Self-Damaging Contrastive Learning to automatically balance the representation learning without knowing the classes.
Our experiments show that SDCLR significantly improves not only overall accuracies but also balancedness.
arXiv Detail & Related papers (2021-06-06T00:04:49Z)
- Distill on the Go: Online knowledge distillation in self-supervised learning [1.1470070927586016]
Recent works have shown that wider and deeper models benefit more from self-supervised learning than smaller models.
We propose Distill-on-the-Go (DoGo), a self-supervised learning paradigm using single-stage online knowledge distillation.
Our results show significant performance gain in the presence of noisy and limited labels.
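DoGo's exact objective is not given in this summary; as a hedged sketch of the standard distillation ingredient such methods build on, the following computes the KL divergence between temperature-softened teacher and student distributions (temperature value and function names are illustrative assumptions):

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Temperature-scaled, numerically stable softmax.
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    # KL(teacher || student) between temperature-softened distributions.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return float(np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12))))

student = np.array([[2.0, 0.5, -1.0]])
teacher = np.array([[2.5, 0.2, -0.8]])
loss = distillation_loss(student, teacher)  # non-negative; zero iff identical
```

In an online (single-stage) setting, both networks train simultaneously and each can serve as the other's teacher, rather than distilling from a frozen pretrained model.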
arXiv Detail & Related papers (2021-04-20T09:59:23Z)
- Mixed-Privacy Forgetting in Deep Networks [114.3840147070712]
We show that the influence of a subset of the training samples can be removed from the weights of a network trained on large-scale image classification tasks.
Inspired by real-world applications of forgetting techniques, we introduce a novel notion of forgetting in mixed-privacy setting.
We show that our method allows forgetting without having to trade off the model accuracy.
arXiv Detail & Related papers (2020-12-24T19:34:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.