DCLP: Neural Architecture Predictor with Curriculum Contrastive Learning
- URL: http://arxiv.org/abs/2302.13020v2
- Date: Thu, 14 Dec 2023 06:26:18 GMT
- Title: DCLP: Neural Architecture Predictor with Curriculum Contrastive Learning
- Authors: Shenghe Zheng, Hongzhi Wang, Tianyu Mu
- Abstract summary: We propose a Curriculum-guided Contrastive Learning framework for neural Predictor (DCLP).
Our method simplifies the contrastive task by designing a novel curriculum that enhances the stability of the unlabeled training data distribution.
We experimentally demonstrate that DCLP has high accuracy and efficiency compared with existing predictors.
- Score: 5.2319020651074215
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural predictors have shown great potential in the evaluation process of
neural architecture search (NAS). However, current predictor-based approaches
overlook the fact that training a predictor necessitates a considerable number
of trained neural networks as the labeled training set, which is costly to
obtain. Therefore, the critical issue in utilizing predictors for NAS is to
train a high-performance predictor using as few trained neural networks as
possible. Although some methods attempt to address this problem through
unsupervised learning, they often result in inaccurate predictions. We argue
that the unsupervised tasks intended for common graph data are too
challenging for neural networks, causing unsupervised training to be
susceptible to performance crashes in NAS. To address this issue, we propose a
Curriculum-guided Contrastive Learning framework for neural Predictor (DCLP).
Our method simplifies the contrastive task by designing a novel curriculum to
enhance the stability of the unlabeled training data distribution during
contrastive training. Specifically, we propose a scheduler that ranks the
training data according to the contrastive difficulty of each sample and then
feeds them to the contrastive learner in order. This approach concentrates the
training data distribution and makes contrastive training more efficient. By
using our method, the contrastive learner incrementally learns feature
representations via unsupervised data on a smooth learning curve, avoiding
performance crashes that may occur with excessively variable training data
distributions. We experimentally demonstrate that DCLP has high accuracy and
efficiency compared with existing predictors, and shows promising potential to
discover superior architectures in various search spaces when combined with
search strategies. Our code is available at:
https://github.com/Zhengsh123/DCLP.
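The scheduler is the core of the method: rank unlabeled pairs by a contrastive difficulty score, then expose a growing, easy-first prefix of the ranked data to the learner. The sketch below illustrates that idea only; the difficulty metric and the linear pacing function are illustrative assumptions, not the paper's exact formulation (see the repository above for the actual implementation).

```python
import random

def contrastive_difficulty(pair):
    # Illustrative difficulty score: disagreement between the two augmented
    # views of an architecture encoding. DCLP's actual metric may differ.
    view_a, view_b = pair
    return sum(a != b for a, b in zip(view_a, view_b))

def curriculum_batches(unlabeled_pairs, num_steps, batch_size):
    # Rank pairs easy-to-hard, then unlock a growing prefix of the ranked
    # data at each step (a linear pacing function). Assumes
    # len(unlabeled_pairs) >= batch_size.
    ranked = sorted(unlabeled_pairs, key=contrastive_difficulty)
    for step in range(num_steps):
        frac = (step + 1) / num_steps  # fraction of the curriculum unlocked
        available = ranked[: max(batch_size, int(frac * len(ranked)))]
        yield random.sample(available, batch_size)

# Usage: feed each yielded batch to the contrastive learner in order, e.g.
#   for batch in curriculum_batches(pairs, num_steps=1000, batch_size=64):
#       loss = info_nce(encoder, batch); loss.backward(); optimizer.step()
```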
Related papers
- Training Better Deep Learning Models Using Human Saliency [11.295653130022156]
This work explores how human judgement about salient regions of an image can be introduced into deep convolutional neural network (DCNN) training.
We propose a new component of the loss function that ConveYs Brain Oversight to Raise Generalization (CYBORG) and penalizes the model for using non-salient regions.
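As a rough illustration of such a loss (not CYBORG's published formulation), the sketch below combines cross-entropy with a penalty for divergence between the model's class activation map and a human saliency map; the MSE penalty, the normalization, and the alpha weighting are assumptions made for this sketch.

```python
import torch
import torch.nn.functional as F

def cyborg_style_loss(logits, labels, model_cam, human_map, alpha=0.5):
    # Cross-entropy plus a penalty for mismatch between the model's class
    # activation map and a human saliency map. The MSE penalty and alpha
    # weighting are illustrative assumptions, not CYBORG's exact form.
    ce = F.cross_entropy(logits, labels)
    # Normalize both maps so the penalty compares shape, not magnitude.
    cam = model_cam / (model_cam.amax(dim=(-2, -1), keepdim=True) + 1e-8)
    sal = human_map / (human_map.amax(dim=(-2, -1), keepdim=True) + 1e-8)
    return (1 - alpha) * ce + alpha * F.mse_loss(cam, sal)
```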
arXiv Detail & Related papers (2024-10-21T16:52:44Z)
- Adversarial training with informed data selection [53.19381941131439]
Adversarial training is the most efficient solution to defend the network against such malicious attacks.
This work proposes a data selection strategy to be applied in mini-batch training.
The simulation results show that a good compromise can be obtained regarding robustness and standard accuracy.
arXiv Detail & Related papers (2023-01-07T12:09:50Z)
- Boosted Dynamic Neural Networks [53.559833501288146]
A typical EDNN has multiple prediction heads at different layers of the network backbone.
To optimize the model, these prediction heads together with the network backbone are trained on every batch of training data.
Treating training and testing inputs differently in the two phases causes a mismatch between the training and testing data distributions.
We formulate an EDNN as an additive model inspired by gradient boosting, and propose multiple training techniques to optimize the model effectively.
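A minimal sketch of the additive formulation, assuming each exit predicts the running sum of all head outputs so far; the paper's training techniques are omitted and this is not its code.

```python
def boosted_ednn_forward(blocks, heads, x):
    # Each exit predicts the running sum of all head outputs so far, so the
    # heads form an additive (gradient-boosting-style) ensemble rather than
    # independent classifiers.
    exits, logits_sum = [], 0
    for block, head in zip(blocks, heads):
        x = block(x)                       # shared backbone stage
        logits_sum = logits_sum + head(x)  # additive correction from this exit
        exits.append(logits_sum)
    return exits  # train each exit's summed logits against the labels
```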
arXiv Detail & Related papers (2022-11-30T04:23:12Z)
- Towards Robust Dataset Learning [90.2590325441068]
We propose a principled, tri-level optimization to formulate the robust dataset learning problem.
Under an abstraction model that characterizes robust vs. non-robust features, the proposed method provably learns a robust dataset.
arXiv Detail & Related papers (2022-11-19T17:06:10Z)
- Efficient Augmentation for Imbalanced Deep Learning [8.38844520504124]
We study a convolutional neural network's internal representation of imbalanced image data.
We measure the generalization gap between a model's feature embeddings in the training and test sets, showing that the gap is wider for minority classes.
This insight enables us to design an efficient three-phase CNN training framework for imbalanced data.
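A hedged sketch of how such a per-class embedding gap might be measured; the centroid-distance proxy below is an illustrative assumption, not necessarily the paper's metric.

```python
import torch

def per_class_embedding_gap(train_emb, train_y, test_emb, test_y):
    # Distance between train and test class centroids in feature space, as a
    # crude proxy for the per-class generalization gap the paper measures.
    gaps = {}
    for c in train_y.unique():
        mu_train = train_emb[train_y == c].mean(dim=0)
        mu_test = test_emb[test_y == c].mean(dim=0)
        gaps[int(c)] = torch.linalg.norm(mu_train - mu_test).item()
    return gaps  # expect larger gaps for minority classes
```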
arXiv Detail & Related papers (2022-07-13T09:43:17Z)
- Distributed Adversarial Training to Robustify Deep Neural Networks at Scale [100.19539096465101]
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification.
To defend against such attacks, an effective approach, known as adversarial training (AT), has been shown to improve model robustness.
We propose a large-batch adversarial training framework implemented over multiple machines.
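For context, the sketch below shows the standard PGD inner maximization that adversarial training repeats on each worker's shard in a distributed, large-batch setting before gradients are averaged; the hyperparameters and the single-worker view are illustrative assumptions, not the paper's framework.

```python
import torch
import torch.nn.functional as F

def pgd_example(model, x, y, eps=8/255, step=2/255, iters=7):
    # Inner maximization of AT: projected gradient-ascent steps inside an
    # L-inf ball around the clean inputs. Hyperparameters are illustrative.
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(iters):
        F.cross_entropy(model(x + delta), y).backward()
        with torch.no_grad():
            delta += step * delta.grad.sign()  # ascend the loss
            delta.clamp_(-eps, eps)            # project back into the ball
            delta.grad.zero_()
    return (x + delta).detach()  # adversarial inputs for the outer minimization
```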
arXiv Detail & Related papers (2022-06-13T15:39:43Z)
- Cascade Bagging for Accuracy Prediction with Few Training Samples [8.373420721376739]
We propose a novel framework to train an accuracy predictor with few training samples.
The framework consists of data augmentation methods and an ensemble learning algorithm.
arXiv Detail & Related papers (2021-08-12T09:10:52Z)
- Self-Adaptive Training: Bridging the Supervised and Self-Supervised Learning [16.765461276790944]
Self-adaptive training is a unified training algorithm that dynamically calibrates and enhances the training process using model predictions, without incurring extra computational cost.
We analyze the training dynamics of deep networks on training data corrupted by, e.g., random noise and adversarial examples.
Our analysis shows that model predictions are able to magnify useful underlying information in data, and this phenomenon occurs broadly even in the absence of any label information.
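A minimal sketch of the self-adaptive idea, assuming the per-example soft targets are an exponential moving average of the model's own predictions; the momentum value and the KL objective are illustrative choices, not necessarily the paper's exact ones.

```python
import torch
import torch.nn.functional as F

def self_adaptive_loss(soft_targets, logits, indices, momentum=0.9):
    # Update each example's soft target as an EMA of the model's own
    # predictions, so training gradually trusts the model over possibly
    # noisy labels, then train against the calibrated targets.
    with torch.no_grad():
        probs = F.softmax(logits, dim=1)
        soft_targets[indices] = (momentum * soft_targets[indices]
                                 + (1 - momentum) * probs)
    return F.kl_div(F.log_softmax(logits, dim=1),
                    soft_targets[indices], reduction="batchmean")
```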
arXiv Detail & Related papers (2021-01-21T17:17:30Z)
- Adversarial Self-Supervised Contrastive Learning [62.17538130778111]
Existing adversarial learning approaches mostly use class labels to generate adversarial samples that lead to incorrect predictions.
We propose a novel adversarial attack for unlabeled data, which makes the model confuse the instance-level identities of the perturbed data samples.
We present a self-supervised contrastive learning framework to adversarially train a robust neural network without labeled data.
arXiv Detail & Related papers (2020-06-13T08:24:33Z)
- Understanding the Effects of Data Parallelism and Sparsity on Neural Network Training [126.49572353148262]
We study two factors in neural network training: data parallelism and sparsity.
Despite their promising benefits, understanding of their effects on neural network training remains elusive.
arXiv Detail & Related papers (2020-03-25T10:49:22Z)