Dynamical Isometry based Rigorous Fair Neural Architecture Search
- URL: http://arxiv.org/abs/2307.02263v2
- Date: Thu, 6 Jul 2023 06:56:54 GMT
- Title: Dynamical Isometry based Rigorous Fair Neural Architecture Search
- Authors: Jianxiang Luo, Junyi Hu, Tianji Pang, Weihao Huang, Chuang Liu
- Abstract summary: We propose a novel neural architecture search algorithm based on dynamical isometry.
We prove that our module selection strategy is rigorously fair by estimating the generalization error of all modules with a well-conditioned Jacobian.
- Score: 2.7850218655824803
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recently, the weight-sharing technique has significantly sped up the
training and evaluation procedure of neural architecture search. However, most
existing weight-sharing strategies are based solely on experience or
observation, which leaves the search results lacking interpretability and
rationality. In addition, because fairness is neglected, current methods
are prone to misjudgments in module evaluation. To address these problems,
we propose a novel neural architecture search algorithm based on dynamical
isometry. We use the fixed-point analysis method from mean field theory to
analyze the dynamical behavior of random neural networks at the steady state,
and show how dynamical isometry guarantees the fairness of weight-sharing
based NAS. Meanwhile, we prove that our module selection strategy is
rigorously fair by estimating the generalization error of all modules with a
well-conditioned Jacobian. Extensive experiments show that, at the same model
size, the architecture searched by the proposed method achieves
state-of-the-art top-1 validation accuracy on ImageNet classification. In
addition, we demonstrate that our method achieves better and more stable
training performance without loss of generality.
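The property the abstract appeals to, dynamical isometry, means the singular values of the network's input-output Jacobian all concentrate near 1. The sketch below is not the paper's NAS algorithm; it is a minimal illustration of that property, contrasting the Jacobian spectrum of a deep linear network under orthogonal versus scaled-Gaussian initialization (for a linear network, the input-output Jacobian is simply the product of the weight matrices):

```python
import numpy as np

def random_orthogonal(n, rng):
    """Haar-random orthogonal matrix via QR of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    return q * np.sign(np.diag(r))  # sign fix for a uniform distribution

def jacobian_singular_values(weights):
    """For a deep *linear* network, the input-output Jacobian is the
    product of the layer weight matrices."""
    jac = np.eye(weights[0].shape[0])
    for w in weights:
        jac = w @ jac
    return np.linalg.svd(jac, compute_uv=False)

rng = np.random.default_rng(0)
n, depth = 64, 50

# Orthogonal init: every layer preserves norms exactly, so the
# Jacobian stays perfectly conditioned at any depth.
s_ortho = jacobian_singular_values(
    [random_orthogonal(n, rng) for _ in range(depth)])

# Scaled Gaussian init: the singular-value spread grows with depth,
# so the Jacobian becomes badly conditioned.
s_gauss = jacobian_singular_values(
    [rng.standard_normal((n, n)) / np.sqrt(n) for _ in range(depth)])

print("orthogonal condition number:", s_ortho.max() / s_ortho.min())
print("gaussian   condition number:", s_gauss.max() / s_gauss.min())
```

The orthogonal product keeps every singular value at 1 regardless of depth, while the Gaussian product's condition number blows up with depth; a badly conditioned Jacobian is the kind of imbalance the paper links to unfair module evaluation in weight-sharing NAS.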
Related papers
- A Bayesian Unification of Self-Supervised Clustering and Energy-Based
Models [11.007541337967027]
We perform a Bayesian analysis of state-of-the-art self-supervised learning objectives.
We show that our objective function allows us to outperform existing self-supervised learning strategies.
We also demonstrate that GEDI can be integrated into a neuro-symbolic framework.
arXiv Detail & Related papers (2023-12-30T04:46:16Z) - A Bayesian Approach to Robust Inverse Reinforcement Learning [54.24816623644148]
We consider a Bayesian approach to offline model-based inverse reinforcement learning (IRL)
The proposed framework differs from existing offline model-based IRL approaches by performing simultaneous estimation of the expert's reward function and subjective model of environment dynamics.
Our analysis reveals a novel insight that the estimated policy exhibits robust performance when the expert is believed to have a highly accurate model of the environment.
arXiv Detail & Related papers (2023-09-15T17:37:09Z) - A Comprehensive Study on Robustness of Image Classification Models:
Benchmarking and Rethinking [54.89987482509155]
The robustness of deep neural networks is usually lacking under adversarial examples, common corruptions, and distribution shifts.
We establish a comprehensive robustness benchmark called ARES-Bench on the image classification task.
By designing the training settings accordingly, we achieve the new state-of-the-art adversarial robustness.
arXiv Detail & Related papers (2023-02-28T04:26:20Z) - ConCerNet: A Contrastive Learning Based Framework for Automated
Conservation Law Discovery and Trustworthy Dynamical System Prediction [82.81767856234956]
This paper proposes a new learning framework named ConCerNet to improve the trustworthiness of DNN-based dynamics modeling.
We show that our method consistently outperforms the baseline neural networks in both coordinate error and conservation metrics.
arXiv Detail & Related papers (2023-02-11T21:07:30Z) - Guaranteed Conservation of Momentum for Learning Particle-based Fluid
Dynamics [96.9177297872723]
We present a novel method for guaranteeing linear momentum in learned physics simulations.
We enforce conservation of momentum with a hard constraint, which we realize via antisymmetrical continuous convolutional layers.
In combination, the proposed method allows us to increase the physical accuracy of the learned simulator substantially.
arXiv Detail & Related papers (2022-10-12T09:12:59Z) - Improving robustness of jet tagging algorithms with adversarial training [56.79800815519762]
We investigate the vulnerability of flavor tagging algorithms via application of adversarial attacks.
We present an adversarial training strategy that mitigates the impact of such simulated attacks.
arXiv Detail & Related papers (2022-03-25T19:57:19Z) - Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z) - Disentangling Neural Architectures and Weights: A Case Study in
Supervised Classification [8.976788958300766]
This work investigates the problem of disentangling the role of the neural structure and its edge weights.
We show that well-trained architectures may not need any link-specific fine-tuning of the weights.
We use a novel and computationally efficient method that translates the hard architecture-search problem into a feasible optimization problem.
arXiv Detail & Related papers (2020-09-11T11:22:22Z) - Physics-based polynomial neural networks for one-shot learning of
dynamical systems from one or a few samples [0.0]
The paper describes practical results on both a simple pendulum and one of the largest X-ray sources in the world.
It is demonstrated in practice that the proposed approach allows recovering complex physics from noisy, limited, and partial observations.
arXiv Detail & Related papers (2020-05-24T09:27:10Z) - Weighted Aggregating Stochastic Gradient Descent for Parallel Deep
Learning [8.366415386275557]
The solution involves a reformulation of the objective function for optimization in neural network models.
We introduce a decentralized weighted aggregating scheme based on the performance of local workers.
To validate the new method, we benchmark our schemes against several popular algorithms.
arXiv Detail & Related papers (2020-04-07T23:38:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.