Dynamical Isometry based Rigorous Fair Neural Architecture Search
- URL: http://arxiv.org/abs/2307.02263v2
- Date: Thu, 6 Jul 2023 06:56:54 GMT
- Title: Dynamical Isometry based Rigorous Fair Neural Architecture Search
- Authors: Jianxiang Luo, Junyi Hu, Tianji Pang, Weihao Huang, Chuang Liu
- Abstract summary: We propose a novel neural architecture search algorithm based on dynamical isometry.
We prove that our module selection strategy is rigorously fair by estimating the generalization error of all modules with a well-conditioned Jacobian.
- Score: 2.7850218655824803
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recently, the weight-sharing technique has significantly sped up the
training and evaluation procedure of neural architecture search. However, most
existing weight-sharing strategies are based solely on experience or
observation, which leaves the search results lacking interpretability and
rationality. In addition, because fairness is neglected, current methods
are prone to misjudgments in module evaluation. To address these problems,
we propose a novel neural architecture search algorithm based on dynamical
isometry. We use the fixed-point analysis method from mean field theory to
analyze the dynamical behavior of random neural networks at the steady state,
and show how dynamical isometry guarantees the fairness of weight-sharing
based NAS. Meanwhile, we prove that our module selection strategy is
rigorously fair by estimating the generalization error of all modules with a
well-conditioned Jacobian. Extensive experiments show that, at the same model
size, the architecture searched by the proposed method achieves
state-of-the-art top-1 validation accuracy on ImageNet classification. In
addition, we demonstrate that our method achieves better and more stable
training performance without loss of generality.
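The property the abstract appeals to, dynamical isometry, means the singular values of the network's input-output Jacobian all concentrate near 1. The sketch below is not the paper's NAS algorithm; it is a minimal illustration of that property, contrasting the Jacobian spectrum of a deep linear network under orthogonal versus scaled-Gaussian initialization (for a linear network, the input-output Jacobian is simply the product of the weight matrices):

```python
import numpy as np

def random_orthogonal(n, rng):
    """Haar-random orthogonal matrix via QR of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    return q * np.sign(np.diag(r))  # sign fix for a uniform distribution

def jacobian_singular_values(weights):
    """For a deep *linear* network, the input-output Jacobian is the
    product of the layer weight matrices."""
    jac = np.eye(weights[0].shape[0])
    for w in weights:
        jac = w @ jac
    return np.linalg.svd(jac, compute_uv=False)

rng = np.random.default_rng(0)
n, depth = 64, 50

# Orthogonal init: every layer preserves norms exactly, so the
# Jacobian stays perfectly conditioned at any depth.
s_ortho = jacobian_singular_values(
    [random_orthogonal(n, rng) for _ in range(depth)])

# Scaled Gaussian init: the singular-value spread grows with depth,
# so the Jacobian becomes badly conditioned.
s_gauss = jacobian_singular_values(
    [rng.standard_normal((n, n)) / np.sqrt(n) for _ in range(depth)])

print("orthogonal condition number:", s_ortho.max() / s_ortho.min())
print("gaussian   condition number:", s_gauss.max() / s_gauss.min())
```

The orthogonal product keeps every singular value at 1 regardless of depth, while the Gaussian product's condition number blows up with depth; a badly conditioned Jacobian is the kind of imbalance the paper links to unfair module evaluation in weight-sharing NAS.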
Related papers
- A Bayesian Unification of Self-Supervised Clustering and Energy-Based
Models [11.007541337967027]
We perform a Bayesian analysis of state-of-the-art self-supervised learning objectives.
We show that our objective function allows us to outperform existing self-supervised learning strategies.
We also demonstrate that GEDI can be integrated into a neuro-symbolic framework.
arXiv Detail & Related papers (2023-12-30T04:46:16Z) - A Bayesian Approach to Robust Inverse Reinforcement Learning [54.24816623644148]
We consider a Bayesian approach to offline model-based inverse reinforcement learning (IRL)
The proposed framework differs from existing offline model-based IRL approaches by performing simultaneous estimation of the expert's reward function and subjective model of environment dynamics.
Our analysis reveals a novel insight that the estimated policy exhibits robust performance when the expert is believed to have a highly accurate model of the environment.
arXiv Detail & Related papers (2023-09-15T17:37:09Z) - A Comprehensive Study on Robustness of Image Classification Models:
Benchmarking and Rethinking [54.89987482509155]
The robustness of deep neural networks is usually lacking under adversarial examples, common corruptions, and distribution shifts.
We establish a comprehensive robustness benchmark called ARES-Bench on the image classification task.
By designing the training settings accordingly, we achieve the new state-of-the-art adversarial robustness.
arXiv Detail & Related papers (2023-02-28T04:26:20Z) - ConCerNet: A Contrastive Learning Based Framework for Automated
Conservation Law Discovery and Trustworthy Dynamical System Prediction [82.81767856234956]
This paper proposes a new learning framework named ConCerNet to improve the trustworthiness of DNN-based dynamics modeling.
We show that our method consistently outperforms the baseline neural networks in both coordinate error and conservation metrics.
arXiv Detail & Related papers (2023-02-11T21:07:30Z) - Guaranteed Conservation of Momentum for Learning Particle-based Fluid
Dynamics [96.9177297872723]
We present a novel method for guaranteeing linear momentum in learned physics simulations.
We enforce conservation of momentum with a hard constraint, which we realize via antisymmetrical continuous convolutional layers.
In combination, the proposed method allows us to increase the physical accuracy of the learned simulator substantially.
arXiv Detail & Related papers (2022-10-12T09:12:59Z) - Improving robustness of jet tagging algorithms with adversarial training [56.79800815519762]
We investigate the vulnerability of flavor tagging algorithms via application of adversarial attacks.
We present an adversarial training strategy that mitigates the impact of such simulated attacks.
arXiv Detail & Related papers (2022-03-25T19:57:19Z) - Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z) - Disentangling Neural Architectures and Weights: A Case Study in
Supervised Classification [8.976788958300766]
This work investigates the problem of disentangling the role of the neural structure and its edge weights.
We show that well-trained architectures may not need any link-specific fine-tuning of the weights.
We use a novel and computationally efficient method that translates the hard architecture-search problem into a feasible optimization problem.
arXiv Detail & Related papers (2020-09-11T11:22:22Z) - Physics-based polynomial neural networks for one-shot learning of
dynamical systems from one or a few samples [0.0]
The paper describes practical results on both a simple pendulum and one of the largest X-ray sources in the world.
It is demonstrated in practice that the proposed approach allows recovering complex physics from noisy, limited, and partial observations.
arXiv Detail & Related papers (2020-05-24T09:27:10Z) - Weighted Aggregating Stochastic Gradient Descent for Parallel Deep
Learning [8.366415386275557]
The solution involves a reformulation of the objective function for optimization in neural network models.
We introduce a decentralized weighted aggregating scheme based on the performance of local workers.
To validate the new method, we benchmark our schemes against several popular algorithms.
arXiv Detail & Related papers (2020-04-07T23:38:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.