Related papers: Dataless Model Selection with the Deep Frame Potential

URL: http://arxiv.org/abs/2003.13866v1
Date: Mon, 30 Mar 2020 23:27:25 GMT
Title: Dataless Model Selection with the Deep Frame Potential
Authors: Calvin Murdock, Simon Lucey
Abstract summary: We quantify networks by their intrinsic capacity for unique and robust representations. We propose the deep frame potential: a measure of coherence that is approximately related to representation stability but has minimizers that depend only on network structure. We validate its use as a criterion for model selection and demonstrate correlation with generalization error on a variety of common residual and densely connected network architectures.
Score: 45.16941644841897
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Choosing a deep neural network architecture is a fundamental problem in applications that require balancing performance and parameter efficiency. Standard approaches rely on ad-hoc engineering or computationally expensive validation on a specific dataset. We instead attempt to quantify networks by their intrinsic capacity for unique and robust representations, enabling efficient architecture comparisons without requiring any data. Building upon theoretical connections between deep learning and sparse approximation, we propose the deep frame potential: a measure of coherence that is approximately related to representation stability but has minimizers that depend only on network structure. This provides a framework for jointly quantifying the contributions of architectural hyper-parameters such as depth, width, and skip connections. We validate its use as a criterion for model selection and demonstrate correlation with generalization error on a variety of common residual and densely connected network architectures.
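The abstract describes the deep frame potential as a measure of coherence but does not give a formula. As a rough illustration only, the sketch below computes the classical frame potential from frame theory: the sum of squared inner products between the unit-normalized columns of a matrix. The function names `frame_potential` and `welch_lower_bound` are hypothetical, and applying this to a single dense matrix is an assumption made for the example; the paper's deep variant, which depends on network structure, is not reproduced here.

```python
import numpy as np

def frame_potential(D):
    """Classical frame potential of the columns of D.

    Columns are scaled to unit norm, then the squared entries of their
    Gram matrix are summed. Lower values mean less mutual coherence.
    """
    D = np.asarray(D, dtype=float)
    norms = np.linalg.norm(D, axis=0)
    if np.any(norms == 0):
        raise ValueError("every column must be nonzero")
    U = D / norms          # unit-norm columns
    G = U.T @ U            # Gram matrix of pairwise inner products
    return float(np.sum(G ** 2))

def welch_lower_bound(d, k):
    """Smallest possible frame potential for k unit vectors in R^d."""
    return max(k, k * k / d)

# Three equally spaced unit vectors in the plane (a tight frame).
angles = np.deg2rad([90.0, 210.0, 330.0])
tight = np.vstack([np.cos(angles), np.sin(angles)])

print(frame_potential(np.eye(3)))   # orthonormal basis
print(frame_potential(tight))       # overcomplete tight frame
print(welch_lower_bound(2, 3))      # bound for 3 vectors in R^2
```

In this sketch an orthonormal basis and a tight frame both attain the lower bound for their size, while redundant or nearly parallel columns push the value higher.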

Related papers

Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing shifts data analysis to the edge. Existing methods struggle to balance high model performance with low resource consumption. We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
Principled Architecture-aware Scaling of Hyperparameters [69.98414153320894]
Training a high-quality deep neural network requires choosing suitable hyperparameters, which is a non-trivial and expensive process. In this work, we precisely characterize the dependence of initializations and maximal learning rates on the network architecture. We demonstrate that network rankings in benchmarks can easily change when the networks are trained better.
arXiv Detail & Related papers (2024-02-27T11:52:49Z)
Partially Stochastic Infinitely Deep Bayesian Neural Networks [0.0]
We present a novel family of architectures that integrates partial stochasticity into the framework of infinitely deep neural networks. We leverage the advantages of partial stochasticity in the infinite-depth limit, which include the benefits of full stochasticity. We present a variety of architectural configurations, offering flexibility in network design.
arXiv Detail & Related papers (2024-02-05T20:15:19Z)
Hysteretic Behavior Simulation Based on Pyramid Neural Network: Principle, Network Architecture, Case Study and Explanation [0.0]
A surrogate model based on neural networks shows significant potential in balancing efficiency and accuracy. However, its serial information flow and prediction based on single-level features adversely affect the network performance. A weighted stacked pyramid neural network architecture is proposed herein.
arXiv Detail & Related papers (2022-04-29T16:42:00Z)
Reframing Neural Networks: Deep Structure in Overcomplete Representations [41.84502123663809]
We introduce deep frame approximation, a unifying framework for representation learning with structured overcomplete frames. We quantify structural differences with the deep frame potential, a data-independent measure of coherence linked to representation uniqueness and stability. This connection to the established theory of overcomplete representations suggests promising new directions for principled deep network architecture design.
arXiv Detail & Related papers (2021-03-10T01:15:14Z)
Disentangling Neural Architectures and Weights: A Case Study in Supervised Classification [8.976788958300766]
This work investigates the problem of disentangling the role of the neural structure and its edge weights. We show that well-trained architectures may not need any link-specific fine-tuning of the weights. We use a novel and computationally efficient method that translates the hard architecture-search problem into a feasible optimization problem.
arXiv Detail & Related papers (2020-09-11T11:22:22Z)
Adversarially Robust Neural Architectures [43.74185132684662]
This paper aims to improve the adversarial robustness of the network from an architectural perspective using a neural architecture search (NAS) framework. We explore the relationship among adversarial robustness, Lipschitz constant, and architecture parameters. Our algorithm empirically achieves the best performance among all the models under various attacks on different datasets.
arXiv Detail & Related papers (2020-09-02T08:52:15Z)
Automated Search for Resource-Efficient Branched Multi-Task Networks [81.48051635183916]
We propose a principled approach, rooted in differentiable neural architecture search, to automatically define branching structures in a multi-task neural network. We show that our approach consistently finds high-performing branching structures within limited resource budgets.
arXiv Detail & Related papers (2020-08-24T09:49:19Z)
Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective that represents a network as a complete graph for analysis. By assigning learnable parameters to the edges, which reflect the magnitude of connections, the learning process can be performed in a differentiable manner. This learning process is compatible with existing networks and adapts to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
The Heterogeneity Hypothesis: Finding Layer-Wise Differentiated Network Architectures [179.66117325866585]
We investigate a design space that is usually overlooked, i.e. adjusting the channel configurations of predefined networks. We find that this adjustment can be achieved by shrinking widened baseline networks and leads to superior performance. Experiments are conducted on various networks and datasets for image classification, visual tracking and image restoration.
arXiv Detail & Related papers (2020-06-29T17:59:26Z)

This list is automatically generated from the titles and abstracts of the papers on this site.