Adaptive Neural Networks Using Residual Fitting
- URL: http://arxiv.org/abs/2301.05744v1
- Date: Fri, 13 Jan 2023 19:52:30 GMT
- Title: Adaptive Neural Networks Using Residual Fitting
- Authors: Noah Ford, John Winder, Josh McClellan
- Abstract summary: We present a network-growth method that searches for explainable error in the network's residuals and grows the network if sufficient error is detected.
Within these tasks, the growing network can often achieve better performance than small networks that do not grow.
- Score: 2.546014024559691
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current approaches to estimating the required neural-network size for a given
problem class have focused on computationally intensive methods such as
neural-architecture search and pruning. In contrast, methods that add
capacity to neural networks as needed may provide similar results to
architecture search and pruning, but do not require as much computation to find
an appropriate network size. Here, we present a network-growth method that
searches for explainable error in the network's residuals and grows the network
if sufficient error is detected. We demonstrate this method using examples from
classification, imitation learning, and reinforcement learning. Within these
tasks, the growing network can often achieve better performance than small
networks that do not grow, and similar performance to networks that begin much
larger.
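The abstract gives only a high-level picture of the growth rule, so the following is a minimal sketch of the general idea under stated assumptions, not the authors' algorithm: train a small regressor, fit a separate probe network to its residuals, and widen the main network only when the probe explains a meaningful share of the remaining error. The model sizes, threshold, and widening step are illustrative.

```python
# Hedged sketch of residual-triggered growth (not the paper's exact algorithm).
# Assumed pieces: a 1-hidden-layer MLP, a small "probe" fit to residuals, and a
# growth rule that adds hidden units when the probe explains enough residual error.
import torch
import torch.nn as nn


def fit(model, x, y, steps=500, lr=1e-2):
    """Simple full-batch MSE training loop; returns the final loss."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(x), y)
        loss.backward()
        opt.step()
    return loss.item()


def widen(mlp, extra):
    """Return a fresh 1-hidden-layer MLP with `extra` more hidden units.
    (A real growth method would keep the trained weights; this sketch
    simply re-trains from scratch for brevity.)"""
    hidden = mlp[0].out_features + extra
    return nn.Sequential(nn.Linear(mlp[0].in_features, hidden), nn.Tanh(),
                         nn.Linear(hidden, mlp[2].out_features))


# Toy regression data (assumption: any supervised task would do).
torch.manual_seed(0)
x = torch.linspace(-3, 3, 256).unsqueeze(1)
y = torch.sin(3 * x) + 0.3 * x ** 2

model = nn.Sequential(nn.Linear(1, 4), nn.Tanh(), nn.Linear(4, 1))
for round_ in range(5):
    fit(model, x, y)
    residual = y - model(x).detach()

    # Fit a small probe network to the residuals; if it explains a meaningful
    # fraction of the residual variance, the error is "explainable" and we grow.
    probe = nn.Sequential(nn.Linear(1, 8), nn.Tanh(), nn.Linear(8, 1))
    probe_mse = fit(probe, x, residual)
    explained = 1.0 - probe_mse / residual.var().item()
    print(f"round {round_}: hidden={model[0].out_features}, explained={explained:.2f}")

    if explained > 0.2:          # illustrative threshold, not from the paper
        model = widen(model, 4)  # illustrative growth step
    else:
        break
```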
Related papers
- Efficient and Flexible Method for Reducing Moderate-size Deep Neural Networks with Condensation [36.41451383422967]
In scientific applications, neural networks are generally of moderate size, mainly to ensure fast inference.
Existing work has found that the powerful capabilities of neural networks are primarily due to their non-linearity.
We propose a condensation reduction algorithm to verify the feasibility of this idea in practical problems.
arXiv Detail & Related papers (2024-05-02T06:53:40Z)
- Solving Large-scale Spatial Problems with Convolutional Neural Networks [88.31876586547848]
We employ transfer learning to improve training efficiency for large-scale spatial problems.
We propose that a convolutional neural network (CNN) can be trained on small windows of signals, but evaluated on arbitrarily large signals with little to no performance degradation.
arXiv Detail & Related papers (2023-06-14T01:24:42Z)
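The train-small, evaluate-large claim above is the standard fully convolutional property: a network with no fixed-size dense head accepts inputs of any length. The sketch below is an illustrative check of that property, not the paper's transfer-learning setup; the window and signal sizes are arbitrary assumptions.

```python
# Hedged sketch: a fully convolutional 1-D network can be trained on short
# windows and then evaluated on much longer signals without modification.
import torch
import torch.nn as nn

# Purely convolutional model: no flatten/linear layer, so input length is free.
model = nn.Sequential(
    nn.Conv1d(1, 16, kernel_size=5, padding=2), nn.ReLU(),
    nn.Conv1d(16, 16, kernel_size=5, padding=2), nn.ReLU(),
    nn.Conv1d(16, 1, kernel_size=1),
)

# "Training" shape: batches of short windows (e.g., 64 samples).
short = torch.randn(8, 1, 64)
print(model(short).shape)        # torch.Size([8, 1, 64])

# Evaluation shape: one arbitrarily long signal (e.g., 100_000 samples).
long_signal = torch.randn(1, 1, 100_000)
with torch.no_grad():
    print(model(long_signal).shape)  # torch.Size([1, 1, 100000])
```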
- Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks [49.808194368781095]
We show that three-layer neural networks have provably richer feature learning capabilities than two-layer networks.
This work makes progress towards understanding the provable benefit of three-layer neural networks over two-layer networks in the feature learning regime.
arXiv Detail & Related papers (2023-05-11T17:19:30Z)
- The Multiple Subnetwork Hypothesis: Enabling Multidomain Learning by Isolating Task-Specific Subnetworks in Feedforward Neural Networks [0.0]
We identify a methodology and network representational structure which allows a pruned network to employ previously unused weights to learn subsequent tasks.
We show that networks trained using our approaches are able to learn multiple tasks, which may be related or unrelated, in parallel or in sequence without sacrificing performance on any task or exhibiting catastrophic forgetting.
arXiv Detail & Related papers (2022-07-18T15:07:13Z)
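The entry above does not spell out the mechanism, so here is a generic sketch of one way to isolate task-specific subnetworks: disjoint binary masks over a shared weight matrix, so each task trains only weights the other tasks never touch. The masking scheme, sizes, and random partition are assumptions, not necessarily the paper's procedure.

```python
# Hedged sketch: disjoint binary masks carve task-specific subnetworks out of
# one shared weight matrix, so training task 1 cannot disturb task 0's weights.
import torch
import torch.nn as nn


class MaskedLinear(nn.Module):
    def __init__(self, in_features, out_features, n_tasks):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.1)
        # Randomly partition the weight entries into n_tasks disjoint masks.
        assignment = torch.randint(0, n_tasks, (out_features, in_features))
        self.register_buffer("masks", torch.stack(
            [(assignment == t).float() for t in range(n_tasks)]))

    def forward(self, x, task):
        return x @ (self.weight * self.masks[task]).T


layer = MaskedLinear(16, 4, n_tasks=2)
x = torch.randn(3, 16)

# Gradients for task 0 only flow through task 0's mask; task 1's weights
# receive exactly zero gradient, which is what prevents interference here.
layer(x, task=0).sum().backward()
print((layer.weight.grad * layer.masks[1]).abs().sum())  # tensor(0.)
```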
- Dimensionality Reduction in Deep Learning via Kronecker Multi-layer Architectures [4.836352379142503]
We propose a new deep learning architecture based on fast matrix multiplication of a Kronecker product decomposition.
We show that this architecture allows a neural network to be trained and implemented with a significant reduction in computational time and resources.
arXiv Detail & Related papers (2022-04-08T19:54:52Z)
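The entry describes replacing large weight matrices with Kronecker products; what makes such layers cheap is the identity (A ⊗ B) vec(X) = vec(A X Bᵀ) under row-major flattening. The sketch below verifies that identity numerically; the factor sizes are illustrative assumptions, not the paper's architecture.

```python
# Hedged sketch: apply a Kronecker-factored linear map without ever forming
# the full (p*q) x (m*n) weight matrix.
import torch

torch.manual_seed(0)
p, m, q, n = 8, 6, 5, 7          # illustrative factor sizes
A = torch.randn(p, m)            # small factor 1
B = torch.randn(q, n)            # small factor 2
x = torch.randn(m * n)           # input vector of size m*n

# Naive path: materialize the big Kronecker weight (expensive in general).
y_naive = torch.kron(A, B) @ x

# Factored path: reshape, two small matmuls, flatten.
# Cost is O(mn(p + q)) instead of O(pqmn).
y_fast = (A @ x.reshape(m, n) @ B.T).flatten()

print(torch.allclose(y_naive, y_fast, atol=1e-5))  # True
```

In a learnable layer, A and B would simply be nn.Parameter tensors trained in place of the full weight matrix.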
- Firefly Neural Architecture Descent: a General Approach for Growing Neural Networks [50.684661759340145]
Firefly neural architecture descent is a general framework for progressively and dynamically growing neural networks.
We show that firefly descent can flexibly grow networks both wider and deeper, and can be applied to learn accurate but resource-efficient neural architectures.
In particular, it learns networks that are smaller in size but have higher average accuracy than those learned by the state-of-the-art methods.
arXiv Detail & Related papers (2021-02-17T04:47:18Z)
- Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks [78.47459801017959]
Sparsity can reduce the memory footprint of regular networks so that they fit on mobile devices.
We describe approaches to remove and add elements of neural networks, different training strategies to achieve model sparsity, and mechanisms to exploit sparsity in practice.
arXiv Detail & Related papers (2021-01-31T22:48:50Z)
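As one concrete instance of the "remove elements" half of that survey, here is a hedged sketch of one-shot global magnitude pruning with binary masks; the 90% sparsity level and layer sizes are arbitrary assumptions.

```python
# Hedged sketch: global magnitude pruning, the simplest removal strategy
# covered by the survey above. Weights below a global threshold are zeroed.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
sparsity = 0.9  # keep only the largest 10% of weights (illustrative choice)

# Find the global magnitude threshold across all weight matrices.
all_weights = torch.cat([m.weight.detach().abs().flatten()
                         for m in model if isinstance(m, nn.Linear)])
threshold = torch.quantile(all_weights, sparsity)

# Apply binary masks; a real pipeline would also re-train while keeping the
# masks fixed so pruned weights stay at zero.
with torch.no_grad():
    for m in model:
        if isinstance(m, nn.Linear):
            m.weight.mul_((m.weight.abs() > threshold).float())

remaining = sum((m.weight != 0).sum().item() for m in model if isinstance(m, nn.Linear))
total = sum(m.weight.numel() for m in model if isinstance(m, nn.Linear))
print(f"nonzero weights: {remaining}/{total}")
```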
- The Connection Between Approximation, Depth Separation and Learnability in Neural Networks [70.55686685872008]
We study the connection between learnability and approximation capacity.
We show that learnability with deep networks of a target function depends on the ability of simpler classes to approximate the target.
arXiv Detail & Related papers (2021-01-31T11:32:30Z)
- It's Hard for Neural Networks To Learn the Game of Life [4.061135251278187]
Recent findings suggest that neural networks rely on "lottery tickets": lucky random initial weights that converge quickly to a solution.
We examine small convolutional networks that are trained to predict n steps of the two-dimensional cellular automaton Conway's Game of Life.
We find that networks of this architecture trained on this task rarely converge.
arXiv Detail & Related papers (2020-09-03T00:47:08Z)
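The experiment above is easy to reproduce in spirit: one Game-of-Life step is itself a local rule, so training pairs can be generated exactly with a neighbor-counting convolution, and the model asked to learn it can be a very small CNN. The sketch below only sets up the data and a plausible tiny model; the sizes and omitted training loop are assumptions, not the paper's exact protocol.

```python
# Hedged sketch: generate Conway's Game of Life training pairs with a fixed
# neighbor-count convolution, and define a small CNN meant to learn one step.
import torch
import torch.nn as nn
import torch.nn.functional as F


def life_step(board):
    """Exact one-step Game of Life update for a (B, 1, H, W) 0/1 tensor.
    Zero padding treats cells beyond the border as dead (an assumption)."""
    kernel = torch.ones(1, 1, 3, 3)
    kernel[0, 0, 1, 1] = 0.0                       # count the 8 neighbors only
    neighbors = F.conv2d(board, kernel, padding=1)
    survive = (board == 1) & ((neighbors == 2) | (neighbors == 3))
    birth = (board == 0) & (neighbors == 3)
    return (survive | birth).float()


# Random boards as inputs, exact next states as targets.
torch.manual_seed(0)
boards = (torch.rand(512, 1, 32, 32) < 0.4).float()
targets = life_step(boards)

# Tiny CNN in the spirit of the paper's small models (architecture is a guess).
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 1, kernel_size=1), nn.Sigmoid(),
)
pred = model(boards)
print(F.binary_cross_entropy(pred, targets).item())  # loss before any training
```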
- Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective that represents a network as a complete graph for analysis.
By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and adapts readily to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
- Differentiable Sparsification for Deep Neural Networks [0.0]
We propose a fully differentiable sparsification method for deep neural networks.
The proposed method can learn both the sparsified structure and weights of a network in an end-to-end manner.
To the best of our knowledge, this is the first fully differentiable sparsification method.
arXiv Detail & Related papers (2019-10-08T03:57:04Z)
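The exact formulation is not given in the snippet above, so the following is a generic stand-in rather than the paper's method: per-unit gates trained jointly with the weights under an L1 penalty, so both structure (which units survive) and weights are learned end-to-end by plain gradient descent.

```python
# Hedged sketch of gate-based differentiable sparsification (a generic stand-in,
# not necessarily the paper's formulation): learnable per-unit gates scale hidden
# activations, and an L1 penalty on the gates drives some of them toward zero.
import torch
import torch.nn as nn


class GatedMLP(nn.Module):
    def __init__(self, d_in=20, hidden=64, d_out=1):
        super().__init__()
        self.fc1 = nn.Linear(d_in, hidden)
        self.gates = nn.Parameter(torch.ones(hidden))   # learnable structure
        self.fc2 = nn.Linear(hidden, d_out)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)) * self.gates)


torch.manual_seed(0)
x = torch.randn(256, 20)
y = x[:, :2].sum(dim=1, keepdim=True)                   # toy target

model = GatedMLP()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for step in range(2000):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), y) + 1e-2 * model.gates.abs().sum()
    loss.backward()
    opt.step()

# Units whose gates collapse to (near) zero can be removed after training.
print((model.gates.abs() < 1e-2).sum().item(), "of 64 units prunable")
```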
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and accepts no responsibility for any consequences of its use.