Neural Subnetwork Ensembles
- URL: http://arxiv.org/abs/2311.14101v2
- Date: Fri, 5 Jul 2024 18:29:34 GMT
- Title: Neural Subnetwork Ensembles
- Authors: Tim Whitaker,
- Abstract summary: This dissertation introduces and formalizes a low-cost framework for constructing Subnetwork Ensembles.
Child networks are formed by sampling, perturbing, and optimizingworks from a trained parent model.
Our findings reveal that this approach can greatly improve training efficiency, parametric utilization, and generalization performance.
- Score: 2.44755919161855
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural network ensembles have been effectively used to improve generalization by combining the predictions of multiple independently trained models. However, the growing scale and complexity of deep neural networks have led to these methods becoming prohibitively expensive and time consuming to implement. Low-cost ensemble methods have become increasingly important as they can alleviate the need to train multiple models from scratch while retaining the generalization benefits that traditional ensemble learning methods afford. This dissertation introduces and formalizes a low-cost framework for constructing Subnetwork Ensembles, where a collection of child networks are formed by sampling, perturbing, and optimizing subnetworks from a trained parent model. We explore several distinct methodologies for generating child networks and we evaluate their efficacy through a variety of ablation studies and established benchmarks. Our findings reveal that this approach can greatly improve training efficiency, parametric utilization, and generalization performance while minimizing computational cost. Subnetwork Ensembles offer a compelling framework for exploring how we can build better systems by leveraging the unrealized potential of deep neural networks.
Related papers
- Generalization and Estimation Error Bounds for Model-based Neural
Networks [78.88759757988761]
We show that the generalization abilities of model-based networks for sparse recovery outperform those of regular ReLU networks.
We derive practical design rules that allow to construct model-based networks with guaranteed high generalization.
arXiv Detail & Related papers (2023-04-19T16:39:44Z) - Sparse Interaction Additive Networks via Feature Interaction Detection
and Sparse Selection [10.191597755296163]
We develop a tractable selection algorithm to efficiently identify the necessary feature combinations.
Our proposed Sparse Interaction Additive Networks (SIAN) construct a bridge from simple and interpretable models to fully connected neural networks.
arXiv Detail & Related papers (2022-09-19T19:57:17Z) - The Multiple Subnetwork Hypothesis: Enabling Multidomain Learning by
Isolating Task-Specific Subnetworks in Feedforward Neural Networks [0.0]
We identify a methodology and network representational structure which allows a pruned network to employ previously unused weights to learn subsequent tasks.
We show that networks trained using our approaches are able to learn multiple tasks, which may be related or unrelated, in parallel or in sequence without sacrificing performance on any task or exhibiting catastrophic forgetting.
arXiv Detail & Related papers (2022-07-18T15:07:13Z) - Prune and Tune Ensembles: Low-Cost Ensemble Learning With Sparse
Independent Subnetworks [0.0]
We introduce a fast, low-cost method for creating diverse ensembles of neural networks without needing to train multiple models from scratch.
We create child networks by cloning the parent and dramatically pruning the parameters of each child to create an ensemble of members with unique and diverse topologies.
This diversity enables "Prune and Tune" ensembles to achieve results that are competitive with traditional ensembles at a fraction of the training cost.
arXiv Detail & Related papers (2022-02-23T20:53:54Z) - Being Friends Instead of Adversaries: Deep Networks Learn from Data
Simplified by Other Networks [23.886422706697882]
A different idea has been recently proposed, named Friendly Training, which consists in altering the input data by adding an automatically estimated perturbation.
We revisit and extend this idea inspired by the effectiveness of neural generators in the context of Adversarial Machine Learning.
We propose an auxiliary multi-layer network that is responsible of altering the input data to make them easier to be handled by the classifier.
arXiv Detail & Related papers (2021-12-18T16:59:35Z) - LaplaceNet: A Hybrid Energy-Neural Model for Deep Semi-Supervised
Classification [0.0]
Recent developments in deep semi-supervised classification have reached unprecedented performance.
We propose a new framework, LaplaceNet, for deep semi-supervised classification that has a greatly reduced model complexity.
Our model outperforms state-of-the-art methods for deep semi-supervised classification, over several benchmark datasets.
arXiv Detail & Related papers (2021-06-08T17:09:28Z) - Anomaly Detection on Attributed Networks via Contrastive Self-Supervised
Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks.
Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair.
A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
arXiv Detail & Related papers (2021-02-27T03:17:20Z) - Firefly Neural Architecture Descent: a General Approach for Growing
Neural Networks [50.684661759340145]
Firefly neural architecture descent is a general framework for progressively and dynamically growing neural networks.
We show that firefly descent can flexibly grow networks both wider and deeper, and can be applied to learn accurate but resource-efficient neural architectures.
In particular, it learns networks that are smaller in size but have higher average accuracy than those learned by the state-of-the-art methods.
arXiv Detail & Related papers (2021-02-17T04:47:18Z) - Local Critic Training for Model-Parallel Learning of Deep Neural
Networks [94.69202357137452]
We propose a novel model-parallel learning method, called local critic training.
We show that the proposed approach successfully decouples the update process of the layer groups for both convolutional neural networks (CNNs) and recurrent neural networks (RNNs)
We also show that trained networks by the proposed method can be used for structural optimization.
arXiv Detail & Related papers (2021-02-03T09:30:45Z) - Fitting the Search Space of Weight-sharing NAS with Graph Convolutional
Networks [100.14670789581811]
We train a graph convolutional network to fit the performance of sampled sub-networks.
With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates.
arXiv Detail & Related papers (2020-04-17T19:12:39Z) - Subset Sampling For Progressive Neural Network Learning [106.12874293597754]
Progressive Neural Network Learning is a class of algorithms that incrementally construct the network's topology and optimize its parameters based on the training data.
We propose to speed up this process by exploiting subsets of training data at each incremental training step.
Experimental results in object, scene and face recognition problems demonstrate that the proposed approach speeds up the optimization procedure considerably.
arXiv Detail & Related papers (2020-02-17T18:57:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.