PFGE: Parsimonious Fast Geometric Ensembling of DNNs
- URL: http://arxiv.org/abs/2202.06658v8
- Date: Tue, 23 May 2023 08:53:32 GMT
- Title: PFGE: Parsimonious Fast Geometric Ensembling of DNNs
- Authors: Hao Guo, Jiyong Jin, Bin Liu
- Abstract summary: In this paper, we propose a new method called parsimonious FGE (PFGE), which employs a lightweight ensemble of higher-performing deep neural networks.
Our results show that PFGE achieves 5x memory efficiency compared to previous methods, without compromising generalization performance.
- Score: 6.973476713852153
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Ensemble methods are commonly used to enhance the generalization performance
of machine learning models. However, they present a challenge in deep learning
systems due to the high computational overhead required to train an ensemble of
deep neural networks (DNNs). Recent advancements such as fast geometric
ensembling (FGE) and snapshot ensembles have addressed this issue by training
model ensembles in the same time as a single model. Nonetheless, these
techniques still require additional memory for test-time inference compared to
single-model-based methods. In this paper, we propose a new method called
parsimonious FGE (PFGE), which employs a lightweight ensemble of
higher-performing DNNs generated through successive stochastic weight averaging
procedures. Our experimental results on CIFAR-{10,100} and ImageNet datasets
across various modern DNN architectures demonstrate that PFGE achieves 5x
memory efficiency compared to previous methods, without compromising on
generalization performance. For those interested, our code is available at
https://github.com/ZJLAB-AMMI/PFGE.
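The abstract describes generating a small ensemble by running successive stochastic weight averaging (SWA) procedures along a single training trajectory and averaging the members' predictions at test time. The sketch below illustrates that general idea in PyTorch; it is not the authors' implementation (see the GitHub link above), and names such as run_swa_round, build_pfge_style_ensemble, num_members, and epochs_per_round are illustrative assumptions. Batch-norm statistic refitting and the cyclical learning-rate schedule used by FGE/SWA are omitted for brevity.
```python
import copy
import torch


def run_swa_round(model, loader, optimizer, loss_fn, epochs, device="cpu"):
    """Train for a few epochs and return the running weight average (one SWA model)."""
    swa_model = copy.deepcopy(model)
    n_averaged = 0
    for _ in range(epochs):
        model.train()
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()
            optimizer.step()
        # Fold the current weights into the running average once per epoch
        # (a common SWA schedule; the schedule in the paper may differ).
        with torch.no_grad():
            for p_avg, p in zip(swa_model.parameters(), model.parameters()):
                p_avg.mul_(n_averaged).add_(p).div_(n_averaged + 1)
        n_averaged += 1
    return swa_model


def build_pfge_style_ensemble(model, loader, optimizer, loss_fn,
                              num_members=3, epochs_per_round=5):
    """Run several successive SWA rounds on one trajectory; keep one averaged model per round."""
    return [run_swa_round(model, loader, optimizer, loss_fn, epochs_per_round)
            for _ in range(num_members)]


@torch.no_grad()
def ensemble_predict(members, x):
    """Average softmax outputs of the stored SWA models at test time."""
    for m in members:
        m.eval()
    probs = torch.stack([torch.softmax(m(x), dim=-1) for m in members])
    return probs.mean(dim=0)
```
Because only a handful of averaged models (num_members) are kept rather than every snapshot along the trajectory, test-time memory stays close to that of a single model, which is the parsimony the abstract refers to.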
Related papers
- Fast Ensembling with Diffusion Schrödinger Bridge [17.334437293164566]
The Deep Ensemble (DE) approach is a straightforward technique used to enhance the performance of deep neural networks by training them from different initial points so that they converge towards various local optima.
We propose a novel approach called Diffusion Bridge Network (DBN) to address the computational cost of such ensembles at inference time.
By substituting the heavy ensembles with this lightweight neural network DBN, we achieve inference with reduced computational cost while maintaining accuracy and uncertainty scores on benchmark datasets such as CIFAR-10, CIFAR-100, and TinyImageNet.
arXiv Detail & Related papers (2024-04-24T11:35:02Z) - Ensemble Quadratic Assignment Network for Graph Matching [52.20001802006391]
Graph matching is a commonly used technique in computer vision and pattern recognition.
Recent data-driven approaches have improved the graph matching accuracy remarkably.
We propose a graph neural network (GNN) based approach to combine the advantages of data-driven and traditional methods.
arXiv Detail & Related papers (2024-03-11T06:34:05Z) - Heterogenous Memory Augmented Neural Networks [84.29338268789684]
We introduce a novel heterogeneous memory augmentation approach for neural networks.
By introducing learnable memory tokens with an attention mechanism, we can effectively boost performance without huge computational overhead.
We demonstrate our approach on various image and graph-based tasks under both in-distribution (ID) and out-of-distribution (OOD) conditions.
arXiv Detail & Related papers (2023-10-17T01:05:28Z) - Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z) - A Comprehensive Study on Large-Scale Graph Training: Benchmarking and
Rethinking [124.21408098724551]
Large-scale graph training is a notoriously challenging problem for graph neural networks (GNNs).
We present a new ensembling training scheme, named EnGCN, to address the existing issues.
Our proposed method has achieved new state-of-the-art (SOTA) performance on large-scale datasets.
arXiv Detail & Related papers (2022-10-14T03:43:05Z) - MS-RNN: A Flexible Multi-Scale Framework for Spatiotemporal Predictive
Learning [7.311071760653835]
We propose a general framework named Multi-Scale RNN (MS-RNN) to boost recent RNN models for predictive learning.
We verify the MS-RNN framework by thorough theoretical analyses and exhaustive experiments.
Results show that RNN models incorporating our framework achieve much lower memory cost yet better performance than before.
arXiv Detail & Related papers (2022-06-07T04:57:58Z) - Rank-R FNN: A Tensor-Based Learning Model for High-Order Data
Classification [69.26747803963907]
Rank-R Feedforward Neural Network (FNN) is a tensor-based nonlinear learning model that imposes Canonical/Polyadic decomposition on its parameters.
It handles inputs as multilinear arrays, bypassing the need for vectorization, and can thus fully exploit the structural information along every data dimension.
We establish the universal approximation and learnability properties of Rank-R FNN, and we validate its performance on real-world hyperspectral datasets.
arXiv Detail & Related papers (2021-04-11T16:37:32Z) - Collegial Ensembles [11.64359837358763]
We show that collegial ensembles can be efficiently implemented in practical architectures using group convolutions and block diagonal layers.
We also show how our framework can be used to analytically derive optimal group convolution modules without having to train a single model.
arXiv Detail & Related papers (2020-06-13T16:40:26Z) - Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)