PFGE: Parsimonious Fast Geometric Ensembling of DNNs
- URL: http://arxiv.org/abs/2202.06658v8
- Date: Tue, 23 May 2023 08:53:32 GMT
- Title: PFGE: Parsimonious Fast Geometric Ensembling of DNNs
- Authors: Hao Guo, Jiyong Jin, Bin Liu
- Abstract summary: In this paper, we propose a new method called parsimonious FGE (PFGE), which employs a lightweight ensemble of higher-performing deep neural networks.
Our results show that PFGE achieves 5x memory efficiency compared to previous methods, without compromising generalization performance.
- Score: 6.973476713852153
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Ensemble methods are commonly used to enhance the generalization performance
of machine learning models. However, they present a challenge in deep learning
systems due to the high computational overhead required to train an ensemble of
deep neural networks (DNNs). Recent advancements such as fast geometric
ensembling (FGE) and snapshot ensembles have addressed this issue by training
model ensembles in the same time as a single model. Nonetheless, these
techniques still require additional memory for test-time inference compared to
single-model-based methods. In this paper, we propose a new method called
parsimonious FGE (PFGE), which employs a lightweight ensemble of
higher-performing DNNs generated through successive stochastic weight averaging
procedures. Our experimental results on CIFAR-{10,100} and ImageNet datasets
across various modern DNN architectures demonstrate that PFGE achieves 5x
memory efficiency compared to previous methods, without compromising on
generalization performance. For those interested, our code is available at
https://github.com/ZJLAB-AMMI/PFGE.
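The abstract describes generating a small ensemble by running successive stochastic weight averaging (SWA) procedures along a single training trajectory and averaging the members' predictions at test time. The sketch below illustrates that general idea in PyTorch; it is not the authors' implementation (see the GitHub link above), and names such as run_swa_round, build_pfge_style_ensemble, num_members, and epochs_per_round are illustrative assumptions. Batch-norm statistic refitting and the cyclical learning-rate schedule used by FGE/SWA are omitted for brevity.
```python
import copy
import torch


def run_swa_round(model, loader, optimizer, loss_fn, epochs, device="cpu"):
    """Train for a few epochs and return the running weight average (one SWA model)."""
    swa_model = copy.deepcopy(model)
    n_averaged = 0
    for _ in range(epochs):
        model.train()
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()
            optimizer.step()
        # Fold the current weights into the running average once per epoch
        # (a common SWA schedule; the schedule in the paper may differ).
        with torch.no_grad():
            for p_avg, p in zip(swa_model.parameters(), model.parameters()):
                p_avg.mul_(n_averaged).add_(p).div_(n_averaged + 1)
        n_averaged += 1
    return swa_model


def build_pfge_style_ensemble(model, loader, optimizer, loss_fn,
                              num_members=3, epochs_per_round=5):
    """Run several successive SWA rounds on one trajectory; keep one averaged model per round."""
    return [run_swa_round(model, loader, optimizer, loss_fn, epochs_per_round)
            for _ in range(num_members)]


@torch.no_grad()
def ensemble_predict(members, x):
    """Average softmax outputs of the stored SWA models at test time."""
    for m in members:
        m.eval()
    probs = torch.stack([torch.softmax(m(x), dim=-1) for m in members])
    return probs.mean(dim=0)
```
Because only a handful of averaged models (num_members) are kept rather than every snapshot along the trajectory, test-time memory stays close to that of a single model, which is the parsimony the abstract refers to.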
Related papers
- Fast Ensembling with Diffusion Schrödinger Bridge [17.334437293164566]
The Deep Ensemble (DE) approach is a straightforward technique used to enhance the performance of deep neural networks by training them from different initial points so that they converge towards various local optima.
We propose a novel approach called Diffusion Bridge Network (DBN) to address the computational cost of such ensembles at inference time.
By substituting the heavy ensembles with this lightweight neural network DBN, we achieve inference with reduced computational cost while maintaining accuracy and uncertainty scores on benchmark datasets such as CIFAR-10, CIFAR-100, and TinyImageNet.
arXiv Detail & Related papers (2024-04-24T11:35:02Z) - Ensemble Quadratic Assignment Network for Graph Matching [52.20001802006391]
Graph matching is a commonly used technique in computer vision and pattern recognition.
Recent data-driven approaches have improved the graph matching accuracy remarkably.
We propose a graph neural network (GNN) based approach to combine the advantages of data-driven and traditional methods.
arXiv Detail & Related papers (2024-03-11T06:34:05Z) - Heterogenous Memory Augmented Neural Networks [84.29338268789684]
We introduce a novel heterogeneous memory augmentation approach for neural networks.
By introducing learnable memory tokens with an attention mechanism, we can effectively boost performance without huge computational overhead.
We demonstrate our approach on various image and graph-based tasks under both in-distribution (ID) and out-of-distribution (OOD) conditions.
arXiv Detail & Related papers (2023-10-17T01:05:28Z) - Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z) - A Comprehensive Study on Large-Scale Graph Training: Benchmarking and
Rethinking [124.21408098724551]
Large-scale graph training is a notoriously challenging problem for graph neural networks (GNNs).
We present a new ensembling training scheme, named EnGCN, to address the existing issues.
Our proposed method has achieved new state-of-the-art (SOTA) performance on large-scale datasets.
arXiv Detail & Related papers (2022-10-14T03:43:05Z) - MS-RNN: A Flexible Multi-Scale Framework for Spatiotemporal Predictive
Learning [7.311071760653835]
We propose a general framework named Multi-Scale RNN (MS-RNN) to boost recent RNN models for predictive learning.
We verify the MS-RNN framework by thorough theoretical analyses and exhaustive experiments.
Results show that RNN models incorporating our framework achieve much lower memory cost yet better performance than before.
arXiv Detail & Related papers (2022-06-07T04:57:58Z) - Rank-R FNN: A Tensor-Based Learning Model for High-Order Data
Classification [69.26747803963907]
Rank-R Feedforward Neural Network (FNN) is a tensor-based nonlinear learning model that imposes Canonical/Polyadic decomposition on its parameters.
It handles inputs as multilinear arrays, bypassing the need for vectorization, and can thus fully exploit the structural information along every data dimension.
We establish the universal approximation and learnability properties of Rank-R FNN, and we validate its performance on real-world hyperspectral datasets.
arXiv Detail & Related papers (2021-04-11T16:37:32Z) - Collegial Ensembles [11.64359837358763]
We show that collegial ensembles can be efficiently implemented in practical architectures using group convolutions and block diagonal layers.
We also show how our framework can be used to analytically derive optimal group convolution modules without having to train a single model.
arXiv Detail & Related papers (2020-06-13T16:40:26Z) - Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)