Towards Meta-Pruning via Optimal Transport
- URL: http://arxiv.org/abs/2402.07839v2
- Date: Tue, 13 Feb 2024 13:19:54 GMT
- Title: Towards Meta-Pruning via Optimal Transport
- Authors: Alexander Theus, Olin Geimer, Friedrich Wicke, Thomas Hofmann, Sotiris Anagnostidis, Sidak Pal Singh
- Abstract summary: This paper introduces a novel approach named Intra-Fusion, challenging the prevailing pruning paradigm.
We leverage the concepts of model fusion and Optimal Transport to arrive at a more effective sparse model representation.
We benchmark our results for various networks on commonly used datasets such as CIFAR-10, CIFAR-100, and ImageNet.
- Score: 64.6060250923073
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Structural pruning of neural networks conventionally relies on identifying
and discarding less important neurons, a practice often resulting in
significant accuracy loss that necessitates subsequent fine-tuning efforts.
This paper introduces a novel approach named Intra-Fusion, challenging this
prevailing pruning paradigm. Unlike existing methods that focus on designing
meaningful neuron importance metrics, Intra-Fusion redefines the overlying
pruning procedure. By utilizing the concepts of model fusion and Optimal
Transport, we leverage an agnostically given importance metric to arrive at a
more effective sparse model representation. Notably, our approach achieves
substantial accuracy recovery without the need for resource-intensive
fine-tuning, making it an efficient and promising tool for neural network
compression.
Additionally, we explore how fusion can be added to the pruning process to
significantly decrease the training time while maintaining competitive
performance. We benchmark our results for various networks on commonly used
datasets such as CIFAR-10, CIFAR-100, and ImageNet. More broadly, we hope that
the proposed Intra-Fusion approach invigorates exploration into a fresh
alternative to the predominant compression approaches. Our code is available
here: https://github.com/alexandertheus/Intra-Fusion.
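To make the idea concrete, here is a minimal sketch of OT-based intra-fusion for a single linear layer, using the POT library. The function name, the choice of marginals, and the ground cost are illustrative assumptions, not the paper's exact formulation: instead of discarding low-importance neurons, all neurons are transported onto the surviving ones and merged by barycentric projection.

```python
# Minimal sketch of OT-based intra-fusion for one linear layer.
# Assumptions (not the paper's exact recipe): importance scores are
# positive, kept neurons receive uniform target mass, and the ground
# cost is the squared Euclidean distance between weight vectors.
import numpy as np
import ot  # Python Optimal Transport (POT)

def intra_fuse(W: np.ndarray, importance: np.ndarray, keep: int):
    """Fuse all rows (neurons) of W into `keep` surviving neurons via an
    optimal-transport plan, rather than discarding the pruned ones."""
    keep_idx = np.argsort(importance)[-keep:]      # surviving neurons
    a = importance / importance.sum()              # source mass: all neurons
    b = np.full(keep, 1.0 / keep)                  # target mass: kept neurons
    M = ot.dist(W, W[keep_idx])                    # pairwise squared distances
    T = ot.emd(a, b, M)                            # exact OT plan, shape (n, keep)
    # Barycentric projection: each kept neuron absorbs weighted mass from
    # every original neuron instead of simply replacing them.
    W_fused = (T / T.sum(axis=0, keepdims=True)).T @ W
    return W_fused, keep_idx
```

Because the fused weights aggregate information from the pruned neurons as well, the sparse model retains more of the original function, which is what allows accuracy to recover without fine-tuning.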
Related papers
- Affine-based Deformable Attention and Selective Fusion for Semi-dense Matching [30.272791354494373]
We introduce affine-based local attention to model cross-view deformations.
We also present selective fusion to merge local and global messages from cross attention.
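The abstract does not spell out the fusion rule; one common realization of such selective fusion is a learned sigmoid gate that picks, per feature, between the local and the global message. The module below is a hedged sketch under that assumption, not the paper's actual design.

```python
import torch
import torch.nn as nn

class SelectiveFusion(nn.Module):
    """Hypothetical gated merge of local- and global-attention messages."""
    def __init__(self, dim: int):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, local_msg, global_msg):
        g = self.gate(torch.cat([local_msg, global_msg], dim=-1))
        return g * local_msg + (1 - g) * global_msg  # per-feature selection
```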
arXiv Detail & Related papers (2024-05-22T17:57:37Z)
- T-Fusion Net: A Novel Deep Neural Network Augmented with Multiple Localizations based Spatial Attention Mechanisms for Covid-19 Detection [0.7614628596146599]
The present work proposes a new deep neural network (called T-Fusion Net) that is augmented with multiple localization-based spatial attention mechanisms.
A homogeneous ensemble of this network is further used to enhance image classification accuracy.
The proposed T-Fusion Net and the homogeneous ensemble model exhibit better performance than other state-of-the-art methods.
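As a rough illustration of the two ingredients named above, the sketch below pairs a simple convolutional spatial-attention gate with softmax-averaged ensembling; both are generic stand-ins, since the paper's exact module layout is not given here.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Hypothetical spatial-attention gate: a single-channel mask over H x W."""
    def __init__(self, channels: int, kernel_size: int = 7):
        super().__init__()
        self.mask = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size, padding=kernel_size // 2),
            nn.Sigmoid(),
        )

    def forward(self, x):          # x: (B, C, H, W)
        return x * self.mask(x)    # re-weight each spatial location

def ensemble_predict(models, x):
    """Homogeneous ensemble: average the softmax outputs of identical nets."""
    probs = [m(x).softmax(dim=-1) for m in models]
    return torch.stack(probs).mean(dim=0)
```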
arXiv Detail & Related papers (2023-07-31T18:18:01Z)
- Deep Multi-Threshold Spiking-UNet for Image Processing [51.88730892920031]
This paper introduces the novel concept of Spiking-UNet for image processing, which combines the power of Spiking Neural Networks (SNNs) with the U-Net architecture.
To achieve an efficient Spiking-UNet, we face two primary challenges: ensuring high-fidelity information propagation through the network via spikes and formulating an effective training strategy.
Experimental results show that, on image segmentation and denoising, our Spiking-UNet achieves comparable performance to its non-spiking counterpart.
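One way to read "multi-threshold" is that each neuron emits a graded spike equal to the highest threshold its membrane potential crosses, preserving more amplitude information per timestep than a binary spike. The following is a sketch under that assumption, with illustrative threshold values:

```python
import torch

def multi_threshold_spike(v: torch.Tensor, thresholds=(0.5, 1.0, 2.0)):
    """Hypothetical multi-threshold firing rule: emit a graded spike equal
    to the highest threshold crossed by the membrane potential v."""
    spikes = torch.zeros_like(v)
    for th in sorted(thresholds):          # ascending, so the last hit wins
        spikes = torch.where(v >= th, torch.full_like(v, th), spikes)
    return spikes, v - spikes              # soft reset: subtract emitted spike
```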
arXiv Detail & Related papers (2023-07-20T16:00:19Z)
- End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes [52.818579746354665]
This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures.
We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
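A hedged sketch of what such a learned acquisition function might look like: a transformer encodes the observed (x, y) history together with candidate points and scores each candidate; the next query would be the argmax, with the scorer trained by RL as the abstract describes. All architecture details here are assumptions.

```python
import torch
import torch.nn as nn

class AcquisitionTransformer(nn.Module):
    """Hypothetical learned acquisition function over a BO history."""
    def __init__(self, dim_x: int, d_model: int = 64):
        super().__init__()
        self.embed_obs = nn.Linear(dim_x + 1, d_model)   # (x, y) pairs
        self.embed_cand = nn.Linear(dim_x, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.score = nn.Linear(d_model, 1)

    def forward(self, xs, ys, candidates):
        # xs: (B, N, dim_x), ys: (B, N, 1), candidates: (B, M, dim_x)
        obs = self.embed_obs(torch.cat([xs, ys], dim=-1))
        tokens = torch.cat([obs, self.embed_cand(candidates)], dim=1)
        h = self.encoder(tokens)
        return self.score(h[:, obs.shape[1]:]).squeeze(-1)  # (B, M) scores
```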
arXiv Detail & Related papers (2023-05-25T10:58:46Z)
- Non-Gradient Manifold Neural Network [79.44066256794187]
Deep neural networks (DNNs) generally take thousands of iterations to optimize via gradient descent.
We propose a novel manifold neural network based on non-gradient optimization.
arXiv Detail & Related papers (2021-06-15T06:39:13Z)
- MSE-Optimal Neural Network Initialization via Layer Fusion [68.72356718879428]
Deep neural networks achieve state-of-the-art performance for a range of classification and inference tasks.
The use of gradient descent combined with nonconvexity renders learning susceptible to the choice of initialization.
We propose fusing neighboring layers of deeper networks that are initialized with random variables.
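The core operation, fusing two neighboring affine layers into one, is simply composition of linear maps when no nonlinearity separates them. A minimal sketch (the paper's actual construction around nonlinearities is not reproduced here):

```python
import numpy as np

def fuse_linear_layers(W1, b1, W2, b2):
    """Collapse y = W2 (W1 x + b1) + b2 into a single affine layer,
    e.g. to initialize a shallower network from a deeper random one."""
    W = W2 @ W1            # composed weight: (out2, in1)
    b = W2 @ b1 + b2       # composed bias
    return W, b
```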
arXiv Detail & Related papers (2020-01-28T18:25:15Z)
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
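A minimal sketch of the layer-wise alignment step, again with the POT library and illustrative uniform marginals: the neurons of one model are soft-matched to the other's via an OT plan over their weight vectors, then averaged. The cost and marginal choices are assumptions.

```python
import numpy as np
import ot  # Python Optimal Transport (POT)

def ot_fuse_layer(WA: np.ndarray, WB: np.ndarray):
    """Align the rows (neurons) of WB to those of WA via an optimal-transport
    plan, then average the aligned weights -- one layer of one-shot fusion."""
    n = WA.shape[0]
    a = b = np.full(n, 1.0 / n)              # uniform neuron marginals
    T = ot.emd(a, b, ot.dist(WB, WA))        # plan moving B-neurons onto A-neurons
    WB_aligned = n * T.T @ WB                # barycentric map into A's ordering
    return 0.5 * (WA + WB_aligned)
```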
arXiv Detail & Related papers (2019-10-12T22:07:15Z)