Win the Lottery Ticket via Fourier Analysis: Frequencies Guided Network Pruning
- URL: http://arxiv.org/abs/2201.12712v1
- Date: Sun, 30 Jan 2022 03:42:36 GMT
- Title: Win the Lottery Ticket via Fourier Analysis: Frequencies Guided Network Pruning
- Authors: Yuzhang Shang, Bin Duan, Ziliang Zong, Liqiang Nie, Yan Yan
- Abstract summary: Optimal network pruning is a non-trivial task that is mathematically an NP-hard problem.
In this paper, we investigate the Magnitude-Based Pruning (MBP) scheme and analyze it from a novel perspective.
We also propose a novel two-stage pruning approach, where one stage obtains the topological structure of the pruned network and the other retrains the pruned network to recover its capacity.
- Score: 50.232218214751455
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: With the recent remarkable success of deep learning, efficient network
compression algorithms are urgently needed to unlock the potential
computational power of edge devices such as smartphones or tablets. However,
optimal network pruning is a non-trivial task that is mathematically NP-hard.
Previous work likens training a pruned network to buying a lottery ticket. In
this paper, we investigate the Magnitude-Based Pruning (MBP) scheme and
analyze it from a novel perspective, applying Fourier analysis to the deep
learning model to guide model design. Besides explaining the generalization
ability of MBP via the Fourier transform, we also propose a novel two-stage
pruning approach: one stage obtains the topological structure of the pruned
network, and the other retrains the pruned network to recover its capacity
using knowledge distillation that proceeds from lower to higher frequencies
in the frequency domain. Extensive experiments on CIFAR-10 and CIFAR-100
demonstrate the superiority of our Fourier-analysis-based MBP over
traditional MBP algorithms.
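The page carries no code, so the following is a minimal PyTorch sketch of the two-stage recipe the abstract describes. The sparsity level, the growing low-pass cutoff, and the KL-based distillation loss are illustrative assumptions, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def magnitude_prune(model, sparsity=0.9):
    """Stage 1: global Magnitude-Based Pruning (MBP).

    Keeps the (1 - sparsity) fraction of weights with the largest
    absolute values, zeroes the rest, and returns binary masks that
    fix the pruned network's topology."""
    weights = torch.cat([p.detach().abs().flatten()
                         for p in model.parameters() if p.dim() > 1])
    threshold = torch.quantile(weights, sparsity)
    masks = {}
    for name, p in model.named_parameters():
        if p.dim() > 1:
            masks[name] = (p.detach().abs() > threshold).float()
            p.data.mul_(masks[name])
    return masks

def freq_band_kd_loss(student_logits, teacher_logits, step, total_steps):
    """Stage 2 (sketch): distillation scheduled from low to high
    frequencies. The teacher signal is low-pass filtered via rfft and
    the cutoff grows over retraining -- an assumed reading of
    "distillation from lower to higher frequencies", not necessarily
    the authors' exact loss."""
    spec = torch.fft.rfft(teacher_logits.detach(), dim=-1)
    cutoff = max(1, int(spec.shape[-1] * (step + 1) / total_steps))
    spec[..., cutoff:] = 0                       # keep only low frequencies
    soft = torch.fft.irfft(spec, n=teacher_logits.shape[-1], dim=-1)
    return F.kl_div(F.log_softmax(student_logits, dim=-1),
                    F.softmax(soft, dim=-1), reduction="batchmean")
```

During stage-two retraining, the stage-one masks would be re-applied (e.g. `p.data.mul_(masks[name])`) after every optimizer step so the pruned topology stays fixed.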
Related papers
- Efficient Training of Deep Neural Operator Networks via Randomized Sampling [0.0]
Deep operator network (DeepNet) has demonstrated success in the real-time prediction of complex dynamics across various scientific and engineering applications.
We introduce a random sampling technique to be adopted in the training of DeepONet, aimed at improving the generalization ability of the model while significantly reducing computational time (a minimal sketch follows this entry).
Our results indicate that incorporating randomization in the trunk network inputs during training enhances the efficiency and robustness of DeepONet, offering a promising avenue for improving the framework's performance in modeling complex physical systems.
arXiv Detail & Related papers (2024-09-20T07:18:31Z)
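As a rough illustration of the randomized-sampling idea above, here is a toy DeepONet training step in PyTorch that evaluates the loss on a random subset of trunk (query) points each iteration; the network sizes and the keep fraction are assumptions, not values from the paper.

```python
import torch
import torch.nn as nn

class TinyDeepONet(nn.Module):
    """Minimal DeepONet: the branch net encodes sampled input functions,
    the trunk net encodes query coordinates, and the output is their
    inner product."""
    def __init__(self, n_sensors=100, width=64):
        super().__init__()
        self.branch = nn.Sequential(nn.Linear(n_sensors, width), nn.Tanh(),
                                    nn.Linear(width, width))
        self.trunk = nn.Sequential(nn.Linear(1, width), nn.Tanh(),
                                   nn.Linear(width, width))

    def forward(self, u, y):
        # u: (batch, n_sensors) input-function samples
        # y: (n_points, 1) query coordinates
        return self.branch(u) @ self.trunk(y).T   # (batch, n_points)

def train_step(model, opt, u, y_full, s_full, keep_frac=0.2):
    """One step with randomized trunk sampling: the loss is evaluated
    on a random subset of the query grid instead of all points."""
    n = y_full.shape[0]
    idx = torch.randperm(n)[:max(1, int(keep_frac * n))]
    loss = ((model(u, y_full[idx]) - s_full[:, idx]) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```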
- Robust Fourier Neural Networks [1.0589208420411014]
We show that introducing a simple diagonal layer after the Fourier embedding layer makes the network more robust to measurement noise.
Under certain conditions, our proposed approach can also learn functions that are noisy mixtures of nonlinear functions of Fourier features.
arXiv Detail & Related papers (2024-09-03T16:56:41Z)
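The summary above is concrete enough to sketch: a fixed random Fourier embedding followed by a learnable diagonal scaling layer. The frequency scale sigma and the layer widths are illustrative guesses; the paper's exact architecture may differ.

```python
import torch
import torch.nn as nn

class RobustFourierNet(nn.Module):
    """Fourier-feature network with a learnable diagonal layer placed
    right after the embedding, per the summary above."""
    def __init__(self, in_dim=1, n_features=64, width=128, sigma=10.0):
        super().__init__()
        # Fixed random Fourier embedding: x -> [cos(Bx), sin(Bx)]
        self.register_buffer("B", sigma * torch.randn(n_features, in_dim))
        # Diagonal layer: one learnable scale per Fourier feature,
        # which can damp frequencies dominated by measurement noise
        self.diag = nn.Parameter(torch.ones(2 * n_features))
        self.head = nn.Sequential(nn.Linear(2 * n_features, width),
                                  nn.ReLU(), nn.Linear(width, 1))

    def forward(self, x):                        # x: (batch, in_dim)
        z = x @ self.B.T
        feats = torch.cat([torch.cos(z), torch.sin(z)], dim=-1)
        return self.head(self.diag * feats)
```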
- Finding Lottery Tickets in Vision Models via Data-driven Spectral Foresight Pruning [14.792099973449794]
We propose an algorithm to align the training dynamics of the sparse network with those of the dense one.
We show how the usually neglected data-dependent component in the NTK's spectrum can be taken into account.
Path eXclusion (PX) is able to find lottery tickets even at high sparsity levels.
arXiv Detail & Related papers (2024-06-03T22:19:42Z)
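PX's actual criterion works on the NTK's spectrum; the sketch below conveys only the general flavor of data-driven foresight pruning, scoring weights by |w * dL/dw| on a batch of real data. It is a simplified stand-in, not the paper's algorithm.

```python
import torch

def data_driven_prune(model, loss_fn, x, y, sparsity=0.95):
    """One-shot data-dependent pruning at initialization (sketch).
    Scores each weight by |w * dL/dw| on a real data batch, so the
    data-dependent part of the training dynamics shapes the mask --
    a simplified stand-in for PX's NTK-spectrum criterion."""
    params = [p for p in model.parameters() if p.dim() > 1]
    loss = loss_fn(model(x), y)
    grads = torch.autograd.grad(loss, params)
    scores = torch.cat([(p * g).abs().flatten()
                        for p, g in zip(params, grads)])
    threshold = torch.quantile(scores, sparsity)
    with torch.no_grad():
        for p, g in zip(params, grads):
            p.mul_(((p * g).abs() > threshold).float())
```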
- Properties and Potential Applications of Random Functional-Linked Types of Neural Networks [81.56822938033119]
Random functional-linked neural networks (RFLNNs) offer an alternative way of learning in deep structures.
This paper gives some insights into the properties of RFLNNs from the viewpoint of the frequency domain.
We propose a method to generate a BLS network with better performance, and design an efficient algorithm for solving Poisson's equation.
arXiv Detail & Related papers (2023-04-03T13:25:22Z)
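RFLNNs such as the broad learning system (BLS) keep their hidden weights random and fixed and solve only the linear readout. Below is a minimal NumPy sketch of that family, with assumed widths and ridge regularizer; it is generic RVFL-style code, not the paper's improved BLS variant.

```python
import numpy as np

def train_rflnn(X, Y, n_hidden=512, reg=1e-3, seed=0):
    """Random functional-link network (RVFL-style sketch): hidden
    weights are random and fixed; only the linear readout is solved,
    in closed form, via ridge regression."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)            # random "enhancement" features
    A = np.hstack([X, H])             # direct links + random features
    beta = np.linalg.solve(A.T @ A + reg * np.eye(A.shape[1]), A.T @ Y)
    return W, b, beta

def predict_rflnn(X, W, b, beta):
    return np.hstack([X, np.tanh(X @ W + b)]) @ beta
```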
- Transform Once: Efficient Operator Learning in Frequency Domain [69.74509540521397]
We study deep neural networks designed to harness structure in the frequency domain for efficient learning of long-range correlations in space or time.
This work introduces a blueprint for frequency-domain learning through a single transform: transform once (T1).
arXiv Detail & Related papers (2022-11-26T01:56:05Z)
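In the spirit of T1's single-transform blueprint, the sketch below applies one forward FFT, learns complex weights on a truncated set of modes, and applies one inverse FFT at the output. The mode count and initialization are assumptions, not the paper's settings.

```python
import torch
import torch.nn as nn

class TransformOnce(nn.Module):
    """Single-transform frequency-domain layer (sketch in the spirit
    of T1): one forward FFT, learnable complex weights on a truncated
    set of modes, one inverse FFT."""
    def __init__(self, n_modes=32):
        super().__init__()
        self.n_modes = n_modes
        # One learnable complex weight per retained frequency mode
        self.w = nn.Parameter(0.1 * torch.randn(n_modes, dtype=torch.cfloat))

    def forward(self, x):                        # x: (batch, n_points)
        spec = torch.fft.rfft(x, dim=-1)         # transform once, forward
        out = torch.zeros_like(spec)             # high modes stay zero
        out[..., :self.n_modes] = spec[..., :self.n_modes] * self.w
        return torch.fft.irfft(out, n=x.shape[-1], dim=-1)  # once, inverse
```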
- Transformers Can Do Bayesian Inference [56.99390658880008]
We present Prior-Data Fitted Networks (PFNs).
PFNs leverage in-context learning in large-scale machine learning models to approximate a large set of posteriors.
We demonstrate that PFNs can near-perfectly mimic Gaussian processes and also enable efficient Bayesian inference for intractable problems.
arXiv Detail & Related papers (2021-12-20T13:07:39Z)
- Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Pruned Neural Networks [79.74580058178594]
We analyze the performance of training a pruned neural network by studying the geometric structure of the objective function.
We show that the convex region near a desirable model with guaranteed generalization enlarges as the neural network model is pruned.
arXiv Detail & Related papers (2021-10-12T01:11:07Z)
- A quantum algorithm for training wide and deep classical neural networks [72.2614468437919]
We show that conditions amenable to classical trainability via gradient descent coincide with those necessary for efficiently solving quantum linear systems.
We numerically demonstrate that the MNIST image dataset satisfies such conditions.
We provide empirical evidence for $O(\log n)$ training of a convolutional neural network with pooling.
arXiv Detail & Related papers (2021-07-19T23:41:03Z)
- Attentive Gaussian processes for probabilistic time-series generation [4.94950858749529]
We propose a computationally efficient attention-based network combined with Gaussian process regression to generate real-valued sequences.
We develop a block-wise training algorithm to allow mini-batch training of the network while the GP is trained using full-batch.
The algorithm is proven to converge and shows comparable, if not better, quality in the solutions found.
arXiv Detail & Related papers (2021-02-10T01:19:15Z)
- DeepPhaseCut: Deep Relaxation in Phase for Unsupervised Fourier Phase Retrieval [31.380061715549584]
We propose a novel, unsupervised, feed-forward neural network for Fourier phase retrieval.
Unlike existing deep learning approaches that use a neural network as a regularization term or as an end-to-end black-box model for supervised training, our algorithm is a feed-forward neural network implementation of the PhaseCut algorithm in an unsupervised learning framework.
Our network is composed of two generators: one for phase estimation using a PhaseCut loss, followed by another for image reconstruction, both of which are trained simultaneously in a cycleGAN framework without matched data.
arXiv Detail & Related papers (2020-11-20T16:10:08Z)