Win the Lottery Ticket via Fourier Analysis: Frequencies Guided Network Pruning
- URL: http://arxiv.org/abs/2201.12712v1
- Date: Sun, 30 Jan 2022 03:42:36 GMT
- Title: Win the Lottery Ticket via Fourier Analysis: Frequencies Guided Network Pruning
- Authors: Yuzhang Shang, Bin Duan, Ziliang Zong, Liqiang Nie, Yan Yan
- Abstract summary: Optimal network pruning is a non-trivial task that is mathematically an NP-hard problem.
In this paper, we investigate the Magnitude-Based Pruning (MBP) scheme and analyze it from a novel perspective.
We also propose a novel two-stage pruning approach, where one stage obtains the topological structure of the pruned network and the other retrains the pruned network to recover its capacity.
- Score: 50.232218214751455
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: With the recent remarkable success of deep learning, efficient network
compression algorithms are urgently needed to unlock the potential
computational power of edge devices such as smartphones or tablets. However,
optimal network pruning is a non-trivial task that is mathematically NP-hard.
Previous work likens training a pruned network to buying a lottery ticket. In
this paper, we investigate the Magnitude-Based Pruning (MBP) scheme and
analyze it from a novel perspective, applying Fourier analysis to the deep
learning model to guide model design. Besides explaining the generalization
ability of MBP via the Fourier transform, we also propose a novel two-stage
pruning approach: one stage obtains the topological structure of the pruned
network, and the other retrains the pruned network to recover its capacity
using knowledge distillation that proceeds from lower to higher frequencies
in the frequency domain. Extensive experiments on CIFAR-10 and CIFAR-100
demonstrate the superiority of our Fourier-analysis-based MBP over
traditional MBP algorithms.
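The page carries no code, so the following is a minimal PyTorch sketch of the two-stage recipe the abstract describes. The sparsity level, the growing low-pass cutoff, and the KL-based distillation loss are illustrative assumptions, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def magnitude_prune(model, sparsity=0.9):
    """Stage 1: global Magnitude-Based Pruning (MBP).

    Keeps the (1 - sparsity) fraction of weights with the largest
    absolute values, zeroes the rest, and returns binary masks that
    fix the pruned network's topology."""
    weights = torch.cat([p.detach().abs().flatten()
                         for p in model.parameters() if p.dim() > 1])
    threshold = torch.quantile(weights, sparsity)
    masks = {}
    for name, p in model.named_parameters():
        if p.dim() > 1:
            masks[name] = (p.detach().abs() > threshold).float()
            p.data.mul_(masks[name])
    return masks

def freq_band_kd_loss(student_logits, teacher_logits, step, total_steps):
    """Stage 2 (sketch): distillation scheduled from low to high
    frequencies. The teacher signal is low-pass filtered via rfft and
    the cutoff grows over retraining -- an assumed reading of
    "distillation from lower to higher frequencies", not necessarily
    the authors' exact loss."""
    spec = torch.fft.rfft(teacher_logits.detach(), dim=-1)
    cutoff = max(1, int(spec.shape[-1] * (step + 1) / total_steps))
    spec[..., cutoff:] = 0                       # keep only low frequencies
    soft = torch.fft.irfft(spec, n=teacher_logits.shape[-1], dim=-1)
    return F.kl_div(F.log_softmax(student_logits, dim=-1),
                    F.softmax(soft, dim=-1), reduction="batchmean")
```

During stage-two retraining, the stage-one masks would be re-applied (e.g. `p.data.mul_(masks[name])`) after every optimizer step so the pruned topology stays fixed.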
Related papers
- Efficient Training of Deep Neural Operator Networks via Randomized Sampling [0.0]
Deep operator network (DeepNet) has demonstrated success in the real-time prediction of complex dynamics across various scientific and engineering applications.
We introduce a random sampling technique to be adopted in the training of DeepONet, aimed at improving the generalization ability of the model while significantly reducing computational time (a minimal sketch follows this entry).
Our results indicate that incorporating randomization in the trunk network inputs during training enhances the efficiency and robustness of DeepONet, offering a promising avenue for improving the framework's performance in modeling complex physical systems.
arXiv Detail & Related papers (2024-09-20T07:18:31Z)
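As a rough illustration of the randomized-sampling idea above, here is a toy DeepONet training step in PyTorch that evaluates the loss on a random subset of trunk (query) points each iteration; the network sizes and the keep fraction are assumptions, not values from the paper.

```python
import torch
import torch.nn as nn

class TinyDeepONet(nn.Module):
    """Minimal DeepONet: the branch net encodes sampled input functions,
    the trunk net encodes query coordinates, and the output is their
    inner product."""
    def __init__(self, n_sensors=100, width=64):
        super().__init__()
        self.branch = nn.Sequential(nn.Linear(n_sensors, width), nn.Tanh(),
                                    nn.Linear(width, width))
        self.trunk = nn.Sequential(nn.Linear(1, width), nn.Tanh(),
                                   nn.Linear(width, width))

    def forward(self, u, y):
        # u: (batch, n_sensors) input-function samples
        # y: (n_points, 1) query coordinates
        return self.branch(u) @ self.trunk(y).T   # (batch, n_points)

def train_step(model, opt, u, y_full, s_full, keep_frac=0.2):
    """One step with randomized trunk sampling: the loss is evaluated
    on a random subset of the query grid instead of all points."""
    n = y_full.shape[0]
    idx = torch.randperm(n)[:max(1, int(keep_frac * n))]
    loss = ((model(u, y_full[idx]) - s_full[:, idx]) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```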
- Robust Fourier Neural Networks [1.0589208420411014]
We show that introducing a simple diagonal layer after the Fourier embedding layer makes the network more robust to measurement noise.
Under certain conditions, our proposed approach can also learn functions that are noisy mixtures of nonlinear functions of Fourier features.
arXiv Detail & Related papers (2024-09-03T16:56:41Z)
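The summary above is concrete enough to sketch: a fixed random Fourier embedding followed by a learnable diagonal scaling layer. The frequency scale sigma and the layer widths are illustrative guesses; the paper's exact architecture may differ.

```python
import torch
import torch.nn as nn

class RobustFourierNet(nn.Module):
    """Fourier-feature network with a learnable diagonal layer placed
    right after the embedding, per the summary above."""
    def __init__(self, in_dim=1, n_features=64, width=128, sigma=10.0):
        super().__init__()
        # Fixed random Fourier embedding: x -> [cos(Bx), sin(Bx)]
        self.register_buffer("B", sigma * torch.randn(n_features, in_dim))
        # Diagonal layer: one learnable scale per Fourier feature,
        # which can damp frequencies dominated by measurement noise
        self.diag = nn.Parameter(torch.ones(2 * n_features))
        self.head = nn.Sequential(nn.Linear(2 * n_features, width),
                                  nn.ReLU(), nn.Linear(width, 1))

    def forward(self, x):                        # x: (batch, in_dim)
        z = x @ self.B.T
        feats = torch.cat([torch.cos(z), torch.sin(z)], dim=-1)
        return self.head(self.diag * feats)
```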
- Finding Lottery Tickets in Vision Models via Data-driven Spectral Foresight Pruning [14.792099973449794]
We propose an algorithm to align the training dynamics of the sparse network with those of the dense one.
We show how the usually neglected data-dependent component in the NTK's spectrum can be taken into account.
Path eXclusion (PX) is able to find lottery tickets even at high sparsity levels.
arXiv Detail & Related papers (2024-06-03T22:19:42Z)
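PX's actual criterion works on the NTK's spectrum; the sketch below conveys only the general flavor of data-driven foresight pruning, scoring weights by |w * dL/dw| on a batch of real data. It is a simplified stand-in, not the paper's algorithm.

```python
import torch

def data_driven_prune(model, loss_fn, x, y, sparsity=0.95):
    """One-shot data-dependent pruning at initialization (sketch).
    Scores each weight by |w * dL/dw| on a real data batch, so the
    data-dependent part of the training dynamics shapes the mask --
    a simplified stand-in for PX's NTK-spectrum criterion."""
    params = [p for p in model.parameters() if p.dim() > 1]
    loss = loss_fn(model(x), y)
    grads = torch.autograd.grad(loss, params)
    scores = torch.cat([(p * g).abs().flatten()
                        for p, g in zip(params, grads)])
    threshold = torch.quantile(scores, sparsity)
    with torch.no_grad():
        for p, g in zip(params, grads):
            p.mul_(((p * g).abs() > threshold).float())
```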
- Properties and Potential Applications of Random Functional-Linked Types of Neural Networks [81.56822938033119]
Random functional-linked neural networks (RFLNNs) offer an alternative way of learning in deep structures.
This paper gives some insights into the properties of RFLNNs from the viewpoint of the frequency domain.
We propose a method to generate a BLS network with better performance, and design an efficient algorithm for solving Poisson's equation.
arXiv Detail & Related papers (2023-04-03T13:25:22Z)
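RFLNNs such as the broad learning system (BLS) keep their hidden weights random and fixed and solve only the linear readout. Below is a minimal NumPy sketch of that family, with assumed widths and ridge regularizer; it is generic RVFL-style code, not the paper's improved BLS variant.

```python
import numpy as np

def train_rflnn(X, Y, n_hidden=512, reg=1e-3, seed=0):
    """Random functional-link network (RVFL-style sketch): hidden
    weights are random and fixed; only the linear readout is solved,
    in closed form, via ridge regression."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)            # random "enhancement" features
    A = np.hstack([X, H])             # direct links + random features
    beta = np.linalg.solve(A.T @ A + reg * np.eye(A.shape[1]), A.T @ Y)
    return W, b, beta

def predict_rflnn(X, W, b, beta):
    return np.hstack([X, np.tanh(X @ W + b)]) @ beta
```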
- Transform Once: Efficient Operator Learning in Frequency Domain [69.74509540521397]
We study deep neural networks designed to harness structure in the frequency domain for efficient learning of long-range correlations in space or time.
This work introduces a blueprint for frequency-domain learning through a single transform: transform once (T1).
arXiv Detail & Related papers (2022-11-26T01:56:05Z)
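In the spirit of T1's single-transform blueprint, the sketch below applies one forward FFT, learns complex weights on a truncated set of modes, and applies one inverse FFT at the output. The mode count and initialization are assumptions, not the paper's settings.

```python
import torch
import torch.nn as nn

class TransformOnce(nn.Module):
    """Single-transform frequency-domain layer (sketch in the spirit
    of T1): one forward FFT, learnable complex weights on a truncated
    set of modes, one inverse FFT."""
    def __init__(self, n_modes=32):
        super().__init__()
        self.n_modes = n_modes
        # One learnable complex weight per retained frequency mode
        self.w = nn.Parameter(0.1 * torch.randn(n_modes, dtype=torch.cfloat))

    def forward(self, x):                        # x: (batch, n_points)
        spec = torch.fft.rfft(x, dim=-1)         # transform once, forward
        out = torch.zeros_like(spec)             # high modes stay zero
        out[..., :self.n_modes] = spec[..., :self.n_modes] * self.w
        return torch.fft.irfft(out, n=x.shape[-1], dim=-1)  # once, inverse
```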
- Transformers Can Do Bayesian Inference [56.99390658880008]
We present Prior-Data Fitted Networks (PFNs).
PFNs leverage in-context learning in large-scale machine learning models to approximate a large set of posteriors.
We demonstrate that PFNs can near-perfectly mimic Gaussian processes and also enable efficient Bayesian inference for intractable problems.
arXiv Detail & Related papers (2021-12-20T13:07:39Z)
- Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Pruned Neural Networks [79.74580058178594]
We analyze the performance of training a pruned neural network by studying the geometric structure of the objective function.
We show that the convex region near a desirable model with guaranteed generalization enlarges as the neural network model is pruned.
arXiv Detail & Related papers (2021-10-12T01:11:07Z)
- A quantum algorithm for training wide and deep classical neural networks [72.2614468437919]
We show that conditions amenable to classical trainability via gradient descent coincide with those necessary for efficiently solving quantum linear systems.
We numerically demonstrate that the MNIST image dataset satisfies such conditions.
We provide empirical evidence for $O(\log n)$ training of a convolutional neural network with pooling.
arXiv Detail & Related papers (2021-07-19T23:41:03Z)
- Attentive Gaussian processes for probabilistic time-series generation [4.94950858749529]
We propose a computationally efficient attention-based network combined with Gaussian process regression to generate real-valued sequences.
We develop a block-wise training algorithm to allow mini-batch training of the network while the GP is trained using full-batch.
The algorithm is proven to converge and shows comparable, if not better, quality in the solutions found.
arXiv Detail & Related papers (2021-02-10T01:19:15Z)
- DeepPhaseCut: Deep Relaxation in Phase for Unsupervised Fourier Phase Retrieval [31.380061715549584]
We propose a novel, unsupervised, feed-forward neural network for Fourier phase retrieval.
Unlike existing deep learning approaches that use a neural network as a regularization term or as an end-to-end black-box model for supervised training, our algorithm is a feed-forward neural network implementation of the PhaseCut algorithm in an unsupervised learning framework.
Our network is composed of two generators: one for phase estimation using a PhaseCut loss, followed by another for image reconstruction, both of which are trained simultaneously in a cycleGAN framework without matched data.
arXiv Detail & Related papers (2020-11-20T16:10:08Z)