Visual Prompting Upgrades Neural Network Sparsification: A Data-Model
Perspective
- URL: http://arxiv.org/abs/2312.01397v2
- Date: Thu, 14 Dec 2023 06:52:11 GMT
- Title: Visual Prompting Upgrades Neural Network Sparsification: A Data-Model
Perspective
- Authors: Can Jin, Tianjin Huang, Yihua Zhang, Mykola Pechenizkiy, Sijia Liu,
Shiwei Liu, Tianlong Chen
- Abstract summary: We introduce a novel data-model co-design perspective: to promote superior weight sparsity.
Specifically, customized Visual Prompts are mounted to upgrade neural Network sparsification in our proposed VPNs framework.
- Score: 67.25782152459851
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The rapid development of large-scale deep learning models questions the
affordability of hardware platforms, which necessitates pruning to reduce
their computational and memory footprints. Sparse neural networks, as the
product, have demonstrated numerous favorable benefits like low complexity,
undamaged generalization, etc. Most of the prominent pruning strategies are
invented from a model-centric perspective, focusing on searching and preserving
crucial weights by analyzing network topologies. However, the role of data and
its interplay with model-centric pruning has remained relatively unexplored. In
this research, we introduce a novel data-model co-design perspective: to
promote superior weight sparsity by learning important model topology and
adequate input data in a synergetic manner. Specifically, customized Visual
Prompts are mounted to upgrade neural Network sparsification in our proposed
VPNs framework. As a pioneering effort, this paper conducts systematic
investigations about the impact of different visual prompts on model pruning
and suggests an effective joint optimization approach. Extensive experiments
with 3 network architectures and 8 datasets evidence the substantial
performance improvements from VPNs over existing state-of-the-art pruning
algorithms. Furthermore, we find that subnetworks discovered by VPNs from
pre-trained models enjoy better transferability across diverse downstream
scenarios. These insights shed light on new promising possibilities of
data-model co-designs for vision model sparsification.
Related papers
- Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning.
Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z)
- Efficient Online Processing with Deep Neural Networks [1.90365714903665]
This dissertation is dedicated to neural network efficiency. Specifically, a core contribution addresses the efficiency aspects during online inference.
These advances are attained through a bottom-up computational reorganization and judicious architectural modifications.
arXiv Detail & Related papers (2023-06-23T12:29:44Z)
- Exploiting Large Neuroimaging Datasets to Create Connectome-Constrained
Approaches for more Robust, Efficient, and Adaptable Artificial Intelligence [4.998666322418252]
We envision a pipeline to utilize large neuroimaging datasets, including maps of the brain.
We have developed a technique for discovery of repeated subcircuits, or motifs.
Third, the team analyzed circuitry for memory formation in the fruit fly connectome, enabling the design of a novel generative replay approach.
arXiv Detail & Related papers (2023-05-26T23:04:53Z)
- Deep networks for system identification: a Survey [56.34005280792013]
System identification learns mathematical descriptions of dynamic systems from input-output data.
The main aim of the identified model is to predict new data from previous observations.
We discuss architectures commonly adopted in the literature, like feedforward, convolutional, and recurrent networks.
arXiv Detail & Related papers (2023-01-30T12:38:31Z)
- Interpretability of an Interaction Network for identifying $H
\rightarrow b\bar{b}$ jets [4.553120911976256]
In recent times, AI models based on deep neural networks are becoming increasingly popular for many of these applications.
We explore interpretability of AI models by examining an Interaction Network (IN) model designed to identify boosted $H \rightarrow b\bar{b}$ jets.
We additionally illustrate the activity of hidden layers within the IN model as Neural Activation Pattern (NAP) diagrams.
arXiv Detail & Related papers (2022-11-23T08:38:52Z)
- Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
arXiv Detail & Related papers (2022-09-26T17:59:58Z)
- "Understanding Robustness Lottery": A Geometric Visual Comparative
Analysis of Neural Network Pruning Approaches [29.048660060344574]
This work aims to shed light on how different pruning methods alter the network's internal feature representation and the corresponding impact on model performance.
We introduce a visual geometric analysis of feature representations to compare and highlight the impact of pruning on model performance and feature representation.
The proposed tool provides an environment for in-depth comparison of pruning methods and a comprehensive understanding of how models respond to common data corruption.
arXiv Detail & Related papers (2022-06-16T04:44:13Z)
- Explainable Adversarial Attacks in Deep Neural Networks Using Activation
Profiles [69.9674326582747]
This paper presents a visual framework to investigate neural network models subjected to adversarial examples.
We show how observing these elements can quickly pinpoint exploited areas in a model.
arXiv Detail & Related papers (2021-03-18T13:04:21Z)
- The Self-Simplifying Machine: Exploiting the Structure of Piecewise
Linear Neural Networks to Create Interpretable Models [0.0]
We introduce novel methodology toward simplification and increased interpretability of Piecewise Linear Neural Networks for classification tasks.
Our methods include the use of a trained, deep network to produce a well-performing, single-hidden-layer network without further training.
On these methods, we conduct preliminary studies of model performance, as well as a case study on Wells Fargo's Home Lending dataset.
arXiv Detail & Related papers (2020-12-02T16:02:14Z)
- On Robustness and Transferability of Convolutional Neural Networks [147.71743081671508]
Modern deep convolutional networks (CNNs) are often criticized for not generalizing under distributional shifts.
We study the interplay between out-of-distribution and transfer performance of modern image classification CNNs for the first time.
We find that increasing both the training set and model sizes significantly improves the distributional shift robustness.
arXiv Detail & Related papers (2020-07-16T18:39:04Z)