Related papers: Mutual Information Preserving Neural Network Pruning

Related papers

Information Consistent Pruning: How to Efficiently Search for Sparse Networks? [5.524804393257921]
Iterative magnitude pruning methods (IMPs) are proven to be successful in reducing the number of insignificant nodes in deep neural networks (DNNs) Despite IMPs popularity in pruning networks, a fundamental limitation of existing IMP algorithms is the significant training time required for each pruning gradient. Our paper introduces a novel textitstopping criterion for IMPs that monitors information and flows between networks layers and minimizes the training time.
arXiv Detail & Related papers (2025-01-26T16:40:59Z)
Learning effective pruning at initialization from iterative pruning [15.842658282636876]
We present an end-to-end neural network-based PaI method to reduce training costs. Our approach outperforms existing methods in high-sparsity settings. As the first neural network-based PaI method, we conduct extensive experiments to validate the factors influencing this approach.
arXiv Detail & Related papers (2024-08-27T03:17:52Z)
Concurrent Training and Layer Pruning of Deep Neural Networks [0.0]
We propose an algorithm capable of identifying and eliminating irrelevant layers of a neural network during the early stages of training. We employ a structure using residual connections around nonlinear network sections that allow the flow of information through the network once a nonlinear section is pruned.
arXiv Detail & Related papers (2024-06-06T23:19:57Z)
Neural Maximum A Posteriori Estimation on Unpaired Data for Motion Deblurring [87.97330195531029]
We propose a Neural Maximum A Posteriori (NeurMAP) estimation framework for training neural networks to recover blind motion information and sharp content from unpaired data. The proposed NeurMAP is an approach to existing deblurring neural networks, and is the first framework that enables training image deblurring networks on unpaired datasets.
arXiv Detail & Related papers (2022-04-26T08:09:47Z)
Learning to Win Lottery Tickets in BERT Transfer via Task-agnostic Mask Training [55.43088293183165]
Recent studies show that pre-trained language models (PLMs) like BERT contain matchingworks that have similar transfer learning performance as the original PLM. In this paper, we find that the BERTworks have even more potential than these studies have shown. We train binary masks over model weights on the pre-training tasks, with the aim of preserving the universal transferability of the subnetwork.
arXiv Detail & Related papers (2022-04-24T08:42:47Z)
Prospect Pruning: Finding Trainable Weights at Initialization using Meta-Gradients [36.078414964088196]
Pruning neural networks at initialization would enable us to find sparse models that retain the accuracy of the original network. Current methods are insufficient to enable this optimization and lead to a large degradation in model performance. We propose Prospect Pruning (ProsPr), which uses meta-gradients through the first few steps of optimization to determine which weights to prune. Our method achieves state-of-the-art pruning performance on a variety of vision classification tasks, with less data and in a single shot compared to existing pruning-at-initialization methods.
arXiv Detail & Related papers (2022-02-16T15:18:55Z)
Edge-Enhanced Global Disentangled Graph Neural Network for Sequential Recommendation [44.15486708923762]
We propose an Edge-Enhanced Global Disentangled Graph Neural Network (EGD-GNN) model to capture the relation information between items. At the global level, we build a global-link graph over all sequences to model item relationships. At the local level, we apply a variational auto-encoder framework to learn user intention over the current sequence.
arXiv Detail & Related papers (2021-11-20T08:15:20Z)
SSSNET: Semi-Supervised Signed Network Clustering [4.895808607591299]
We introduce a novel probabilistic balanced normalized cut loss for training nodes in a GNN framework for semi-supervised signed network clustering, called SSSNET. The main novelty approach is a new take on the role of social balance theory for signed network embeddings.
arXiv Detail & Related papers (2021-10-13T10:36:37Z)
Sparse Training via Boosting Pruning Plasticity with Neuroregeneration [79.78184026678659]
We study the effect of pruning throughout training from the perspective of pruning plasticity. We design a novel gradual magnitude pruning (GMP) method, named gradual pruning with zero-cost neuroregeneration (GraNet) and its dynamic sparse training (DST) variant (GraNet-ST) Perhaps most impressively, the latter for the first time boosts the sparse-to-sparse training performance over various dense-to-sparse methods by a large margin with ResNet-50 on ImageNet.
arXiv Detail & Related papers (2021-06-19T02:09:25Z)
Learning Deep Interleaved Networks with Asymmetric Co-Attention for Image Restoration [65.11022516031463]
We present a deep interleaved network (DIN) that learns how information at different states should be combined for high-quality (HQ) images reconstruction. In this paper, we propose asymmetric co-attention (AsyCA) which is attached at each interleaved node to model the feature dependencies. Our presented DIN can be trained end-to-end and applied to various image restoration tasks.
arXiv Detail & Related papers (2020-10-29T15:32:00Z)
DINE: A Framework for Deep Incomplete Network Embedding [33.97952453310253]
We propose a Deep Incomplete Network Embedding method, namely DINE. We first complete the missing part including both nodes and edges in a partially observable network by using the expectation-maximization framework. We evaluate DINE over three networks on multi-label classification and link prediction tasks.
arXiv Detail & Related papers (2020-08-09T04:59:35Z)
GloDyNE: Global Topology Preserving Dynamic Network Embedding [31.269883917366478]
Dynamic Network Embedding (DNE) aims to update node embeddings while preserving network topology at each time step. We propose a novel strategy to diversely select the representative nodes over a network, which is coordinated with a new incremental learning paradigm. Experiments show GloDyNE, with a small fraction of nodes being selected, can already achieve the superior or comparable performance.
arXiv Detail & Related papers (2020-08-05T05:10:15Z)
Pre-Trained Models for Heterogeneous Information Networks [57.78194356302626]
We propose a self-supervised pre-training and fine-tuning framework, PF-HIN, to capture the features of a heterogeneous information network. PF-HIN consistently and significantly outperforms state-of-the-art alternatives on each of these tasks, on four datasets.
arXiv Detail & Related papers (2020-07-07T03:36:28Z)
Policy-GNN: Aggregation Optimization for Graph Neural Networks [60.50932472042379]
Graph neural networks (GNNs) aim to model the local graph structures and capture the hierarchical patterns by aggregating the information from neighbors. It is a challenging task to develop an effective aggregation strategy for each node, given complex graphs and sparse features. We propose Policy-GNN, a meta-policy framework that models the sampling procedure and message passing of GNNs into a combined learning process.
arXiv Detail & Related papers (2020-06-26T17:03:06Z)
Graph Prototypical Networks for Few-shot Learning on Attributed Networks [72.31180045017835]
We propose a graph meta-learning framework -- Graph Prototypical Networks (GPN) GPN is able to perform textitmeta-learning on an attributed network and derive a highly generalizable model for handling the target classification task.
arXiv Detail & Related papers (2020-06-23T04:13:23Z)
Pruning neural networks without any data by iteratively conserving synaptic flow [27.849332212178847]
Pruning the parameters of deep neural networks has generated intense interest due to potential savings in time, memory and energy. Recent works have identified, through an expensive sequence of training and pruning cycles, the existence of winning lottery tickets or sparse trainableworks. We provide an affirmative answer to this question through theory driven algorithm design.
arXiv Detail & Related papers (2020-06-09T19:21:57Z)
Unsupervised Differentiable Multi-aspect Network Embedding [52.981277420394846]
We propose a novel end-to-end framework for multi-aspect network embedding, called asp2vec. Our proposed framework can be readily extended to heterogeneous networks.
arXiv Detail & Related papers (2020-06-07T19:26:20Z)
Revisiting Initialization of Neural Networks [72.24615341588846]
We propose a rigorous estimation of the global curvature of weights across layers by approximating and controlling the norm of their Hessian matrix. Our experiments on Word2Vec and the MNIST/CIFAR image classification tasks confirm that tracking the Hessian norm is a useful diagnostic tool.
arXiv Detail & Related papers (2020-04-20T18:12:56Z)
Robust Pruning at Initialization [61.30574156442608]
A growing need for smaller, energy-efficient, neural networks to be able to use machine learning applications on devices with limited computational resources. For Deep NNs, such procedures remain unsatisfactory as the resulting pruned networks can be difficult to train and, for instance, they do not prevent one layer from being fully pruned.
arXiv Detail & Related papers (2020-02-19T17:09:50Z)
MSE-Optimal Neural Network Initialization via Layer Fusion [68.72356718879428]
Deep neural networks achieve state-of-the-art performance for a range of classification and inference tasks. The use of gradient combined nonvolutionity renders learning susceptible to novel problems. We propose fusing neighboring layers of deeper networks that are trained with random variables.
arXiv Detail & Related papers (2020-01-28T18:25:15Z)
EdgeNets:Edge Varying Graph Neural Networks [179.99395949679547]
This paper puts forth a general framework that unifies state-of-the-art graph neural networks (GNNs) through the concept of EdgeNet. An EdgeNet is a GNN architecture that allows different nodes to use different parameters to weigh the information of different neighbors. This is a general linear and local operation that a node can perform and encompasses under one formulation all existing graph convolutional neural networks (GCNNs) as well as graph attention networks (GATs)
arXiv Detail & Related papers (2020-01-21T15:51:17Z)
A Block-based Generative Model for Attributed Networks Embedding [42.00826538556588]
We propose a block-based generative model for attributed network embedding from a probability perspective. We use a neural network to characterize the nonlinearity between node embeddings and node attributes. The results show that our proposed method consistently outperforms state-of-the-art embedding methods for both clustering and classification tasks.
arXiv Detail & Related papers (2020-01-06T03:44:15Z)

This list is automatically generated from the titles and abstracts of the papers in this site.