Training Lightweight Graph Convolutional Networks with Phase-field Models
- URL: http://arxiv.org/abs/2212.09415v1
- Date: Mon, 19 Dec 2022 12:49:03 GMT
- Title: Training Lightweight Graph Convolutional Networks with Phase-field Models
- Authors: Hichem Sahbi
- Abstract summary: We design lightweight graph convolutional networks (GCNs) using a particular class of regularizers, dubbed phase-field models (PFMs)
PFMs exhibit a bi-phase behavior via a particular ultra-local term that allows training both the topology and the weight parameters of GCNs as part of a single "end-to-end" optimization problem.
- Score: 12.18340575383456
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we design lightweight graph convolutional networks (GCNs)
using a particular class of regularizers, dubbed phase-field models (PFMs).
PFMs exhibit a bi-phase behavior via a particular ultra-local term that
allows training both the topology and the weight parameters of GCNs as part
of a single "end-to-end" optimization problem. Our proposed solution also
relies on a reparametrization that pushes the mask of the topology towards
binary values, leading to effective topology selection and high generalization
while implementing any targeted pruning rate. Both masks and weights share the
same set of latent variables, which further enhances the generalization power
of the resulting lightweight GCNs. Extensive experiments conducted on the
challenging task of skeleton-based recognition show that PFMs outperform other
staple regularizers as well as related lightweight design methods.
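As a rough illustration of the idea (assumed reparametrization and penalty; this is not the authors' implementation), a masked layer could read both its weight and its topology mask from one shared latent tensor and penalize the mask with a double-well term:
```python
# Illustrative sketch only: weight and topology mask share one latent tensor,
# and an ultra-local double-well (phase-field) penalty drives the mask to {0, 1}.
import torch
import torch.nn as nn

class PhaseFieldMaskedLinear(nn.Module):
    def __init__(self, in_dim, out_dim, temperature=10.0):
        super().__init__()
        self.latent = nn.Parameter(0.01 * torch.randn(out_dim, in_dim))
        self.temperature = temperature

    def mask(self):
        # Assumed reparametrization squashing the shared latent into (0, 1);
        # a steep sigmoid of the latent magnitude stands in for the paper's mapping.
        a = self.latent.abs()
        return torch.sigmoid(self.temperature * (a - a.median()))

    def forward(self, x):
        # Effective (pruned) weight = mask * shared latent.
        return x @ (self.mask() * self.latent).t()

def phase_field_penalty(mask, lam=1.0):
    # Bi-phase ultra-local term m^2 (1 - m)^2: minimized at m = 0 and m = 1.
    return lam * (mask ** 2 * (1.0 - mask) ** 2).sum()
```
During training, the penalty of every layer would be added to the task loss so that topology and weights are optimized end-to-end and the mask ends up close to binary; hitting an arbitrary targeted pruning rate would require an extra budget constraint, omitted in this sketch.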
Related papers
- Pushing the Limits of Large Language Model Quantization via the Linearity Theorem [71.3332971315821]
We present a "linearity theorem" establishing a direct relationship between the layer-wise $\ell_2$ reconstruction error and the model perplexity increase due to quantization.
This insight enables two novel applications: (1) a simple data-free LLM quantization method using Hadamard rotations and MSE-optimal grids, dubbed HIGGS, and (2) an optimal solution to the problem of finding non-uniform per-layer quantization levels.
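As a toy sketch of that recipe (a crude uniform grid stands in for the MSE-optimal grid, and the last weight dimension is assumed to be a power of two):
```python
# Toy data-free quantization sketch: rotate weights with an orthonormal Hadamard
# matrix to spread outliers, round to a uniform grid, then undo the rotation.
import numpy as np
from scipy.linalg import hadamard

def hadamard_quantize(w, bits=4):
    n = w.shape[-1]                        # assumed to be a power of two
    H = hadamard(n) / np.sqrt(n)           # orthonormal rotation
    w_rot = w @ H
    levels = 2 ** bits
    step = 8.0 * w_rot.std() / levels      # uniform grid over roughly +/- 4 sigma
    q = np.clip(np.round(w_rot / step), -(levels // 2), levels // 2 - 1)
    return (q * step) @ H.T                # dequantize and rotate back
```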
arXiv Detail & Related papers (2024-11-26T15:35:44Z)
- HG-Adapter: Improving Pre-Trained Heterogeneous Graph Neural Networks with Dual Adapters [53.97380482341493]
"pre-train, prompt-tuning" has demonstrated impressive performance for tuning pre-trained heterogeneous graph neural networks (HGNNs)
We propose a unified framework that combines two new adapters with potential labeled data extension to improve the generalization of pre-trained HGNN models.
arXiv Detail & Related papers (2024-11-02T06:43:54Z)
- QT-DoG: Quantization-aware Training for Domain Generalization [58.439816306817306]
We propose Quantization-aware Training for Domain Generalization (QT-DoG)
QT-DoG exploits quantization as an implicit regularizer by inducing noise in model weights.
We demonstrate that QT-DoG generalizes across various datasets, architectures, and quantization algorithms.
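The implicit-regularization view can be pictured with a generic straight-through weight quantizer (an illustrative QAT step, not the QT-DoG code):
```python
# Generic quantization-aware training step: the rounding error injected in the
# forward pass acts as noise on the weights, i.e. an implicit regularizer, while
# the straight-through estimator lets gradients reach the full-precision weights.
import torch

def quantize_ste(w, bits=8):
    qmax = 2 ** (bits - 1) - 1
    scale = w.detach().abs().max() / qmax + 1e-12
    q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale
    return w + (q - w).detach()   # forward: quantized values; backward: identity
```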
arXiv Detail & Related papers (2024-10-08T13:21:48Z)
- One-Shot Multi-Rate Pruning of Graph Convolutional Networks [5.656581242851759]
We devise a novel lightweight Graph Convolutional Network (GCN) design dubbed Multi-Rate Magnitude Pruning (MRMP)
Our method is variational and proceeds by aligning the weight distribution of the learned networks with an a priori distribution.
On the other hand, MRMP jointly trains multiple GCNs on top of shared weights in order to extrapolate accurate networks at any targeted pruning rate without retraining their weights.
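A minimal sketch of the multi-rate part of that idea (the variational alignment with the prior is omitted): masks at several target rates are carved out of one shared weight tensor.
```python
# Illustrative sketch: pruning masks for several target rates derived from the
# same shared weights by thresholding magnitudes at different quantiles.
import torch

def multi_rate_masks(weight, rates=(0.5, 0.7, 0.9)):
    flat = weight.abs().flatten()
    masks = {}
    for r in rates:
        thresh = torch.quantile(flat, r)              # drop the r smallest fraction
        masks[r] = (weight.abs() >= thresh).float()   # 1 = kept, 0 = pruned
    return masks
```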
arXiv Detail & Related papers (2023-12-29T14:20:00Z)
- Budget-Aware Graph Convolutional Network Design using Probabilistic Magnitude Pruning [12.18340575383456]
We devise a novel lightweight Graph Convolutional Network (GCN) design dubbed Probabilistic Magnitude Pruning (PMP)
Our method is variational and proceeds by aligning the weight distribution of the learned networks with an a priori distribution.
Experiments conducted on the challenging task of skeleton-based recognition show a substantial gain for our lightweight GCNs.
arXiv Detail & Related papers (2023-05-30T18:12:13Z)
- Sparse Weight Averaging with Multiple Particles for Iterative Magnitude Pruning [16.869553861212548]
Iterative Magnitude Pruning (IMP) still stands as a state-of-the-art algorithm despite its simple nature, particularly in extremely sparse regimes.
We propose Sparse Weight Averaging with Multiple Particles (SWAMP), a straightforward modification of IMP that achieves performance comparable to an ensemble of two IMP solutions.
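A rough outline of one SWAMP round as suggested by the summary (train_one_particle and magnitude_prune are hypothetical helpers, not the authors' API):
```python
# Hedged outline: several IMP "particles" are trained from a common starting
# point, their weights are averaged, and the average is magnitude-pruned before
# the next sparsity level (details such as shared batch orders are omitted).
def swamp_round(init_state, prune_rate, n_particles, train_one_particle, magnitude_prune):
    particles = [train_one_particle(init_state) for _ in range(n_particles)]
    averaged = {k: sum(p[k] for p in particles) / n_particles for k in init_state}
    return magnitude_prune(averaged, prune_rate)
```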
arXiv Detail & Related papers (2023-05-24T08:01:49Z)
- Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution [91.3781512926942]
Image super-resolution (SR) has witnessed extensive neural network designs from CNN to transformer architectures.
This work investigates the potential of network pruning for super-resolution to take advantage of off-the-shelf network designs and reduce the underlying computational overhead.
We propose a novel Iterative Soft Shrinkage-Percentage (ISS-P) method by optimizing the sparse structure of a randomly initialized network at each iteration and tweaking unimportant weights with a small amount proportional to the magnitude scale on-the-fly.
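The soft-shrinkage step can be pictured as follows (an assumed schedule, not the paper's exact percentage rule):
```python
# Illustrative soft-shrinkage step: weights selected for pruning are scaled down
# slightly instead of being zeroed, so they may still recover in later iterations.
import torch

@torch.no_grad()
def soft_shrink_step(weight, prune_fraction=0.3, shrink=0.99):
    thresh = torch.quantile(weight.abs().flatten(), prune_fraction)
    small = weight.abs() < thresh
    weight[small] *= shrink
    return weight
```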
arXiv Detail & Related papers (2023-03-16T21:06:13Z)
- Towards Lightweight Cross-domain Sequential Recommendation via External Attention-enhanced Graph Convolution Network [7.1102362215550725]
Cross-domain Sequential Recommendation (CSR) depicts the evolution of behavior patterns for overlapped users by modeling their interactions from multiple domains.
We introduce a lightweight external attention-enhanced GCN-based framework to solve the above challenges, namely LEA-GCN.
To further lighten the framework structure and aggregate the user-specific sequential pattern, we devise a novel dual-channel External Attention (EA) component.
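For context, a plain external attention unit (two small learnable memories instead of pairwise self-attention) looks roughly like the sketch below; LEA-GCN's dual-channel variant differs in its details.
```python
# Simplified external attention: correlations are computed against small learnable
# key/value memories, which is linear in sequence length (illustrative only).
import torch
import torch.nn as nn

class ExternalAttention(nn.Module):
    def __init__(self, dim, mem_size=64):
        super().__init__()
        self.mk = nn.Linear(dim, mem_size, bias=False)  # external key memory
        self.mv = nn.Linear(mem_size, dim, bias=False)  # external value memory

    def forward(self, x):                               # x: (batch, tokens, dim)
        attn = torch.softmax(self.mk(x), dim=1)         # normalize over tokens
        attn = attn / (attn.sum(dim=-1, keepdim=True) + 1e-9)  # then over memory
        return self.mv(attn)
```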
arXiv Detail & Related papers (2023-02-07T03:06:29Z)
- Orthogonal Stochastic Configuration Networks with Adaptive Construction Parameter for Data Analytics [6.940097162264939]
Randomness makes stochastic configuration networks (SCNs) more likely to generate approximately linearly correlated nodes that are redundant and of low quality.
In light of a fundamental principle in machine learning, namely that a model with fewer parameters generalizes better, this paper proposes orthogonal SCNs, termed OSCN, to filter out low-quality hidden nodes for network structure reduction.
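One way to picture the filtering criterion (my reading, not the OSCN algorithm itself): a candidate hidden node is kept only if its output is not close to a linear combination of the outputs of nodes already in the network.
```python
# Illustrative redundancy check for an incrementally built network.
import numpy as np

def is_informative(H, h_new, tol=1e-2):
    """H: (n_samples, n_nodes) outputs of existing nodes; h_new: (n_samples,) candidate."""
    if H.shape[1] == 0:
        return True
    coef, *_ = np.linalg.lstsq(H, h_new, rcond=None)    # project onto span(H)
    residual = h_new - H @ coef
    # A large relative residual means the candidate brings a genuinely new direction.
    return np.linalg.norm(residual) / (np.linalg.norm(h_new) + 1e-12) > tol
```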
arXiv Detail & Related papers (2022-05-26T07:07:26Z)
- Extended Unconstrained Features Model for Exploring Deep Neural Collapse [59.59039125375527]
Recently, a phenomenon termed "neural collapse" (NC) has been empirically observed in deep neural networks.
Recent papers have shown that minimizers with this structure emerge when optimizing a simplified "unconstrained features model" (UFM).
In this paper, we study the UFM for the regularized MSE loss, and show that the minimizers' features can be more structured than in the cross-entropy case.
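For reference, the regularized MSE objective in the unconstrained features model is typically of the following form, with free features H, classifier W, bias b, and one-hot targets Y over N samples (the exact weighting in the paper may differ):
```latex
\min_{W,\,H,\,b}\;
\frac{1}{2N}\bigl\|\,W H + b\,\mathbf{1}^{\top} - Y\,\bigr\|_F^2
\;+\;\frac{\lambda_W}{2}\,\|W\|_F^2
\;+\;\frac{\lambda_H}{2}\,\|H\|_F^2
\;+\;\frac{\lambda_b}{2}\,\|b\|_2^2
```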
arXiv Detail & Related papers (2022-02-16T14:17:37Z)
- Improve Generalization and Robustness of Neural Networks via Weight Scale Shifting Invariant Regularizations [52.493315075385325]
We show that a family of regularizers, including weight decay, is ineffective at penalizing the intrinsic norms of weights for networks with homogeneous activation functions.
We propose an improved regularizer that is invariant to weight scale shifting and thus effectively constrains the intrinsic norm of a neural network.
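A two-layer ReLU example makes the scale-shifting point concrete (notation mine, not the paper's): rescaling adjacent layers leaves the function unchanged, weight decay is minimized by balancing the scales, and only the product of layer norms survives, which is the scale-shift-invariant quantity:
```latex
f(x) = W_2\,\sigma(W_1 x) = (c^{-1}W_2)\,\sigma(c\,W_1 x)\quad\forall\, c>0,
\qquad
\min_{c>0}\;\bigl(\|c\,W_1\|_F^2 + \|c^{-1}W_2\|_F^2\bigr) = 2\,\|W_1\|_F\,\|W_2\|_F .
```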
arXiv Detail & Related papers (2020-08-07T02:55:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.