ContraNorm: A Contrastive Learning Perspective on Oversmoothing and
Beyond
- URL: http://arxiv.org/abs/2303.06562v2
- Date: Tue, 2 May 2023 13:38:34 GMT
- Title: ContraNorm: A Contrastive Learning Perspective on Oversmoothing and
Beyond
- Authors: Xiaojun Guo, Yifei Wang, Tianqi Du, Yisen Wang
- Abstract summary: Oversmoothing is a common phenomenon in a wide range of Graph Neural Networks (GNNs) and Transformers.
We propose a novel normalization layer called ContraNorm, which implicitly shatters representations in the embedding space.
Our proposed normalization layer can be easily integrated into GNNs and Transformers with negligible parameter overhead.
- Score: 13.888935924826903
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Oversmoothing is a common phenomenon in a wide range of Graph Neural Networks
(GNNs) and Transformers, where performance worsens as the number of layers
increases. Instead of characterizing oversmoothing from the view of complete
collapse in which representations converge to a single point, we dive into a
more general perspective of dimensional collapse in which representations lie
in a narrow cone. Accordingly, inspired by the effectiveness of contrastive
learning in preventing dimensional collapse, we propose a novel normalization
layer called ContraNorm. Intuitively, ContraNorm implicitly shatters
representations in the embedding space, leading to a more uniform distribution
and less severe dimensional collapse. In our theoretical analysis, we prove that
ContraNorm can alleviate both complete collapse and dimensional collapse under
certain conditions. Our proposed normalization layer can be easily integrated
into GNNs and Transformers with negligible parameter overhead. Experiments on
various real-world datasets demonstrate the effectiveness of our proposed
ContraNorm. Our implementation is available at
https://github.com/PKU-ML/ContraNorm.
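To make the mechanism concrete, below is a minimal PyTorch sketch of a ContraNorm-style layer, written only from the description in the abstract rather than from the authors' code: a softmax similarity matrix over node/token embeddings is used to push each representation away from the weighted mean of the others (the "shattering" step), followed by a standard LayerNorm. The hyperparameter names `s` (strength of the spreading term) and `tau` (softmax temperature) are illustrative assumptions; the linked repository contains the official implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContraNormLike(nn.Module):
    """Sketch of a ContraNorm-style layer (not the official implementation).

    Each embedding is pushed away from the softmax-weighted mean of all
    embeddings, counteracting the averaging that drives oversmoothing,
    and the result is passed through LayerNorm.
    """

    def __init__(self, dim: int, s: float = 0.1, tau: float = 1.0):
        super().__init__()
        self.s = s      # strength of the spreading ("shattering") term
        self.tau = tau  # temperature of the similarity softmax
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_nodes_or_tokens, dim)
        z = F.normalize(x, dim=-1)
        sim = torch.softmax(z @ z.transpose(-1, -2) / self.tau, dim=-1)
        # Subtract the similarity-weighted mean of the other representations;
        # this is the gradient-step view of a contrastive uniformity loss.
        x = x - self.s * (sim @ x)
        return self.norm(x)
```

Such a layer is drop-in: it can replace or follow the LayerNorm in a Transformer block, or be applied after each GNN propagation step, and it adds no trainable parameters beyond those of LayerNorm.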
Related papers
- A Signed Graph Approach to Understanding and Mitigating Oversmoothing in GNNs [54.62268052283014]
We present a unified theoretical perspective based on the framework of signed graphs. We show that many existing strategies implicitly introduce negative edges that alter message passing to resist oversmoothing. We propose Structural Balanced Propagation (SBP), a plug-and-play method that assigns signed edges based on either labels or feature similarity.
arXiv Detail & Related papers (2025-02-17T03:25:36Z)
- On Vanishing Gradients, Over-Smoothing, and Over-Squashing in GNNs: Bridging Recurrent and Graph Learning [15.409865070022951]
Graph Neural Networks (GNNs) are models that leverage the graph structure to transmit information between nodes.
We show that a simple state-space formulation of a GNN effectively alleviates over-smoothing and over-squashing at no extra trainable parameter cost.
arXiv Detail & Related papers (2025-02-15T14:43:41Z)
- Residual Connections and Normalization Can Provably Prevent Oversmoothing in GNNs [30.003409099607204]
We provide a formal and precise characterization of (linearized) graph neural networks (GNNs) with residual connections and normalization layers.
We show that the centering step of a normalization layer alters the graph signal in message-passing in such a way that relevant information can become harder to extract.
We introduce a novel, principled normalization layer called GraphNormv2 in which the centering step is learned such that it does not distort the original graph signal in an undesirable way.
arXiv Detail & Related papers (2024-06-05T06:53:16Z)
- Alignment and Outer Shell Isotropy for Hyperbolic Graph Contrastive Learning [69.6810940330906]
We propose a novel contrastive learning framework to learn high-quality graph embeddings.
Specifically, we design the alignment metric that effectively captures the hierarchical data-invariant information.
We show that in the hyperbolic space one has to address the leaf- and height-level uniformity which are related to properties of trees.
arXiv Detail & Related papers (2023-10-27T15:31:42Z)
- OrthoReg: Improving Graph-regularized MLPs via Orthogonality Regularization [66.30021126251725]
Graph Neural Networks (GNNs) are currently the dominant approach for modeling graph-structured data.
Graph-regularized MLPs (GR-MLPs) implicitly inject graph structure information into model weights, but their performance can hardly match that of GNNs on most tasks.
We show that GR-MLPs suffer from dimensional collapse, a phenomenon in which a few of the largest eigenvalues dominate the embedding space (see the diagnostic sketch after this list).
We propose OrthoReg, a novel GR-MLP model to mitigate the dimensional collapse issue.
arXiv Detail & Related papers (2023-01-31T21:20:48Z)
- What Does the Gradient Tell When Attacking the Graph Structure [44.44204591087092]
We present a theoretical demonstration revealing that attackers tend to increase inter-class edges due to the message passing mechanism of GNNs.
By connecting dissimilar nodes, attackers can more effectively corrupt node features, making such attacks more advantageous.
We propose an innovative attack loss that balances attack effectiveness and imperceptibility, sacrificing some attack effectiveness to attain greater imperceptibility.
arXiv Detail & Related papers (2022-08-26T15:45:20Z)
- Mixed Graph Contrastive Network for Semi-Supervised Node Classification [63.924129159538076]
We propose a novel graph contrastive learning method, termed Mixed Graph Contrastive Network (MGCN).
In our method, we improve the discriminative capability of the latent embeddings by an unperturbed augmentation strategy and a correlation reduction mechanism.
By combining the two settings, we extract rich supervision information from both the abundant nodes and the rare yet valuable labeled nodes for discriminative representation learning.
arXiv Detail & Related papers (2022-06-06T14:26:34Z)
- Revisiting Over-smoothing in BERT from the Perspective of Graph [111.24636158179908]
Recently, the over-smoothing phenomenon of Transformer-based models has been observed in both vision and language fields.
We find that layer normalization plays a key role in the over-smoothing issue of Transformer-based models.
We consider hierarchical fusion strategies, which combine the representations from different layers adaptively to make the output more diverse.
arXiv Detail & Related papers (2022-02-17T12:20:52Z)
- SkipNode: On Alleviating Performance Degradation for Deep Graph Convolutional Networks [84.30721808557871]
We conduct theoretical and experimental analysis to explore the fundamental causes of performance degradation in deep GCNs.
We propose a simple yet effective plug-and-play module, SkipNode, to overcome the performance degradation of deep GCNs.
arXiv Detail & Related papers (2021-12-22T02:18:31Z)
- The Equilibrium Hypothesis: Rethinking implicit regularization in Deep Neural Networks [1.7188280334580197]
Modern Deep Neural Networks (DNNs) exhibit impressive generalization properties on a variety of tasks without explicit regularization.
Recent work by Baratin et al. (2021) sheds light on an intriguing implicit regularization effect, showing that some layers are much more aligned with data labels than other layers.
This suggests that as the network grows in depth and width, an implicit layer selection phenomenon occurs during training.
arXiv Detail & Related papers (2021-10-22T12:49:31Z)
- Understanding Dimensional Collapse in Contrastive Self-supervised Learning [57.98014222570084]
We show that non-contrastive methods suffer from a lesser collapse problem of a different nature: dimensional collapse.
Inspired by our theory, we propose a novel contrastive learning method, called DirectCLR, which directly optimizes the representation space without relying on a trainable projector.
arXiv Detail & Related papers (2021-10-18T14:22:19Z)
- Discriminator-Free Generative Adversarial Attack [87.71852388383242]
Generative-based adversarial attacks can get rid of this limitation.
A Symmetric Saliency-based Auto-Encoder (SSAE) generates the perturbations.
The adversarial examples generated by SSAE not only make the widely-used models collapse, but also achieve good visual quality.
arXiv Detail & Related papers (2021-07-20T01:55:21Z)
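Since dimensional collapse recurs throughout this list (ContraNorm, OrthoReg, DirectCLR), a simple way to check for it is to inspect the singular-value spectrum of an embedding matrix. The sketch below uses the effective rank computed from the spectral entropy, which is one conventional diagnostic rather than a metric prescribed by any of the papers above.

```python
import torch

def effective_rank(embeddings: torch.Tensor) -> float:
    """Effective rank via the entropy of the normalized singular values.

    Values far below min(num_samples, dim) indicate dimensional collapse:
    the representations span only a narrow cone / low-dimensional subspace.
    """
    x = embeddings - embeddings.mean(dim=0, keepdim=True)  # center features
    s = torch.linalg.svdvals(x)                            # singular values
    p = s / s.sum()                                         # spectrum as a distribution
    entropy = -(p * torch.log(p + 1e-12)).sum()
    return torch.exp(entropy).item()

# Example: rank-2 embeddings yield an effective rank near 2.
healthy = torch.randn(1000, 128)
collapsed = torch.randn(1000, 2) @ torch.randn(2, 128)
print(effective_rank(healthy), effective_rank(collapsed))
```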