Massive Activations in Graph Neural Networks: Decoding Attention for Domain-Dependent Interpretability
- URL: http://arxiv.org/abs/2409.03463v3
- Date: Fri, 07 Mar 2025 15:17:02 GMT
- Title: Massive Activations in Graph Neural Networks: Decoding Attention for Domain-Dependent Interpretability
- Authors: Lorenzo Bini, Marco Sorbi, Stephane Marchand-Maillet
- Abstract summary: We show the emergence of Massive Activations (MAs) within attention layers in edge-featured Graph Neural Networks (GNNs). Our study assesses various edge-featured attention-based GNN models using benchmark datasets, including ZINC, TOX21, and PROTEINS.
- Score: 0.9499648210774584
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Graph Neural Networks (GNNs) have become increasingly popular for effectively modeling graph-structured data, and attention mechanisms have been pivotal in enabling these models to capture complex patterns. In our study, we reveal a critical yet underexplored consequence of integrating attention into edge-featured GNNs: the emergence of Massive Activations (MAs) within attention layers. By developing a novel method for detecting MAs on edge features, we show that these extreme activations are not merely anomalies but encode domain-relevant signals. Our post-hoc interpretability analysis demonstrates that, in molecular graphs, MAs aggregate predominantly on common bond types (e.g., single and double bonds) while sparing more informative ones (e.g., triple bonds). Furthermore, our ablation studies confirm that MAs can serve as natural attribution indicators, reallocating to less informative edges. Our study assesses various edge-featured attention-based GNN models using benchmark datasets, including ZINC, TOX21, and PROTEINS. Key contributions include (1) establishing the direct link between attention mechanisms and MA generation in edge-featured GNNs, and (2) developing a robust definition and detection method for MAs, enabling reliable post-hoc interpretability. Overall, our study reveals the complex interplay between attention mechanisms, edge-featured GNN models, and MA emergence, providing crucial insights for relating GNN internals to domain knowledge.
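The abstract mentions a detection method for MAs on edge features without spelling it out. Below is a minimal sketch of what such a detector might look like, assuming a magnitude-ratio criterion similar to the one used for massive activations in transformer LLMs: an entry counts as "massive" when it is large both in absolute terms and relative to the tensor's median magnitude. The thresholds, tensor shapes, and the per-bond-type aggregation helper are illustrative assumptions, not the authors' exact procedure.

```python
import torch

def detect_massive_activations(edge_feats: torch.Tensor,
                               abs_threshold: float = 10.0,
                               ratio_threshold: float = 100.0):
    """Flag extreme entries in an edge-feature activation tensor.

    edge_feats: (num_edges, hidden_dim) activations captured after an
    attention layer, e.g. via a forward hook. An entry is flagged as a
    Massive Activation (MA) when its magnitude is large both in absolute
    terms and relative to the median magnitude of the whole tensor.
    The thresholds are illustrative assumptions, not the paper's values.
    """
    mags = edge_feats.abs()
    median = mags.median().clamp(min=1e-12)  # guard against all-zero input
    mask = (mags > abs_threshold) & (mags > ratio_threshold * median)
    edge_idx, dim_idx = mask.nonzero(as_tuple=True)
    return edge_idx, dim_idx, mags[edge_idx, dim_idx]

def mas_per_bond_type(edge_idx: torch.Tensor, bond_types: torch.Tensor) -> torch.Tensor:
    """Count how many detected MAs fall on each integer-coded bond type,
    mirroring the post-hoc aggregation described in the abstract."""
    return torch.bincount(bond_types[edge_idx])
```

In practice, `edge_feats` would be captured with a `register_forward_hook` on the attention layer of interest; comparing the per-bond-type counts would then reproduce the kind of single/double versus triple bond contrast reported in the abstract.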
Related papers
- Hallucination Detection in LLMs via Topological Divergence on Attention Graphs [64.74977204942199]
Hallucination, i.e., generating factually incorrect content, remains a critical challenge for large language models.
We introduce TOHA, a TOpology-based HAllucination detector in the RAG setting.
arXiv Detail & Related papers (2025-04-14T10:06:27Z)
- Overlap-aware meta-learning attention to enhance hypergraph neural networks for node classification [7.822666400307049]
We propose a novel framework, overlap-aware meta-learning attention for hypergraph neural networks (OMA-HGNN).
First, we introduce a hypergraph attention mechanism that integrates both structural and feature similarities. Specifically, we linearly combine their respective losses with weighted factors for the HGNN model.
Second, we partition nodes into different tasks based on their diverse overlap levels and develop a multi-task Meta-Weight-Net (MWN) to determine the corresponding weighted factors.
Third, we jointly train the internal MWN model with the losses from the external HGNN model and train the external model with the weighted factors from the internal model.
arXiv Detail & Related papers (2025-03-11T01:38:39Z)
- Faithful and Accurate Self-Attention Attribution for Message Passing Neural Networks via the Computation Tree Viewpoint [11.459893079664578]
We propose GATT, an edge attribution calculation method for self-attention MPNNs based on the computation tree.
Despite its simplicity, we empirically demonstrate the effectiveness of GATT in three aspects of model explanation.
arXiv Detail & Related papers (2024-06-07T03:40:15Z)
- Node Classification via Semantic-Structural Attention-Enhanced Graph Convolutional Networks [0.9463895540925061]
We introduce the semantic-structural attention-enhanced graph convolutional network (SSA-GCN).
It not only models the graph structure but also extracts generalized unsupervised features to enhance classification performance.
Our experiments on the Cora and CiteSeer datasets demonstrate the performance improvements achieved by our proposed method.
arXiv Detail & Related papers (2024-03-24T06:28:54Z)
- Investigating Out-of-Distribution Generalization of GNNs: An Architecture Perspective [45.352741792795186]
We show that the graph self-attention mechanism and the decoupled architecture contribute positively to graph OOD generalization.
We develop a novel GNN backbone model, DGAT, designed to harness the robust properties of both the graph self-attention mechanism and the decoupled architecture.
arXiv Detail & Related papers (2024-02-13T05:38:45Z)
- HGAttack: Transferable Heterogeneous Graph Adversarial Attack [63.35560741500611]
Heterogeneous Graph Neural Networks (HGNNs) are increasingly recognized for their performance in areas like the web and e-commerce.
This paper introduces HGAttack, the first dedicated gray-box evasion attack method for heterogeneous graphs.
arXiv Detail & Related papers (2024-01-18T12:47:13Z)
- Enhanced LFTSformer: A Novel Long-Term Financial Time Series Prediction Model Using Advanced Feature Engineering and the DS Encoder Informer Architecture [0.8532753451809455]
This study presents a groundbreaking model for forecasting long-term financial time series, termed the Enhanced LFTSformer.
The model distinguishes itself through several significant innovations.
Systematic experimentation on a range of benchmark stock market datasets demonstrates that the Enhanced LFTSformer outperforms traditional machine learning models.
arXiv Detail & Related papers (2023-10-03T08:37:21Z)
- Causally-guided Regularization of Graph Attention Improves Generalizability [69.09877209676266]
We introduce CAR, a general-purpose regularization framework for graph attention networks.
CAR aligns the attention mechanism with the causal effects of active interventions on graph connectivity.
For graphs the size of social media networks, a CAR-guided graph rewiring approach could allow us to combine the scalability of graph convolutional methods with the higher performance of graph attention.
arXiv Detail & Related papers (2022-10-20T01:29:10Z)
- MentorGNN: Deriving Curriculum for Pre-Training GNNs [61.97574489259085]
We propose an end-to-end model named MentorGNN that aims to supervise the pre-training process of GNNs across graphs.
We shed new light on the problem of domain adaptation on relational data (i.e., graphs) by deriving a natural and interpretable upper bound on the generalization error of the pre-trained GNNs.
arXiv Detail & Related papers (2022-08-21T15:12:08Z)
- Beyond the Gates of Euclidean Space: Temporal-Discrimination-Fusions and Attention-based Graph Neural Network for Human Activity Recognition [5.600003119721707]
Human activity recognition (HAR) through wearable devices has received much interest due to its numerous applications in fitness tracking, wellness screening, and supported living.
Traditional deep learning (DL) has set the state-of-the-art performance for the HAR domain.
We propose an approach based on Graph Neural Networks (GNNs) for structuring the input representation and exploiting the relations among the samples.
arXiv Detail & Related papers (2022-06-10T03:04:23Z)
- Heterogeneous Graph Neural Networks using Self-supervised Reciprocally Contrastive Learning [102.9138736545956]
Heterogeneous graph neural network (HGNN) is a very popular technique for the modeling and analysis of heterogeneous graphs.
We develop for the first time a novel and robust heterogeneous graph contrastive learning approach, namely HGCL, which introduces two views guided respectively by node attributes and graph topologies.
In this new approach, we adopt distinct but most suitable attribute and topology fusion mechanisms in the two views, which are conducive to mining relevant information in attributes and topologies separately.
arXiv Detail & Related papers (2022-04-30T12:57:02Z)
- EXPERT: Public Benchmarks for Dynamic Heterogeneous Academic Graphs [5.4744970832051445]
We present a variety of large-scale, dynamic heterogeneous academic graphs to test the effectiveness of models developed for graph forecasting tasks.
Our novel datasets cover both context and content information extracted from scientific publications across two communities: Artificial Intelligence (AI) and Nuclear Nonproliferation (NN).
arXiv Detail & Related papers (2022-04-14T19:43:34Z)
- How Knowledge Graph and Attention Help? A Quantitative Analysis into Bag-level Relation Extraction [66.09605613944201]
We quantitatively evaluate the effect of attention and Knowledge Graph on bag-level relation extraction (RE).
We find that (1) higher attention accuracy may lead to worse performance as it may harm the model's ability to extract entity mention features; (2) the performance of attention is largely influenced by various noise distribution patterns; and (3) KG-enhanced attention indeed improves RE performance, while not through enhanced attention but by incorporating entity prior.
arXiv Detail & Related papers (2021-07-26T09:38:28Z)
- Distance-aware Molecule Graph Attention Network for Drug-Target Binding Affinity Prediction [54.93890176891602]
We propose a diStance-aware Molecule graph Attention Network (S-MAN) tailored to drug-target binding affinity prediction.
As a dedicated solution, we first propose a position encoding mechanism to integrate the topological structure and spatial position information into the constructed pocket-ligand graph.
We also propose a novel edge-node hierarchical attentive aggregation structure which has edge-level aggregation and node-level aggregation.
arXiv Detail & Related papers (2020-12-17T17:44:01Z)
- Hierarchical Message-Passing Graph Neural Networks [12.207978823927386]
We propose a novel Hierarchical Message-passing Graph Neural Networks framework.
The key idea is to generate a hierarchical structure that re-organises all nodes in a flat graph into multi-level super graphs.
We present the first model to implement this framework, termed Hierarchical Community-aware Graph Neural Network (HC-GNN).
arXiv Detail & Related papers (2020-09-08T13:11:07Z)
- Deep brain state classification of MEG data [2.9048924265579124]
This paper uses Magnetoencephalography (MEG) data, provided by the Human Connectome Project (HCP), in combination with various deep artificial neural network models to perform brain decoding.
arXiv Detail & Related papers (2020-07-02T05:51:57Z)
- Graph Backdoor [53.70971502299977]
We present GTA, the first backdoor attack on graph neural networks (GNNs).
GTA departs from prior backdoor attacks in significant ways: it defines triggers as specific subgraphs, including both topological structures and descriptive features.
It can be instantiated for both transductive (e.g., node classification) and inductive (e.g., graph classification) tasks.
arXiv Detail & Related papers (2020-06-21T19:45:30Z)
- Multi-View Graph Neural Networks for Molecular Property Prediction [67.54644592806876]
We present Multi-View Graph Neural Network (MV-GNN), a multi-view message passing architecture.
In MV-GNN, we introduce a shared self-attentive readout component and disagreement loss to stabilize the training process.
We further boost the expressive power of MV-GNN by proposing a cross-dependent message passing scheme.
arXiv Detail & Related papers (2020-05-17T04:46:07Z)
- Graph Representation Learning via Graphical Mutual Information Maximization [86.32278001019854]
We propose a novel concept, Graphical Mutual Information (GMI), to measure the correlation between input graphs and high-level hidden representations.
We develop an unsupervised learning model trained by maximizing GMI between the input and output of a graph neural encoder.
arXiv Detail & Related papers (2020-02-04T08:33:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.