A Unified and Biologically-Plausible Relational Graph Representation of
Vision Transformers
- URL: http://arxiv.org/abs/2206.11073v1
- Date: Fri, 20 May 2022 05:53:23 GMT
- Title: A Unified and Biologically-Plausible Relational Graph Representation of
Vision Transformers
- Authors: Yuzhong Chen, Yu Du, Zhenxiang Xiao, Lin Zhao, Lu Zhang, David
Weizhong Liu, Dajiang Zhu, Tuo Zhang, Xintao Hu, Tianming Liu, Xi Jiang
- Abstract summary: Vision transformer (ViT) and its variants have achieved remarkable successes in various visual tasks.
We propose a unified and biologically-plausible relational graph representation of ViT models.
Our work provides a novel unified and biologically-plausible paradigm for more interpretable and effective representation of ViT ANNs.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Vision transformer (ViT) and its variants have achieved remarkable successes
in various visual tasks. The key characteristic of these ViT models is to adopt
different aggregation strategies of spatial patch information within the
artificial neural networks (ANNs). However, a unified representation of
different ViT architectures is still lacking, hindering systematic
understanding and assessment of model representation performance. Moreover, how
those well-performing ViT ANNs are similar to real biological neural networks
(BNNs) is largely unexplored. To answer these fundamental questions, we, for
the first time, propose a unified and biologically-plausible relational graph
representation of ViT models. Specifically, the proposed relational graph
representation consists of two key sub-graphs: aggregation graph and affine
graph. The former considers ViT tokens as nodes and describes their spatial
interaction, while the latter regards network channels as nodes and
reflects the information communication between channels. Using this unified
relational graph representation, we found that: a) a sweet spot of the
aggregation graph leads to ViTs with significantly improved predictive
performance; b) the graph measures of clustering coefficient and average path
length are two effective indicators of model prediction performance, especially
when applied to datasets with small sample sizes; c) our findings are
consistent across various ViT architectures and multiple datasets; d) the
proposed relational graph representation of ViT has high similarity with real
BNNs derived from brain science data. Overall, our work provides a novel
unified and biologically-plausible paradigm for more interpretable and
effective representation of ViT ANNs.
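The two graph measures the abstract highlights as performance indicators, clustering coefficient and average path length, can be illustrated with a short sketch. The toy "aggregation graph" below is hypothetical (nodes standing in for ViT tokens, edges for spatial interaction); it does not reproduce the paper's actual graph-construction procedure:

```python
# A minimal sketch (not the paper's code) of the two graph measures the
# abstract reports as effective performance indicators: clustering
# coefficient and average path length. The toy "aggregation graph" below is
# hypothetical: nodes stand in for ViT tokens, edges for spatial interaction.
from collections import deque
from itertools import combinations

# Hypothetical 8-token ring with two shortcut edges (small-world-style).
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 0),
         (0, 4), (0, 2)]
adj = {v: set() for v in range(8)}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

def clustering_coefficient(adj):
    """Average over nodes of: closed neighbor pairs / possible neighbor pairs."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        closed = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * closed / (k * (k - 1))
    return total / len(adj)

def average_path_length(adj):
    """Mean BFS shortest-path length over all ordered node pairs."""
    total = pairs = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

print(clustering_coefficient(adj))  # 0.1875
print(average_path_length(adj))     # ~1.857
```

High clustering with short average paths characterizes small-world graphs, which is the regime the paper's "sweet spot" finding (point a) and its comparison with biological neural networks both concern.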
Related papers
- Scalable Weibull Graph Attention Autoencoder for Modeling Document Networks
We develop a graph Poisson factor analysis (GPFA) which provides analytic conditional posteriors to improve the inference accuracy.
We also extend GPFA to a multi-stochastic-layer version named graph Poisson gamma belief network (GPGBN) to capture the hierarchical document relationships at multiple semantic levels.
Our models can extract high-quality hierarchical latent document representations and achieve promising performance on various graph analytic tasks.
arXiv Detail & Related papers (2024-10-13T02:22:14Z)
- BHGNN-RT: Network embedding for directed heterogeneous graphs
We propose an embedding method, a bidirectional heterogeneous graph neural network with random teleport (BHGNN-RT), for directed heterogeneous graphs.
Extensive experiments on various datasets were conducted to verify the efficacy and efficiency of BHGNN-RT.
BHGNN-RT achieves state-of-the-art performance, outperforming the benchmark methods in both node classification and unsupervised clustering tasks.
arXiv Detail & Related papers (2023-11-24T10:56:09Z)
- DURENDAL: Graph deep learning framework for temporal heterogeneous networks
Temporal heterogeneous networks (THNs) are evolving networks that characterize many real-world applications.
We propose DURENDAL, a graph deep learning framework for THNs.
arXiv Detail & Related papers (2023-09-30T10:46:01Z)
- MTS2Graph: Interpretable Multivariate Time Series Classification with Temporal Evolving Graphs
We introduce a new framework for interpreting time series data by extracting and clustering the input representative patterns.
We run experiments on eight datasets of the UCR/UEA archive, along with HAR and PAM datasets.
arXiv Detail & Related papers (2023-06-06T16:24:27Z)
- Dynamic Graph Message Passing Networks for Visual Recognition
Modelling long-range dependencies is critical for scene understanding tasks in computer vision.
A fully-connected graph is beneficial for such modelling, but its computational overhead is prohibitive.
We propose a dynamic graph message passing network, that significantly reduces the computational complexity.
arXiv Detail & Related papers (2022-09-20T14:41:37Z)
- TCL: Transformer-based Dynamic Graph Modelling via Contrastive Learning
We propose a novel graph neural network approach, called TCL, which deals with the dynamically-evolving graph in a continuous-time fashion.
To the best of our knowledge, this is the first attempt to apply contrastive learning to representation learning on dynamic graphs.
arXiv Detail & Related papers (2021-05-17T15:33:25Z)
- Vision Transformers are Robust Learners
We study the robustness of the Vision Transformer (ViT) against common corruptions and perturbations, distribution shifts, and natural adversarial examples.
We present analyses that provide both quantitative and qualitative indications to explain why ViTs are indeed more robust learners.
arXiv Detail & Related papers (2021-05-17T02:39:22Z)
- Towards Deeper Graph Neural Networks
Graph convolutions perform neighborhood aggregation and represent one of the most important graph operations.
Several recent studies attribute the performance deterioration of deeper graph neural networks to the over-smoothing issue.
We propose Deep Adaptive Graph Neural Network (DAGNN) to adaptively incorporate information from large receptive fields.
arXiv Detail & Related papers (2020-07-18T01:11:14Z)
- Tensor Graph Convolutional Networks for Multi-relational and Robust Learning
This paper introduces a tensor-graph convolutional network (TGCN) for scalable semi-supervised learning (SSL) from data associated with a collection of graphs, that are represented by a tensor.
The proposed architecture achieves markedly improved performance relative to standard GCNs, copes with state-of-the-art adversarial attacks, and leads to remarkable SSL performance over protein-to-protein interaction networks.
arXiv Detail & Related papers (2020-03-15T02:33:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.