Gramformer: Learning Crowd Counting via Graph-Modulated Transformer
- URL: http://arxiv.org/abs/2401.03870v1
- Date: Mon, 8 Jan 2024 13:01:54 GMT
- Title: Gramformer: Learning Crowd Counting via Graph-Modulated Transformer
- Authors: Hui Lin and Zhiheng Ma and Xiaopeng Hong and Qinnan Shangguan and Deyu Meng
- Abstract summary: Gramformer is a graph-modulated transformer that enhances the network by adjusting the attention and the input node features, respectively, based on two different types of graphs.
A feature-based encoding is proposed to discover the centrality positions or importance of nodes.
Experiments on four challenging crowd counting datasets have validated the competitiveness of the proposed method.
- Score: 68.26599222077466
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transformer has been popular in recent crowd counting work since it breaks
the limited receptive field of traditional CNNs. However, since crowd images
always contain a large number of similar patches, the self-attention mechanism
in Transformer tends to find a homogenized solution where the attention maps of
almost all patches are identical. In this paper, we address this problem by
proposing Gramformer: a graph-modulated transformer to enhance the network by
adjusting the attention and input node features respectively on the basis of
two different types of graphs. Firstly, an attention graph is proposed to
diversify the attention maps so that they attend to complementary information.
The graph is built upon the dissimilarities between patches, modulating the
attention in an anti-similarity fashion. Secondly, a feature-based centrality
encoding is proposed to discover the centrality positions or importance of
nodes. We encode them with a proposed centrality indexing scheme to modulate the node features
and similarity relationships. Extensive experiments on four challenging crowd
counting datasets have validated the competitiveness of the proposed method.
Code is available at {https://github.com/LoraLinH/Gramformer}.
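To make the two modulations concrete, below is a minimal, self-contained PyTorch sketch of the general idea: an additive attention bias built from patch dissimilarities, plus a crude feature-based centrality index added to the node features. All names are hypothetical and the details differ from the released code linked above; treat it as an illustration of the anti-similarity principle, not the authors' implementation.

```python
# Illustrative only: hypothetical names, not the released Gramformer code.
import torch
import torch.nn.functional as F

def anti_similarity_bias(x, lam=1.0):
    """Additive attention bias that favours DISSIMILAR patch tokens.
    x: (B, N, C) patch features -> (B, N, N) bias, large where cosine
    similarity is low, pushing the per-patch attention maps apart."""
    xn = F.normalize(x, dim=-1)
    sim = xn @ xn.transpose(1, 2)            # cosine similarity in [-1, 1]
    return lam * (1.0 - sim)                 # dissimilarity as a positive bias

def centrality_bins(x, num_bins=8):
    """Crude feature-based centrality: rank each token by its total similarity
    to all other tokens and bucket the ranks into a few discrete indices."""
    xn = F.normalize(x, dim=-1)
    degree = (xn @ xn.transpose(1, 2)).sum(-1)        # (B, N)
    ranks = degree.argsort(-1).argsort(-1).float()    # 0 .. N-1 per token
    return (ranks / ranks.size(-1) * num_bins).long().clamp(max=num_bins - 1)

class GraphModulatedAttention(torch.nn.Module):
    """Single-head self-attention with both modulations applied."""
    def __init__(self, dim, num_bins=8):
        super().__init__()
        self.qkv = torch.nn.Linear(dim, 3 * dim)
        self.centrality = torch.nn.Embedding(num_bins, dim)
        self.scale = dim ** -0.5

    def forward(self, x):                               # x: (B, N, dim)
        x = x + self.centrality(centrality_bins(x))     # modulate node features
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        logits = q @ k.transpose(1, 2) * self.scale
        logits = logits + anti_similarity_bias(x)       # modulate the attention
        return torch.softmax(logits, dim=-1) @ v
```

The bias strength, the number of centrality bins, and the single-head layout are arbitrary choices made only to keep the sketch short.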
Related papers
- InstructG2I: Synthesizing Images from Multimodal Attributed Graphs [50.852150521561676]
We propose a graph context-conditioned diffusion model called InstructG2I.
InstructG2I first exploits the graph structure and multimodal information to conduct informative neighbor sampling.
A Graph-QFormer encoder adaptively encodes the graph nodes into an auxiliary set of graph prompts to guide the denoising process.
arXiv Detail & Related papers (2024-10-09T17:56:15Z) - Graph as Point Set [31.448841287258116]
This paper introduces a novel graph-to-set conversion method that transforms interconnected nodes into a set of independent points.
It enables using set encoders to learn from graphs, thereby significantly expanding the design space of Graph Neural Networks.
To demonstrate the effectiveness of our approach, we introduce Point Set Transformer (PST), a transformer architecture that accepts a point set converted from a graph as input.
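As a loose illustration of the graph-to-set idea only (the abstract does not spell out the paper's actual conversion, and this generic recipe is not it), one common way to let a plain set encoder see graph structure is to append spectral coordinates to each node feature; the helper below is hypothetical.

```python
# Generic graph-to-point-set stand-in; not the conversion used in the paper.
import torch

def graph_to_points(adj, x, k=4):
    """Append spectral coordinates (low-frequency Laplacian eigenvectors) to
    the node features so that an order-invariant encoder still sees structure.
    adj: (N, N) dense adjacency, x: (N, C) node features -> (N, C + k) points."""
    lap = torch.diag(adj.sum(-1)) - adj
    _, vecs = torch.linalg.eigh(lap)       # eigenvectors, ascending eigenvalues
    coords = vecs[:, 1:k + 1]              # drop the constant eigenvector
    return torch.cat([x, coords], dim=-1)

# The resulting points can be fed to any permutation-equivariant encoder, e.g.
# a Transformer encoder without positional encodings, in place of a GNN.
```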
arXiv Detail & Related papers (2024-05-05T02:29:41Z) - GvT: A Graph-based Vision Transformer with Talking-Heads Utilizing Sparsity, Trained from Scratch on Small Datasets [1.1586742546971471]
We propose a Graph-based Vision Transformer (GvT) that utilizes graph convolutional projection and graph-pooling.
GvT produces comparable or superior outcomes to deep convolutional networks without pre-training on large datasets.
arXiv Detail & Related papers (2024-04-07T11:48:07Z) - NodeFormer: A Scalable Graph Structure Learning Transformer for Node Classification [70.51126383984555]
We introduce a novel all-pair message passing scheme for efficiently propagating node signals between arbitrary nodes.
The efficient computation is enabled by a kernelized Gumbel-Softmax operator.
Experiments demonstrate the promising efficacy of the method in various tasks including node classification on graphs.
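For intuition only: the snippet below relaxes all-pair attention logits into stochastic, differentiable edge weights with the standard (non-kernelized) Gumbel-Softmax, at quadratic cost. NodeFormer's contribution is precisely the kernelized operator that avoids this cost, which the sketch does not reproduce.

```python
# Toy, quadratic-cost relaxation; NodeFormer's kernelized operator avoids the
# O(N^2) cost and is NOT reproduced here.
import torch
import torch.nn.functional as F

def gumbel_adjacency(x, w_q, w_k, tau=0.5):
    """Relax all-pair attention logits into stochastic, differentiable edge
    weights with the standard Gumbel-Softmax.
    x: (N, D) node features; w_q, w_k: (D, D) projections -> (N, N) weights."""
    q, k = x @ w_q, x @ w_k
    logits = q @ k.t() / q.size(-1) ** 0.5
    return F.gumbel_softmax(logits, tau=tau, dim=-1)    # row-stochastic edges

def propagate(x, w_q, w_k):
    """One round of all-pair message passing over the sampled edge weights."""
    return gumbel_adjacency(x, w_q, w_k) @ x
```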
arXiv Detail & Related papers (2023-06-14T09:21:15Z) - Discrete Graph Auto-Encoder [52.50288418639075]
We introduce a new framework named Discrete Graph Auto-Encoder (DGAE)
We first use a permutation-equivariant auto-encoder to convert graphs into sets of discrete latent node representations.
In the second step, we sort the sets of discrete latent representations and learn their distribution with a specifically designed auto-regressive model.
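A rough sketch of that two-step recipe, with hypothetical names and none of DGAE's architectural specifics: quantize node embeddings against a learned codebook, then sort the resulting codes into a canonical sequence for an autoregressive model.

```python
# Rough sketch with hypothetical names; DGAE's actual encoder, decoder, and
# autoregressive model are described in the paper.
import torch

class NodeCodeQuantizer(torch.nn.Module):
    """Step 1: map continuous node embeddings to their nearest codebook entry,
    giving each graph a set of discrete latent node codes."""
    def __init__(self, dim, codebook_size=256):
        super().__init__()
        self.codebook = torch.nn.Embedding(codebook_size, dim)

    def forward(self, z):                                  # z: (N, D)
        dists = torch.cdist(z, self.codebook.weight)       # (N, K) distances
        return dists.argmin(dim=-1)                        # (N,) discrete codes

def to_canonical_sequence(codes):
    """Step 2 (prefix): sort the codes so the set becomes an order-free
    sequence, which an autoregressive model can then learn to generate."""
    return codes.sort().values
```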
arXiv Detail & Related papers (2023-06-13T12:40:39Z) - Dynamic Graph Message Passing Networks for Visual Recognition [112.49513303433606]
Modelling long-range dependencies is critical for scene understanding tasks in computer vision.
A fully-connected graph is beneficial for such modelling, but its computational overhead is prohibitive.
We propose a dynamic graph message passing network, that significantly reduces the computational complexity.
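The summary does not detail the paper's dynamic sampling scheme; the generic top-k variant below only shows why restricting each node to a few partners reduces the aggregation cost relative to a fully connected graph.

```python
# Generic top-k sparsification, not the paper's learned dynamic sampling.
import torch

def top_k_propagate(x, w, k=8):
    """Each node aggregates messages from only its k highest-affinity partners,
    so aggregation costs O(N*k*D) instead of O(N^2*D).  Note the affinity
    matrix itself is still computed densely here, which the paper also avoids.
    x: (N, D) node features, w: (D, D) projection."""
    scores = (x @ w) @ x.t()                      # (N, N) affinities
    vals, idx = scores.topk(k, dim=-1)            # keep k partners per node
    weights = torch.softmax(vals, dim=-1)         # (N, k) normalized weights
    neighbours = x[idx]                           # (N, k, D) gathered features
    return (weights.unsqueeze(-1) * neighbours).sum(dim=1)
```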
arXiv Detail & Related papers (2022-09-20T14:41:37Z) - Graph Reasoning Transformer for Image Parsing [67.76633142645284]
We propose a novel Graph Reasoning Transformer (GReaT) for image parsing to enable image patches to interact following a relation reasoning pattern.
Compared to the conventional transformer, GReaT has higher interaction efficiency and a more purposeful interaction pattern.
Results show that GReaT achieves consistent performance gains with slight computational overheads on the state-of-the-art transformer baselines.
arXiv Detail & Related papers (2022-09-20T08:21:37Z) - Gransformer: Transformer-based Graph Generation [14.161975556325796]
Gransformer is a Transformer-based algorithm for generating graphs.
We modify the Transformer encoder to exploit the structural information of the given graph.
We also introduce a graph-based familiarity measure between node pairs.
arXiv Detail & Related papers (2022-03-25T14:05:12Z) - A Generalization of Transformer Networks to Graphs [5.736353542430439]
We introduce a graph transformer with four new properties compared to the standard model.
The architecture is extended to edge feature representation, which can be critical to tasks such as chemistry (bond type) or link prediction (entity relationships in knowledge graphs).
arXiv Detail & Related papers (2020-12-17T16:11:47Z) - AEGCN: An Autoencoder-Constrained Graph Convolutional Network [5.023274927781062]
We propose a novel neural network architecture, called autoencoder-constrained graph convolutional network.
The core of this model is a convolutional network operating directly on graphs, whose hidden layers are constrained by an autoencoder.
We show that adding autoencoder constraints significantly improves the performance of graph convolutional networks.
arXiv Detail & Related papers (2020-07-03T16:42:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.