GroupFormer: Group Activity Recognition with Clustered Spatial-Temporal
Transformer
- URL: http://arxiv.org/abs/2108.12630v1
- Date: Sat, 28 Aug 2021 11:24:36 GMT
- Title: GroupFormer: Group Activity Recognition with Clustered Spatial-Temporal
Transformer
- Authors: Shuaicheng Li, Qianggang Cao, Lingbo Liu, Kunlin Yang, Shinan Liu, Jun
Hou and Shuai Yi
- Abstract summary: GroupFormer captures spatial-temporal contextual information jointly to augment the individual and group representations.
The proposed framework outperforms state-of-the-art methods on the Volleyball dataset and Collective Activity dataset.
- Score: 16.988878921451484
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Group activity recognition is a crucial yet challenging problem, whose core
lies in fully exploring spatial-temporal interactions among individuals and
generating reasonable group representations. However, previous methods either
model spatial and temporal information separately, or directly aggregate
individual features to form group features. To address these issues, we propose
a novel group activity recognition network termed GroupFormer. It captures
spatial-temporal contextual information jointly to augment the individual and
group representations effectively with a clustered spatial-temporal
transformer. Specifically, our GroupFormer has three appealing advantages: (1)
A tailor-modified Transformer, Clustered Spatial-Temporal Transformer, is
proposed to enhance the individual representation and group representation. (2)
It models the spatial and temporal dependencies integrally and utilizes
decoders to build the bridge between the spatial and temporal information. (3)
A clustered attention mechanism is utilized to dynamically divide individuals
into multiple clusters for better learning activity-aware semantic
representations. Moreover, experimental results show that the proposed
framework outperforms state-of-the-art methods on the Volleyball dataset and
Collective Activity dataset. Code is available at
https://github.com/xueyee/GroupFormer.
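The clustered attention mechanism described in point (3) can be illustrated with a minimal NumPy sketch: individuals are first partitioned into clusters (here with a simple k-means step, an assumption; the paper's assignment mechanism differs), and self-attention is then computed only among members of the same cluster. This is not the authors' implementation, just an illustration of the idea.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def clustered_attention(feats, k=2, iters=5):
    """Assign n individuals (rows of feats) to k clusters k-means style,
    then apply self-attention within each cluster only."""
    n, d = feats.shape
    rng = np.random.default_rng(0)
    centers = feats[rng.choice(n, size=k, replace=False)]
    for _ in range(iters):
        # Squared distances to each center, shape (n, k).
        dists = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(1)
        for c in range(k):
            if (assign == c).any():
                centers[c] = feats[assign == c].mean(0)
    out = np.zeros_like(feats)
    for c in range(k):
        idx = np.where(assign == c)[0]
        x = feats[idx]                         # members of cluster c
        attn = softmax(x @ x.T / np.sqrt(d))   # intra-cluster attention
        out[idx] = attn @ x
    return out, assign
```

Restricting attention to within-cluster pairs is what lets the model learn activity-aware sub-group semantics rather than mixing all individuals uniformly.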
Related papers
- Redefining Event Types and Group Evolution in Temporal Data [0.16385815610837165]
In temporal data, the predominant approach for characterizing group evolution has been through the identification of "events".
We think of events as "archetypes" characterized by a unique combination of quantitative dimensions that we call "facet extremities".
We apply our framework to evolving groups from several face-to-face interaction datasets, showing it enables richer, more reliable characterization of group dynamics.
arXiv Detail & Related papers (2024-03-11T14:39:24Z) - Towards Efficient and Effective Deep Clustering with Dynamic Grouping and Prototype Aggregation [4.550555443103878]
We present a novel end-to-end deep clustering framework with dynamic grouping and prototype aggregation, termed as DigPro.
Specifically, the proposed dynamic grouping extends contrastive learning from instance-level to group-level, which is effective and efficient for timely updating groups.
With an expectation-maximization framework, DigPro simultaneously takes advantage of compact intra-cluster connections, well-separated clusters, and efficient group updating during the self-supervised training.
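The prototype-based, group-level contrastive objective summarized above can be sketched as follows. This is an illustrative NumPy sketch under assumed conventions (L2-normalized embeddings, one prototype per cluster, temperature `tau`), not DigPro's actual code: each sample is pulled toward its assigned prototype and pushed away from the others via a softmax cross-entropy over prototype similarities.

```python
import numpy as np

def prototype_contrastive_loss(z, prototypes, assign, tau=0.1):
    """z: (n, d) L2-normalized embeddings; prototypes: (k, d) normalized
    cluster prototypes; assign: (n,) cluster index per sample."""
    logits = z @ prototypes.T / tau            # (n, k) scaled similarities
    logits = logits - logits.max(1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(1, keepdims=True))
    # Negative log-likelihood of each sample's assigned prototype.
    return -log_prob[np.arange(len(z)), assign].mean()
```

Using prototypes instead of all pairwise instance comparisons is what makes group updating cheap: the loss scales with the number of clusters, not the number of samples.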
arXiv Detail & Related papers (2024-01-24T16:45:42Z) - Unified Multi-View Orthonormal Non-Negative Graph Based Clustering Framework [74.25493157757943]
We formulate a novel clustering model, which exploits the non-negative feature property and incorporates the multi-view information into a unified joint learning framework.
We also explore, for the first time, the multi-model non-negative graph-based approach to clustering data based on deep features.
arXiv Detail & Related papers (2022-11-03T08:18:27Z) - Towards Group Robustness in the presence of Partial Group Labels [61.33713547766866]
Spurious correlations between input samples and the target labels wrongly direct the neural network predictions.
We propose an algorithm that optimizes for the worst-off group assignments from a constraint set.
We show improvements in the minority group's performance while preserving overall aggregate accuracy across groups.
arXiv Detail & Related papers (2022-01-10T22:04:48Z) - Observing a group to infer individual characteristics [1.0152838128195465]
We propose a new observer algorithm that infers, based only on observed movement information, how the local neighborhood aids or hinders agent movement.
Unlike a traditional supervised learning approach, this algorithm is based on physical insights and scaling arguments, and does not rely on training-data.
Data-agnostic approaches like this have relevance to a large class of real-world problems where clean, labeled data is difficult to obtain.
arXiv Detail & Related papers (2021-10-12T09:59:54Z) - Learning Multi-Attention Context Graph for Group-Based Re-Identification [214.84551361855443]
Learning to re-identify or retrieve a group of people across non-overlapped camera systems has important applications in video surveillance.
In this work, we consider employing context information for identifying groups of people, i.e., group re-id.
We propose a novel unified framework based on graph neural networks to simultaneously address the group-based re-id tasks.
arXiv Detail & Related papers (2021-04-29T09:57:47Z) - LieTransformer: Equivariant self-attention for Lie Groups [49.9625160479096]
Group equivariant neural networks are used as building blocks of group invariant neural networks.
We extend the scope of the literature to self-attention, that is emerging as a prominent building block of deep learning models.
We propose the LieTransformer, an architecture composed of LieSelfAttention layers that are equivariant to arbitrary Lie groups and their discrete subgroups.
arXiv Detail & Related papers (2020-12-20T11:02:49Z) - CoADNet: Collaborative Aggregation-and-Distribution Networks for Co-Salient Object Detection [91.91911418421086]
Co-Salient Object Detection (CoSOD) aims at discovering salient objects that repeatedly appear in a given query group containing two or more relevant images.
One challenging issue is how to effectively capture co-saliency cues by modeling and exploiting inter-image relationships.
We present an end-to-end collaborative aggregation-and-distribution network (CoADNet) to capture both salient and repetitive visual patterns from multiple images.
arXiv Detail & Related papers (2020-11-10T04:28:11Z) - From Time Series to Euclidean Spaces: On Spatial Transformations for Temporal Clustering [5.220940151628734]
We show that neither traditional clustering methods, time series specific or even deep learning-based alternatives generalise well when both varying sampling rates and high dimensionality are present in the input data.
We propose a novel approach to temporal clustering, in which we transform the input time series into a distance-based projected representation.
arXiv Detail & Related papers (2020-10-02T09:08:16Z) - Overcoming Data Sparsity in Group Recommendation [52.00998276970403]
Group recommender systems should be able to accurately learn not only users' personal preferences but also preference aggregation strategy.
In this paper, we take the Bipartite Graph Embedding Model (BGEM), the self-attention mechanism, and Graph Convolutional Networks (GCNs) as basic building blocks to learn group and user representations in a unified way.
arXiv Detail & Related papers (2020-10-02T07:11:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.