Design and Analysis of Efficient Attention in Transformers for Social Group Activity Recognition
- URL: http://arxiv.org/abs/2404.09964v1
- Date: Mon, 15 Apr 2024 17:40:23 GMT
- Title: Design and Analysis of Efficient Attention in Transformers for Social Group Activity Recognition
- Authors: Masato Tamura
- Abstract summary: We propose leveraging attention modules in transformers to generate social group features.
Multiple embeddings are used to aggregate features for a social group, each of which is assigned to a group member without duplication.
The proposed method achieves state-of-the-art performance, and the experimental results verify that the proposed attention designs are highly effective for social group activity recognition.
- Score: 3.75292409381511
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Social group activity recognition is a challenging task extended from group activity recognition, where social groups must be recognized with their activities and group members. Existing methods tackle this task by leveraging region features of individuals following existing group activity recognition methods. However, the effectiveness of region features is susceptible to person localization and variable semantics of individual actions. To overcome these issues, we propose leveraging attention modules in transformers to generate social group features. In this method, multiple embeddings are used to aggregate features for a social group, each of which is assigned to a group member without duplication. Due to this non-duplicated assignment, the number of embeddings must be significant to avoid missing group members and thus renders attention in transformers ineffective. To find optimal attention designs with a large number of embeddings, we explore several design choices of queries for feature aggregation and self-attention modules in transformer decoders. Extensive experimental results show that the proposed method achieves state-of-the-art performance and verify that the proposed attention designs are highly effective on social group activity recognition.
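The two ideas the abstract names, attention-based feature aggregation and non-duplicated assignment of embeddings to group members, can be illustrated with a minimal sketch. This is not the paper's implementation. The function names `attend` and `assign_without_duplication`, the use of 2D points to stand in for member locations, and the brute-force matching are all assumptions made for illustration only.

```python
import math
from itertools import permutations


def attend(query, keys, values):
    """Scaled dot-product attention for a single query (hypothetical helper)."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    dim = len(values[0])
    # Weighted sum of the value vectors: the aggregated feature.
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def assign_without_duplication(predicted, members):
    """One-to-one matching of embeddings to members (assumed formulation).

    Each member receives a distinct embedding index, so no embedding is
    used twice. There must be at least as many embeddings as members,
    otherwise some member would be missed. Brute force is used here for
    clarity; it is only practical for very small inputs.
    """
    if len(predicted) < len(members):
        raise ValueError("need at least as many embeddings as members")
    best_cost, best_slots = math.inf, ()
    for slots in permutations(range(len(predicted)), len(members)):
        cost = sum(math.dist(predicted[s], m) for s, m in zip(slots, members))
        if cost < best_cost:
            best_cost, best_slots = cost, slots
    # Map embedding index -> member index.
    return {slot: idx for idx, slot in enumerate(best_slots)}


# Three embeddings predict points; two members are actually present.
predicted_points = [(0.1, 0.1), (0.9, 0.9), (0.5, 0.5)]
member_points = [(0.88, 0.92), (0.12, 0.08)]
matching = assign_without_duplication(predicted_points, member_points)

# A zero query gives uniform weights, so the output is the mean of the values.
aggregated = attend([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [[2.0, 0.0], [0.0, 4.0]])
```

In this toy setup the third embedding stays unassigned. That shows why the number of embeddings has to be at least the number of members when each embedding may be assigned only once.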
Related papers
- The Research of Group Re-identification from Multiple Cameras [0.4955551943523977]
Group re-identification is very challenging because it is affected by more than the viewpoint and human pose variations found in traditional re-identification tasks.
This paper introduces a novel approach which leverages the multi-granularity information inside groups to facilitate group re-identification.
arXiv Detail & Related papers (2024-07-19T18:28:13Z) - AdaFPP: Adapt-Focused Bi-Propagating Prototype Learning for Panoramic Activity Recognition [51.24321348668037]
Panoramic Activity Recognition (PAR) aims to identify multi-granularity behaviors performed by multiple persons in panoramic scenes.
Previous methods rely on manually annotated detection boxes in training and inference, hindering further practical deployment.
We propose a novel Adapt-Focused bi-Propagating Prototype learning (AdaFPP) framework to jointly recognize individual, group, and global activities in panoramic activity scenes.
arXiv Detail & Related papers (2024-05-04T01:53:22Z) - Robust Zero-Shot Crowd Counting and Localization With Adaptive Resolution SAM [55.93697196726016]
We propose a simple yet effective crowd counting method by utilizing the Segment-Everything-Everywhere Model (SEEM).
We show that SEEM's performance in dense crowd scenes is limited, primarily due to the omission of many persons in high-density areas.
Our proposed method achieves the best unsupervised performance in crowd counting, while also being comparable to some supervised methods.
arXiv Detail & Related papers (2024-02-27T13:55:17Z) - Ranking-based Group Identification via Factorized Attention on Social Tripartite Graph [68.08590487960475]
We propose a novel GNN-based framework named Contextualized Factorized Attention for Group identification (CFAG).
We devise tripartite graph convolution layers to aggregate information from different types of neighborhoods among users, groups, and items.
To cope with the data sparsity issue, we devise a novel propagation augmentation layer, which is based on our proposed factorized attention mechanism.
arXiv Detail & Related papers (2022-11-02T01:42:20Z) - Attentive pooling for Group Activity Recognition [23.241686027269928]
In group activity recognition, a hierarchical framework is widely adopted to represent the relationships between individuals and their corresponding group.
We propose a new contextual pooling scheme, named attentive pooling, which enables the weighted information transition from individual actions to group activity.
arXiv Detail & Related papers (2022-08-31T13:26:39Z) - Hunting Group Clues with Transformers for Social Group Activity Recognition [3.1061678033205635]
Social group activity recognition requires recognizing multiple sub-group activities and identifying group members.
Most existing methods tackle both tasks by refining region features and then summarizing them into activity features.
We propose to leverage attention modules in transformers to generate effective social group features.
Our method is designed in such a way that the attention modules identify and then aggregate features relevant to social group activities.
arXiv Detail & Related papers (2022-07-12T01:46:46Z) - Towards Group Robustness in the presence of Partial Group Labels [61.33713547766866]
Spurious correlations between input samples and the target labels wrongly direct the neural network predictions.
We propose an algorithm that optimizes for the worst-off group assignments from a constraint set.
We show improvements in the minority group's performance while preserving overall aggregate accuracy across groups.
arXiv Detail & Related papers (2022-01-10T22:04:48Z) - GroupFormer: Group Activity Recognition with Clustered Spatial-Temporal Transformer [16.988878921451484]
GroupFormer captures spatial-temporal contextual information jointly to augment the individual and group representations.
The proposed framework outperforms state-of-the-art methods on the Volleyball dataset and Collective Activity dataset.
arXiv Detail & Related papers (2021-08-28T11:24:36Z) - Overcoming Data Sparsity in Group Recommendation [52.00998276970403]
Group recommender systems should be able to accurately learn not only users' personal preferences but also the preference aggregation strategy.
In this paper, we take the Bipartite Graph Embedding Model (BGEM), the self-attention mechanism, and Graph Convolutional Networks (GCNs) as basic building blocks to learn group and user representations in a unified way.
arXiv Detail & Related papers (2020-10-02T07:11:19Z) - Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning [59.62721526353915]
Multi-agent settings in the real world often involve tasks with varying types and quantities of agents and non-agent entities.
Our method aims to leverage these commonalities by asking the question: "What is the expected utility of each agent when only considering a randomly selected sub-group of its observed entities?"
arXiv Detail & Related papers (2020-06-07T18:28:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.