Joint Learning of Social Groups, Individuals Action and Sub-group
Activities in Videos
- URL: http://arxiv.org/abs/2007.02632v2
- Date: Tue, 28 Jul 2020 00:57:21 GMT
- Title: Joint Learning of Social Groups, Individuals Action and Sub-group
Activities in Videos
- Authors: Mahsa Ehsanpour, Alireza Abedin, Fatemeh Saleh, Javen Shi, Ian Reid,
Hamid Rezatofighi
- Abstract summary: We propose an end-to-end trainable framework for the social task, and our method sets state-of-the-art results on two widely adopted benchmarks for the traditional group activity recognition task.
- Score: 23.15064911470468
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The state-of-the-art solutions for human activity understanding from a video
stream formulate the task as a spatio-temporal problem which requires joint
localization of all individuals in the scene and classification of their
actions or group activity over time. These solutions, however, often do not
predict who is interacting with whom; not everyone in a queue, for example, is
interacting with each other. In many scenarios, people are best split into
sub-groups, which we call social groups, and each social group may be engaged
in a different social activity. In this paper, we solve the problem of
simultaneously grouping people by their social interactions, predicting their
individual actions and the social activity of each social group, which we call
the social task. Our main contributions are: i) we propose an end-to-end
trainable framework for the social task; ii) our proposed method also sets
state-of-the-art results on two widely adopted benchmarks for the traditional
group activity recognition task (assuming the individuals in the scene form a
single group and predicting a single group activity label for the scene);
iii) we introduce new annotations on an existing group activity dataset,
re-purposing it for the social task.
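The abstract describes the social task only at a high level. As a rough mental model, it decomposes into three coupled predictions over per-person features: pairwise social-interaction scores, a grouping of people derived from those scores, and one activity label per group alongside per-person action labels. The sketch below illustrates only that decomposition; it assumes per-person features already extracted by a detector and backbone, and the module names, dimensions, and the simple threshold-plus-connected-components grouping are hypothetical stand-ins, not the authors' architecture.

```python
# Minimal, illustrative sketch of the "social task": score pairwise social
# interactions, cluster people into social groups, then classify individual
# actions and one social activity per group. Hypothetical, not the paper's code.
import torch
import torch.nn as nn

class SocialTaskHead(nn.Module):
    def __init__(self, feat_dim=256, n_actions=10, n_activities=8):
        super().__init__()
        self.pair_scorer = nn.Sequential(  # pairwise interaction logit
            nn.Linear(2 * feat_dim, 128), nn.ReLU(), nn.Linear(128, 1))
        self.action_head = nn.Linear(feat_dim, n_actions)       # per person
        self.activity_head = nn.Linear(feat_dim, n_activities)  # per group

    def forward(self, feats, thresh=0.5):  # feats: (n_people, feat_dim)
        n = feats.size(0)
        # Score every ordered pair, then symmetrize into an affinity matrix.
        pairs = torch.cat([feats.unsqueeze(1).expand(n, n, -1),
                           feats.unsqueeze(0).expand(n, n, -1)], dim=-1)
        aff = torch.sigmoid(self.pair_scorer(pairs).squeeze(-1))
        aff = 0.5 * (aff + aff.T)
        groups = connected_components(aff > thresh)  # list of member-index lists
        actions = self.action_head(feats)            # individual action logits
        # Pool member features to get one social-activity logit set per group.
        activities = [self.activity_head(feats[g].mean(dim=0)) for g in groups]
        return groups, actions, activities

def connected_components(adj):
    """Union-find over a boolean adjacency matrix; one component = one group."""
    n = adj.size(0)
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for i in range(n):
        for j in range(i + 1, n):
            if adj[i, j]:
                parent[find(i)] = find(j)
    comps = {}
    for i in range(n):
        comps.setdefault(find(i), []).append(i)
    return list(comps.values())
```

In the paper itself the framework is trained end-to-end; the hard threshold here is only to keep the sketch short.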
Related papers
- Design and Analysis of Efficient Attention in Transformers for Social Group Activity Recognition [3.75292409381511]
We propose leveraging attention modules in transformers to generate social group features.
Multiple embeddings aggregate features for a social group, each assigned to a distinct group member so that no member is claimed twice.
The proposed method achieves state-of-the-art performance, and experiments verify that the proposed attention designs are highly effective for social group activity recognition.
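The summary suggests a set of learnable embeddings that cross-attend to per-person features and are then assigned one-to-one to group members. A minimal sketch of that idea follows, assuming a fixed number of member slots per group and at least that many detected people; the class name, sizes, and the Hungarian-matching step are illustrative choices, not the paper's design.

```python
# Hedged sketch: learnable member queries cross-attend to person features,
# and the resulting embeddings are matched one-to-one to people so that no
# person is claimed by two embeddings. Names and sizes are hypothetical.
import torch
import torch.nn as nn
from scipy.optimize import linear_sum_assignment

class GroupQueryAttention(nn.Module):
    def __init__(self, dim=256, member_slots=4):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(member_slots, dim))
        self.attn = nn.MultiheadAttention(dim, num_heads=8, batch_first=True)

    def forward(self, person_feats):   # person_feats: (1, n_people, dim)
        q = self.queries.unsqueeze(0)  # (1, member_slots, dim)
        member_emb, _ = self.attn(q, person_feats, person_feats)
        # One-to-one assignment of embeddings to people (no duplication),
        # here via Hungarian matching on feature distances.
        cost = torch.cdist(member_emb[0], person_feats[0]).detach().numpy()
        slots, people = linear_sum_assignment(cost)
        return member_emb, list(zip(slots.tolist(), people.tolist()))
```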
arXiv Detail & Related papers (2024-04-15T17:40:23Z) - SocialBench: Sociality Evaluation of Role-Playing Conversational Agents [85.6641890712617]
Large language models (LLMs) have advanced the development of various AI conversational agents.
SocialBench is the first benchmark designed to evaluate the sociality of role-playing conversational agents at both individual and group levels.
We find that agents excelling at the individual level do not necessarily show proficiency at the group level.
arXiv Detail & Related papers (2024-03-20T15:38:36Z) - SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents [107.4138224020773]
We present SOTOPIA, an open-ended environment to simulate complex social interactions between artificial agents and humans.
In our environment, agents role-play and interact under a wide variety of scenarios; they coordinate, collaborate, exchange, and compete with each other to achieve complex social goals.
We find that GPT-4 achieves a significantly lower goal completion rate than humans and struggles to exhibit social commonsense reasoning and strategic communication skills.
arXiv Detail & Related papers (2023-10-18T02:27:01Z) - Group Activity Recognition in Computer Vision: A Comprehensive Review,
Challenges, and Future Perspectives [0.0]
Group activity recognition is a hot topic in computer vision.
Modeling the relationships among group members plays a vital role in recognizing group activities.
This work examines the progress in technology for recognizing group activities.
arXiv Detail & Related papers (2023-07-25T14:44:41Z) - Adaptive Coordination in Social Embodied Rearrangement [49.35582108902819]
We study zero-shot coordination (ZSC) in this task, where an agent must collaborate with a previously unseen partner, emulating a robot collaborating with a new human partner.
We propose Behavior Diversity Play (BDP), a novel ZSC approach that encourages diversity through a discriminability objective.
Our results demonstrate that BDP learns adaptive agents that can tackle visual coordination, and zero-shot generalize to new partners in unseen environments, achieving 35% higher success and 32% higher efficiency compared to baselines.
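The summary names a discriminability objective for behavior diversity but gives no details. As a generic stand-in (a DIAYN-style term, not BDP itself), a classifier can be trained to guess which policy in a population produced a given state, with its log-probability added to the reward so that policies remain mutually distinguishable:

```python
# Generic discriminability bonus (illustrative stand-in, not BDP itself):
# reward states from which a classifier can tell which policy produced them.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PolicyDiscriminator(nn.Module):
    def __init__(self, state_dim, n_policies):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 128), nn.ReLU(),
                                 nn.Linear(128, n_policies))

    def diversity_bonus(self, states, policy_ids):
        # log p(policy_id | state): high when behaviors are distinguishable.
        logits = self.net(states)                # (batch, n_policies)
        log_probs = F.log_softmax(logits, dim=-1)
        return log_probs.gather(1, policy_ids.unsqueeze(1)).squeeze(1)
```

During training, each policy would then maximize the environment reward plus a small weight on this bonus, while the discriminator is trained to classify states correctly.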
arXiv Detail & Related papers (2023-05-31T18:05:51Z) - Hunting Group Clues with Transformers for Social Group Activity
Recognition [3.1061678033205635]
Social group activity recognition requires recognizing multiple sub-group activities and identifying group members.
Most existing methods tackle both tasks by refining region features and then summarizing them into activity features.
We propose to leverage attention modules in transformers to generate effective social group features.
Our method is designed in such a way that the attention modules identify and then aggregate features relevant to social group activities.
arXiv Detail & Related papers (2022-07-12T01:46:46Z) - Self-supervised Social Relation Representation for Human Group Detection [18.38523753680367]
We propose a new two-stage multi-head framework for human group detection.
In the first stage, we propose a human behavior simulator head to learn the social relation feature embedding.
In the second stage, based on the social relation embedding, we develop a self-attention-inspired network for human group detection.
arXiv Detail & Related papers (2022-03-08T04:26:07Z) - Detecting socially interacting groups using f-formation: A survey of
taxonomy, methods, datasets, applications, challenges, and future research
directions [3.995408039775796]
Social behavior is one of the most sought-after qualities that a robot can possess.
To exhibit this quality, a robot needs to detect the formation of a group and then determine a suitable position for itself within it.
We put forward a novel holistic survey framework combining all the possible concerns and modules relevant to this problem.
arXiv Detail & Related papers (2021-08-13T11:51:17Z) - JRDB-Act: A Large-scale Multi-modal Dataset for Spatio-temporal Action,
Social Group and Activity Detection [54.696819174421584]
We introduce JRDB-Act, a multi-modal dataset that reflects a real distribution of human daily life actions in a university campus environment.
JRDB-Act has been densely annotated with atomic actions and comprises over 2.8M action labels.
JRDB-Act comes with social group identification annotations conducive to the task of grouping individuals based on their interactions in the scene.
arXiv Detail & Related papers (2021-06-16T14:43:46Z) - Learning Multi-Attention Context Graph for Group-Based Re-Identification [214.84551361855443]
Learning to re-identify or retrieve a group of people across non-overlapping camera systems has important applications in video surveillance.
In this work, we consider employing context information for identifying groups of people, i.e., group re-id.
We propose a novel unified framework based on graph neural networks to simultaneously address the group-based re-id tasks.
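The summary does not detail the multi-attention context graph, but its core idea, enriching each member's descriptor with context from the rest of the group, can be conveyed by a single graph-attention step. Everything below (names, sizes, the single-head design) is illustrative:

```python
# One attention-weighted message-passing step over a group's member features;
# a minimal illustration, not the paper's multi-attention architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContextGraphLayer(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)

    def forward(self, member_feats):  # member_feats: (n_members, dim)
        scores = self.q(member_feats) @ self.k(member_feats).T
        attn = F.softmax(scores / member_feats.size(-1) ** 0.5, dim=-1)
        # Each member's descriptor absorbs context from the whole group.
        return member_feats + attn @ self.v(member_feats)
```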
arXiv Detail & Related papers (2021-04-29T09:57:47Z) - PHASE: PHysically-grounded Abstract Social Events for Machine Social
Perception [50.551003004553806]
We create a dataset of physically-grounded abstract social events, PHASE, that resemble a wide range of real-life social interactions.
PHASE is validated with human experiments demonstrating that humans perceive rich interactions in the social events.
As a baseline model, we introduce a Bayesian inverse planning approach, SIMPLE, which outperforms state-of-the-art feed-forward neural networks.
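SIMPLE's specifics are not given in this summary; the toy function below illustrates the general technique named here, Bayesian inverse planning: score how consistent each observed step of a trajectory is with each candidate goal under a noisily rational agent, then normalize into a posterior over goals. The likelihood form and the temperature parameter are illustrative.

```python
# Toy Bayesian inverse planning: infer an agent's goal from its trajectory.
# Illustrative likelihood, not the SIMPLE model from the paper.
import numpy as np

def goal_posterior(trajectory, goals, beta=2.0):
    """trajectory: sequence of 2-D positions; goals: candidate 2-D goal points."""
    log_post = np.zeros(len(goals))  # uniform prior over goals
    for prev, cur in zip(trajectory, trajectory[1:]):
        step = np.asarray(cur, float) - np.asarray(prev, float)
        for g, goal in enumerate(goals):
            to_goal = np.asarray(goal, float) - np.asarray(prev, float)
            norm = np.linalg.norm(step) * np.linalg.norm(to_goal)
            # Noisily rational likelihood: steps aligned with the direction
            # to a goal make that goal more probable (temperature beta).
            log_post[g] += beta * float(step @ to_goal) / (norm + 1e-8)
    post = np.exp(log_post - log_post.max())
    return post / post.sum()
```

For example, an agent walking straight toward one of three landmarks quickly concentrates this posterior on that landmark.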
arXiv Detail & Related papers (2021-03-02T18:44:57Z)