Let me join you! Real-time F-formation recognition by a socially aware
robot
- URL: http://arxiv.org/abs/2008.10078v1
- Date: Sun, 23 Aug 2020 17:46:08 GMT
- Title: Let me join you! Real-time F-formation recognition by a socially aware
robot
- Authors: Hrishav Bakul Barua, Pradip Pramanick, Chayan Sarkar, Theint Haythi Mg
- Abstract summary: This paper presents a novel architecture to detect social groups in real-time from a continuous image stream of an ego-vision camera.
We detect F-formations in social gatherings such as meetings, discussions, etc. and predict the robot's approach angle if it wants to join the social group.
We also detect outliers, i.e., the persons who are not part of the group under consideration.
- Score: 2.8101673772585745
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a novel architecture to detect social groups in real-time
from a continuous image stream of an ego-vision camera. An F-formation defines
the social orientation in space in which two or more persons tend to communicate
in a social setting. Thus, essentially, we detect F-formations in social gatherings
such as meetings, discussions, etc. and predict the robot's approach angle if
it wants to join the social group. Additionally, we also detect outliers, i.e.,
the persons who are not part of the group under consideration. Our proposed
pipeline consists of: a) a skeletal key-point estimator (17 points in total) for
each detected human in the scene, b) a learning model (using a feature vector
based on the skeletal points) using a CRF to detect groups of people and outlier
persons in a scene, and c) a separate learning model using a multi-class Support
Vector Machine (SVM) to predict the exact F-formation of the group of people in
the current scene and the angle of approach for the viewing robot. The system
is evaluated using two data-sets. The results show that the group and outlier
detection in a scene using our method achieves an accuracy of 91%. We have
made rigorous comparisons of our system with a state-of-the-art F-formation
detection system and found that it outperforms the state-of-the-art by 29% for
formation detection and 55% for combined detection of the formation and
approach angle.
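As a rough illustration of stage (c) of the pipeline above, the sketch below maps a flattened skeletal key-point feature vector to an F-formation class with a multi-class SVM. The formation label names, the bounding-box normalization, and the synthetic training data are all assumptions for illustration; they are not taken from the paper.

```python
# Minimal sketch: multi-class SVM over skeletal key-point features,
# standing in for stage (c) of the paper's pipeline.
import numpy as np
from sklearn.svm import SVC

N_KEYPOINTS = 17  # key points per detected person, as in the paper
# Hypothetical formation labels; the paper's exact class set may differ.
FORMATIONS = ["face-to-face", "L-shape", "side-by-side", "circular"]

def keypoint_feature(person_keypoints):
    """Flatten (17, 2) pixel coordinates into a 34-d feature vector,
    normalized to the person's bounding box (an assumed normalization)."""
    kp = np.asarray(person_keypoints, dtype=float)
    mins, maxs = kp.min(axis=0), kp.max(axis=0)
    span = np.where(maxs - mins > 0, maxs - mins, 1.0)
    return ((kp - mins) / span).ravel()

# Toy training data: random key-point sets with synthetic labels,
# standing in for real annotated scenes.
rng = np.random.default_rng(0)
X = np.stack([keypoint_feature(rng.uniform(0, 100, size=(N_KEYPOINTS, 2)))
              for _ in range(80)])
y = rng.integers(0, len(FORMATIONS), size=80)

# One-vs-rest multi-class SVM, as named in the abstract.
clf = SVC(kernel="rbf", decision_function_shape="ovr").fit(X, y)
pred = clf.predict(X[:5])
```

A second classifier of the same shape could predict a discretized approach angle for the viewing robot, which is how the abstract describes the combined formation-plus-angle output.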
Related papers
- Understanding Spatio-Temporal Relations in Human-Object Interaction using Pyramid Graph Convolutional Network [2.223052975765005]
We propose a novel Pyramid Graph Convolutional Network (PGCN) to automatically recognize human-object interaction.
The system represents the 2D or 3D spatial relation of human and objects from the detection results in video data as a graph.
We evaluate our model on two challenging datasets in the field of human-object interaction recognition.
arXiv Detail & Related papers (2024-10-10T13:39:17Z) - Neural Clustering based Visual Representation Learning [61.72646814537163]
Clustering is one of the most classic approaches in machine learning and data analysis.
We propose feature extraction with clustering (FEC), which views feature extraction as a process of selecting representatives from data.
FEC alternates between grouping pixels into individual clusters to abstract representatives and updating the deep features of pixels with current representatives.
arXiv Detail & Related papers (2024-03-26T06:04:50Z) - Real-time Trajectory-based Social Group Detection [22.86110112028644]
We propose a simple and efficient framework for social group detection.
Our approach explores the impact of motion trajectory on social grouping and utilizes a novel, reliable, and fast data-driven method.
Our experiments on the popular JRDB-Act dataset reveal noticeable performance gains, with relative improvements ranging from 2% to 11%.
arXiv Detail & Related papers (2023-04-12T08:01:43Z) - Self-supervised Social Relation Representation for Human Group Detection [18.38523753680367]
We propose a new two-stage multi-head framework for human group detection.
In the first stage, we propose a human behavior simulator head to learn the social relation feature embedding.
In the second stage, based on the social relation embedding, we develop a self-attention inspired network for human group detection.
arXiv Detail & Related papers (2022-03-08T04:26:07Z) - Domain Adaptive Robotic Gesture Recognition with Unsupervised
Kinematic-Visual Data Alignment [60.31418655784291]
We propose a novel unsupervised domain adaptation framework which can simultaneously transfer multi-modality knowledge, i.e., both kinematic and visual data, from simulator to real robot.
It remedies the domain gap with enhanced transferable features by using temporal cues in videos and inherent correlations in multi-modal data to recognize gestures.
Results show that our approach recovers performance with substantial gains, up to 12.91% in accuracy and 20.16% in F1 score, without using any annotations on the real robot.
arXiv Detail & Related papers (2021-03-06T09:10:03Z) - Inter-class Discrepancy Alignment for Face Recognition [55.578063356210144]
We propose a unified framework called Inter-class Discrepancy Alignment (IDA).
IDA-DAO is used to align the similarity scores considering the discrepancy between an image and its neighbors.
IDA-SSE can provide convincing inter-class neighbors by introducing virtual candidate images generated with GAN.
arXiv Detail & Related papers (2021-03-02T08:20:08Z) - TraND: Transferable Neighborhood Discovery for Unsupervised Cross-domain
Gait Recognition [77.77786072373942]
This paper proposes a Transferable Neighborhood Discovery (TraND) framework to bridge the domain gap for unsupervised cross-domain gait recognition.
We design an end-to-end trainable approach to automatically discover the confident neighborhoods of unlabeled samples in the latent space.
Our method achieves state-of-the-art results on two public datasets, i.e., CASIA-B and OU-LP.
arXiv Detail & Related papers (2021-02-09T03:07:07Z) - Diverse Knowledge Distillation for End-to-End Person Search [81.4926655119318]
Person search aims to localize and identify a specific person from a gallery of images.
Recent methods can be categorized into two groups, i.e., two-step and end-to-end approaches.
We propose a simple yet strong end-to-end network with diverse knowledge distillation to break the bottleneck.
arXiv Detail & Related papers (2020-12-21T09:04:27Z) - REFORM: Recognizing F-formations for Social Robots [4.833815605196964]
We introduce REFORM, a data-driven approach for detecting F-formations given human and agent positions and orientations.
We find that REFORM yielded improved accuracy over a state-of-the-art F-formation detection algorithm.
arXiv Detail & Related papers (2020-08-17T23:32:05Z) - Joint Inference of States, Robot Knowledge, and Human (False-)Beliefs [90.20235972293801]
Aiming to understand how human (false-)beliefs, a core socio-cognitive ability, affect human interactions with robots, this paper proposes a graphical model to represent object states, robot knowledge, and human (false-)beliefs.
An inference algorithm is derived to fuse the individual pg from each robot across multiple views into a joint pg, which affords more effective reasoning and overcomes errors originating from a single view.
arXiv Detail & Related papers (2020-04-25T23:02:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.