Future Aspects in Human Action Recognition: Exploring Emerging Techniques and Ethical Influences
- URL: http://arxiv.org/abs/2412.12990v2
- Date: Fri, 20 Dec 2024 08:19:03 GMT
- Title: Future Aspects in Human Action Recognition: Exploring Emerging Techniques and Ethical Influences
- Authors: Antonios Gasteratos, Stavros N. Moutsis, Konstantinos A. Tsintotas, Yiannis Aloimonos,
- Abstract summary: Visual-based human action recognition can be found in various application fields.
Next-generation hardware sensors will guide the robotics community in tackling the problem.
Human factors' involvement raises ethical issues for the research community.
- Score: 14.065014861242137
- License:
- Abstract: Visual-based human action recognition can be found in various application fields, e.g., surveillance systems, sports analytics, medical assistive technologies, or human-robot interaction frameworks, and it concerns the identification and classification of individuals' activities within a video. Since actions typically occur over a sequence of consecutive images, it is particularly challenging due to the inclusion of temporal analysis, which introduces an extra layer of complexity. However, although multiple approaches try to handle temporal analysis, there are still difficulties because of their computational cost and lack of adaptability. Therefore, different types of vision data, containing transition information between consecutive images, provided by next-generation hardware sensors will guide the robotics community in tackling the problem of human action recognition. On the other hand, while there is a plethora of still-image datasets, that researchers can adopt to train new artificial intelligence models, videos representing human activities are of limited capabilities, e.g., small and unbalanced datasets or selected without control from multiple sources. To this end, generating new and realistic synthetic videos is possible since labeling is performed throughout the data creation process, while reinforcement learning techniques can permit the avoidance of considerable dataset dependence. At the same time, human factors' involvement raises ethical issues for the research community, as doubts and concerns about new technologies already exist.
Related papers
- Generative Technology for Human Emotion Recognition: A Scope Review [11.578408396744237]
This survey aims to bridge the gaps in the existing literature by conducting a comprehensive analysis of over 320 research papers until June 2024.
It will introduce the mathematical principles of different generative models and the commonly used datasets.
It will provide an in-depth analysis of how generative techniques address emotion recognition based on different modalities.
arXiv Detail & Related papers (2024-07-04T05:22:55Z) - Data Augmentation in Human-Centric Vision [54.97327269866757]
This survey presents a comprehensive analysis of data augmentation techniques in human-centric vision tasks.
It delves into a wide range of research areas including person ReID, human parsing, human pose estimation, and pedestrian detection.
Our work categorizes data augmentation methods into two main types: data generation and data perturbation.
arXiv Detail & Related papers (2024-03-13T16:05:18Z) - Can AI Serve as a Substitute for Human Subjects in Software Engineering
Research? [24.39463126056733]
This vision paper proposes a novel approach to qualitative data collection in software engineering research by harnessing the capabilities of artificial intelligence (AI)
We explore the potential of AI-generated synthetic text as an alternative source of qualitative data.
We discuss the prospective development of new foundation models aimed at emulating human behavior in observational studies and user evaluations.
arXiv Detail & Related papers (2023-11-18T14:05:52Z) - Learning Human Action Recognition Representations Without Real Humans [66.61527869763819]
We present a benchmark that leverages real-world videos with humans removed and synthetic data containing virtual humans to pre-train a model.
We then evaluate the transferability of the representation learned on this data to a diverse set of downstream action recognition benchmarks.
Our approach outperforms previous baselines by up to 5%.
arXiv Detail & Related papers (2023-11-10T18:38:14Z) - Synthetic-to-Real Domain Adaptation for Action Recognition: A Dataset and Baseline Performances [76.34037366117234]
We introduce a new dataset called Robot Control Gestures (RoCoG-v2)
The dataset is composed of both real and synthetic videos from seven gesture classes.
We present results using state-of-the-art action recognition and domain adaptation algorithms.
arXiv Detail & Related papers (2023-03-17T23:23:55Z) - Adversarial Attention for Human Motion Synthesis [3.9378507882929563]
We present a novel method for controllable human motion synthesis by applying attention-based probabilistic deep adversarial models with end-to-end training.
We show that we can generate synthetic human motion over both short- and long-time horizons through the use of adversarial attention.
arXiv Detail & Related papers (2022-04-25T16:12:42Z) - Overcoming the Domain Gap in Neural Action Representations [60.47807856873544]
3D pose data can now be reliably extracted from multi-view video sequences without manual intervention.
We propose to use it to guide the encoding of neural action representations together with a set of neural and behavioral augmentations.
To reduce the domain gap, during training, we swap neural and behavioral data across animals that seem to be performing similar actions.
arXiv Detail & Related papers (2021-12-02T12:45:46Z) - What Matters in Learning from Offline Human Demonstrations for Robot
Manipulation [64.43440450794495]
We conduct an extensive study of six offline learning algorithms for robot manipulation.
Our study analyzes the most critical challenges when learning from offline human data.
We highlight opportunities for learning from human datasets.
arXiv Detail & Related papers (2021-08-06T20:48:30Z) - Preserving Privacy in Human-Motion Affect Recognition [4.753703852165805]
This work evaluates the effectiveness of existing methods at recognising emotions using both 3D temporal joint signals and manually extracted features.
We propose a cross-subject transfer learning technique for training a multi-encoder autoencoder deep neural network to learn disentangled latent representations of human motion features.
arXiv Detail & Related papers (2021-05-09T15:26:21Z) - Learning Predictive Models From Observation and Interaction [137.77887825854768]
Learning predictive models from interaction with the world allows an agent, such as a robot, to learn about how the world works.
However, learning a model that captures the dynamics of complex skills represents a major challenge.
We propose a method to augment the training set with observational data of other agents, such as humans.
arXiv Detail & Related papers (2019-12-30T01:10:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.