Baby Physical Safety Monitoring in Smart Home Using Action Recognition
System
- URL: http://arxiv.org/abs/2210.12527v2
- Date: Sun, 30 Apr 2023 01:17:01 GMT
- Title: Baby Physical Safety Monitoring in Smart Home Using Action Recognition
System
- Authors: Victor Adewopo, Nelly Elsayed, Kelly Anderson
- Abstract summary: We present a novel framework that combines transfer learning techniques with a Conv2D LSTM layer to extract features from an I3D model pre-trained on the Kinetics dataset.
We developed a benchmark dataset and an automated model that uses LSTM convolution with I3D (ConvLSTM-I3D) for recognizing and predicting baby activities in a smart baby room.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Humans are able to intuitively deduce actions that took place between two
states in observations via deductive reasoning. This is because the brain
operates on a bidirectional communication model, which has radically improved
the accuracy of recognition and prediction based on features connected to
previous experiences. During the past decade, deep learning models for action
recognition have significantly improved. However, deep neural networks struggle
when only a small dataset is available for a specific Action Recognition (AR)
task. As with most action recognition tasks, the ambiguity of accurately
describing activities in spatial-temporal data is a drawback that can be
overcome by curating suitable datasets, including careful annotations and
preprocessing of video data for analyzing various recognition tasks. In this
study, we present a novel lightweight framework combining transfer learning
techniques with a Conv2D LSTM layer to extract features from an I3D model
pre-trained on the Kinetics dataset for a new AR task (Smart Baby Care) that
requires a smaller dataset and fewer computational resources. Furthermore, we
developed a benchmark dataset and an automated model that uses LSTM convolution
with I3D (ConvLSTM-I3D) for recognizing and predicting baby activities in a
smart baby room. Finally, we implemented video augmentation to improve model
performance on the smart baby care task. Compared to other benchmark models,
our experimental framework achieved better performance with fewer computational
resources.
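The abstract mentions video augmentation but does not say which transforms were used. As a minimal sketch only, assuming a random temporal crop and a horizontal flip (two common choices, not confirmed as the authors' method), clip-level augmentation could look like the following. The function name augment_clip and its parameters are hypothetical, and a clip is represented as a plain list of frames, each frame a list of pixel rows.

```python
import random


def augment_clip(clip, out_frames, flip_prob=0.5, rng=None):
    """Return a temporally cropped, optionally mirrored copy of `clip`.

    `clip` is a list of frames; each frame is a list of rows of pixel values.
    The transforms here are illustrative assumptions, not the paper's recipe.
    """
    rng = rng or random.Random()
    if out_frames > len(clip):
        raise ValueError("out_frames exceeds clip length")
    # Temporal crop: take one contiguous window so frame order is preserved.
    start = rng.randint(0, len(clip) - out_frames)
    window = clip[start:start + out_frames]
    # Decide the flip once per clip so every frame is mirrored the same way.
    flip = rng.random() < flip_prob
    return [[row[::-1] if flip else list(row) for row in frame]
            for frame in window]
```

The flip decision is drawn once per clip rather than once per frame; flipping frames independently would scramble the apparent motion that an action recognition model relies on.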
Related papers
- On Machine Learning Approaches for Protein-Ligand Binding Affinity Prediction [2.874893537471256]
This study evaluates the performance of classical tree-based models and advanced neural networks in protein-ligand binding affinity prediction.
We show that combining 2D and 3D model strengths improves active learning outcomes beyond current state-of-the-art approaches.
arXiv Detail & Related papers (2024-07-15T13:06:00Z) - 4D Contrastive Superflows are Dense 3D Representation Learners [62.433137130087445]
We introduce SuperFlow, a novel framework designed to harness consecutive LiDAR-camera pairs for establishing pretraining objectives.
To further boost learning efficiency, we incorporate a plug-and-play view consistency module that enhances alignment of the knowledge distilled from camera views.
arXiv Detail & Related papers (2024-07-08T17:59:54Z) - Predicting Infant Brain Connectivity with Federated Multi-Trajectory
GNNs using Scarce Data [54.55126643084341]
Existing deep learning solutions suffer from three major limitations.
We introduce FedGmTE-Net++, a federated graph-based multi-trajectory evolution network.
Using the power of federation, we aggregate local learnings among diverse hospitals with limited datasets.
arXiv Detail & Related papers (2024-01-01T10:20:01Z) - FILP-3D: Enhancing 3D Few-shot Class-incremental Learning with
Pre-trained Vision-Language Models [62.663113296987085]
Few-shot class-incremental learning aims to mitigate the catastrophic forgetting issue when a model is incrementally trained on limited data.
We introduce two novel components: the Redundant Feature Eliminator (RFE) and the Spatial Noise Compensator (SNC).
Considering the imbalance in existing 3D datasets, we also propose new evaluation metrics that offer a more nuanced assessment of a 3D FSCIL model.
arXiv Detail & Related papers (2023-12-28T14:52:07Z) - Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based
Action Recognition [88.34182299496074]
Action labels are only available on a source dataset, but unavailable on a target dataset in the training stage.
We utilize a self-supervision scheme to reduce the domain shift between two skeleton-based action datasets.
By segmenting and permuting temporal segments or human body parts, we design two self-supervised learning classification tasks.
arXiv Detail & Related papers (2022-07-17T07:05:39Z) - SSMTL++: Revisiting Self-Supervised Multi-Task Learning for Video
Anomaly Detection [108.57862846523858]
We revisit the self-supervised multi-task learning framework, proposing several updates to the original method.
We modernize the 3D convolutional backbone by introducing multi-head self-attention modules.
In our attempt to further improve the model, we study additional self-supervised learning tasks, such as predicting segmentation maps.
arXiv Detail & Related papers (2022-07-16T19:25:41Z) - 3D Convolutional with Attention for Action Recognition [6.238518976312625]
Current action recognition methods use computationally expensive models for learning temporal dependencies of the action.
This paper proposes a deep neural network architecture for learning such dependencies, consisting of a 3D convolutional layer, fully connected layers, and an attention layer.
The method first learns spatial and temporal features of actions through a 3D-CNN, and then the temporal attention mechanism helps the model attend to essential features.
arXiv Detail & Related papers (2022-06-05T15:12:57Z) - Non-local Graph Convolutional Network for joint Activity Recognition and
Motion Prediction [2.580765958706854]
3D skeleton-based motion prediction and activity recognition are two interwoven tasks in human behaviour analysis.
We propose a new way to combine the advantages of both graph convolutional neural networks and recurrent neural networks for joint human motion prediction and activity recognition.
arXiv Detail & Related papers (2021-08-03T14:07:10Z) - Transformer-Based Behavioral Representation Learning Enables Transfer
Learning for Mobile Sensing in Small Datasets [4.276883061502341]
We provide a neural architecture framework for mobile sensing data that can learn generalizable feature representations from time series.
This architecture combines benefits from CNN and Transformer architectures to enable better prediction performance.
arXiv Detail & Related papers (2021-07-09T22:26:50Z) - Learnable Online Graph Representations for 3D Multi-Object Tracking [156.58876381318402]
We propose a unified and learning based approach to the 3D MOT problem.
We employ a Neural Message Passing network for data association that is fully trainable.
We show the merit of the proposed approach on the publicly available nuScenes dataset by achieving state-of-the-art performance of 65.6% AMOTA and 58% fewer ID-switches.
arXiv Detail & Related papers (2021-04-23T17:59:28Z) - Self-supervised Human Activity Recognition by Learning to Predict
Cross-Dimensional Motion [16.457778420360537]
We propose the use of self-supervised learning for human activity recognition with smartphone accelerometer data.
First, the representations of unlabeled input signals are learned by training a deep convolutional neural network to predict a segment of accelerometer values.
For this task, we add a number of fully connected layers to the end of the frozen network and train the added layers with labeled accelerometer signals to learn to classify human activities.
arXiv Detail & Related papers (2020-10-21T02:14:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.