Feature Pyramid biLSTM: Using Smartphone Sensors for Transportation Mode
Detection
- URL: http://arxiv.org/abs/2310.11087v1
- Date: Tue, 17 Oct 2023 09:13:10 GMT
- Title: Feature Pyramid biLSTM: Using Smartphone Sensors for Transportation Mode
Detection
- Authors: Qinrui Tang, Hao Cheng
- Abstract summary: We propose a novel end-to-end approach to explore a reduced amount of sensory data collected from a smartphone.
Our approach, called Feature Pyramid biLSTM (FPbiLSTM), is characterized by its ability to reduce the number of sensors required and processing demands.
FPbiLSTM extends an existing CNN biLSTM model with the Feature Pyramid Network, leveraging the advantages of both shallow layer richness and deeper layer feature resilience.
- Score: 5.182070755214674
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The widespread use of smartphones has made Inertial Measurement Units
broadly available, providing a wide range of sensory data that can
be advantageous for the detection of transportation modes. The objective of
this study is to propose a novel end-to-end approach to effectively explore a
reduced amount of sensory data collected from a smartphone to achieve accurate
mode detection in common daily traveling activities. Our approach, called
Feature Pyramid biLSTM (FPbiLSTM), is characterized by its ability to reduce
the number of sensors required and processing demands, resulting in a more
efficient modeling process than other current models without sacrificing the
quality of the outcomes. FPbiLSTM extends an existing CNN biLSTM model with
the Feature Pyramid Network, leveraging the advantages of both shallow layer
richness and deeper layer feature resilience for capturing temporal moving
patterns in various transportation modes. It exhibits an excellent performance
by employing the data collected from only three out of seven sensors, i.e.
accelerometers, gyroscopes, and magnetometers, in the 2018 Sussex-Huawei
Locomotion (SHL) challenge dataset, attaining a noteworthy accuracy of 95.1%
and an F1-score of 94.7% in detecting eight different transportation modes.
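The abstract describes combining shallow-layer and deeper-layer features through a Feature Pyramid Network. The following is a minimal sketch of the general feature-pyramid idea on a 1-D sensor sequence, not the authors' implementation: it assumes plain average pooling in place of learned convolutions, nearest-neighbour upsampling, element-wise addition as the merge, and a sequence length divisible by 2^(levels-1). All function names (`avg_pool`, `upsample_nearest`, `bottom_up`, `top_down`) are hypothetical.

```python
def avg_pool(seq, stride=2):
    """Reduce temporal resolution by averaging non-overlapping windows."""
    return [sum(seq[i:i + stride]) / stride
            for i in range(0, len(seq) - stride + 1, stride)]


def upsample_nearest(seq, factor=2):
    """Repeat every value `factor` times (nearest-neighbour upsampling)."""
    return [v for v in seq for _ in range(factor)]


def bottom_up(signal, levels=3):
    """Build a pyramid from fine (shallow) to coarse (deep) resolution.

    Assumption: pooling stands in for the learned CNN stages.
    """
    pyramid = [list(signal)]
    for _ in range(levels - 1):
        pyramid.append(avg_pool(pyramid[-1]))
    return pyramid


def top_down(pyramid):
    """Merge coarse features back into finer ones, coarsest level first.

    Each finer level receives the upsampled, already-merged coarser level
    via element-wise addition (the lateral connection is the identity here).
    """
    merged = [pyramid[-1]]
    for finer in reversed(pyramid[:-1]):
        up = upsample_nearest(merged[0])[:len(finer)]
        merged.insert(0, [f + u for f, u in zip(finer, up)])
    return merged


# Example: one accelerometer axis with 8 samples (made-up values).
signal = [1, 2, 3, 4, 5, 6, 7, 8]
features = top_down(bottom_up(signal, levels=3))
for level, feat in enumerate(features):
    print(level, feat)
```

In a model of the kind the abstract describes, the merged multi-scale features would then be passed to a biLSTM; that stage is omitted here.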
Related papers
- YOLO11-CR: a Lightweight Convolution-and-Attention Framework for Accurate Fatigue Driving Detection [0.0]
This paper introduces YOLO11-CR, a lightweight and efficient object detection model tailored for real-time fatigue monitoring.
YOLO11-CR introduces two key modules: the Convolution-and-Attention Fusion Module (CAFM) and the Rectangular Module (RCM).
Experiments on the DSM dataset demonstrated that YOLO11-CR achieves a precision of 87.17%, recall of 83.86%, mAP@50 of 88.09%, and mAP@50-95 of 55.93%.
arXiv Detail & Related papers (2025-08-16T07:19:04Z) - XTransfer: Modality-Agnostic Few-Shot Model Transfer for Human Sensing at the Edge [45.430391851892274]
XTransfer is a first-of-its-kind method enabling modality-agnostic, few-shot model transfer with resource-efficient design.
It achieves state-of-the-art performance while significantly reducing the costs of sensor data collection, model training, and edge deployment.
arXiv Detail & Related papers (2025-06-28T02:14:43Z) - SETransformer: A Hybrid Attention-Based Architecture for Robust Human Activity Recognition [7.291558599547268]
Human Activity Recognition (HAR) using wearable sensor data has become a central task in mobile computing, healthcare, and human-computer interaction.
We propose SETransformer, a hybrid deep neural architecture that combines Transformer-based temporal modeling with channel-wise squeeze-and-excitation (SE) attention and a learnable temporal attention pooling mechanism.
We evaluate SETransformer on the WISDM dataset and demonstrate that it significantly outperforms conventional models including LSTM, GRU, BiLSTM, and CNN baselines.
arXiv Detail & Related papers (2025-05-25T23:39:34Z) - Resource-Efficient Beam Prediction in mmWave Communications with Multimodal Realistic Simulation Framework [57.994965436344195]
Beamforming is a key technology in millimeter-wave (mmWave) communications that improves signal transmission by optimizing directionality and intensity.
Multimodal sensing-aided beam prediction has gained significant attention, using various sensing data to predict user locations or network conditions.
Despite its promising potential, the adoption of multimodal sensing-aided beam prediction is hindered by high computational complexity, high costs, and limited datasets.
arXiv Detail & Related papers (2025-04-07T15:38:25Z) - Exploring FMCW Radars and Feature Maps for Activity Recognition: A Benchmark Study [2.251010251400407]
This study introduces a Frequency-Modulated Continuous Wave radar-based framework for human activity recognition.
Unlike conventional approaches that process feature maps as images, this study feeds multi-dimensional feature maps as data vectors.
The ConvLSTM model outperformed conventional machine learning and deep learning models, achieving an accuracy of 90.51%.
arXiv Detail & Related papers (2025-03-07T17:53:29Z) - CNN Autoencoders for Hierarchical Feature Extraction and Fusion in Multi-sensor Human Activity Recognition [0.0]
We introduce a Hierarchically Unsupervised Fusion model designed to extract and fuse features from IMU sensor data.
The tuned model is applied to the UCI-HAR, DaLiAc, and Parkinson's disease gait datasets.
arXiv Detail & Related papers (2025-02-06T20:36:41Z) - Scaling Wearable Foundation Models [54.93979158708164]
We investigate the scaling properties of sensor foundation models across compute, data, and model size.
Using a dataset of up to 40 million hours of in-situ heart rate, heart rate variability, electrodermal activity, accelerometer, skin temperature, and altimeter per-minute data from over 165,000 people, we create LSM.
Our results establish the scaling laws of LSM for tasks such as imputation and extrapolation, both across time and sensor modalities.
arXiv Detail & Related papers (2024-10-17T15:08:21Z) - An LSTM Feature Imitation Network for Hand Movement Recognition from sEMG Signals [2.632402517354116]
We propose utilizing a feature-imitating network (FIN) for closed-form temporal feature learning over a 300ms signal window on Ninapro DB2.
We then explore transfer learning capabilities by applying the pre-trained LSTM-FIN for tuning to a downstream hand movement recognition task.
arXiv Detail & Related papers (2024-05-23T21:45:15Z) - Self-Supervised Multimodal Fusion Transformer for Passive Activity
Recognition [2.35066982314539]
Wi-Fi signals provide significant opportunities for human sensing and activity recognition in fields such as healthcare.
Current systems do not effectively exploit the information acquired through multiple sensors to recognise the different activities.
We propose the Fusion Transformer, an attention-based model for multimodal and multi-sensor fusion.
arXiv Detail & Related papers (2022-08-15T15:38:10Z) - Inertial Hallucinations -- When Wearable Inertial Devices Start Seeing
Things [82.15959827765325]
We propose a novel approach to multimodal sensor fusion for Ambient Assisted Living (AAL).
We address two major shortcomings of standard multimodal approaches, limited area coverage and reduced reliability.
Our new framework fuses the concept of modality hallucination with triplet learning to train a model with different modalities to handle missing sensors at inference time.
arXiv Detail & Related papers (2022-07-14T10:04:18Z) - Joint Spatial-Temporal and Appearance Modeling with Transformer for
Multiple Object Tracking [59.79252390626194]
We propose a novel solution named TransSTAM, which leverages Transformer to model both the appearance features of each object and the spatial-temporal relationships among objects.
The proposed method is evaluated on multiple public benchmarks including MOT16, MOT17, and MOT20, and it achieves a clear performance improvement in both IDF1 and HOTA.
arXiv Detail & Related papers (2022-05-31T01:19:18Z) - CROMOSim: A Deep Learning-based Cross-modality Inertial Measurement
Simulator [7.50015216403068]
Inertial measurement unit (IMU) data has been utilized in monitoring and assessment of human mobility.
To mitigate the data scarcity problem, we design CROMOSim, a cross-modality sensor simulator.
It simulates high fidelity virtual IMU sensor data from motion capture systems or monocular RGB cameras.
arXiv Detail & Related papers (2022-02-21T22:30:43Z) - Improved YOLOv5 network for real-time multi-scale traffic sign detection [4.5598087061051755]
We propose an improved feature pyramid model, named AF-FPN, which utilizes the adaptive attention module (AAM) and feature enhancement module (FEM) to reduce the information loss in the process of feature map generation.
We replace the original feature pyramid network in YOLOv5 with AF-FPN, which improves the detection performance for multi-scale targets of the YOLOv5 network.
arXiv Detail & Related papers (2021-12-16T11:02:12Z) - Object recognition for robotics from tactile time series data utilising
different neural network architectures [0.0]
This paper investigates the use of Convolutional Neural Networks (CNN) and Long-Short Term Memory (LSTM) neural network architectures for object classification on tactile data.
We compare these methods using data from two different fingertip sensors (namely the BioTac SP and WTS-FT) in the same physical setup.
The results show that the proposed method improves the maximum accuracy from 82.4% (BioTac SP fingertips) and 90.7% (WTS-FT fingertips) with complete time-series data to about 94% for both sensor types.
arXiv Detail & Related papers (2021-09-09T22:05:45Z) - SensiX++: Bringing MLOPs and Multi-tenant Model Serving to Sensory Edge
Devices [69.1412199244903]
We present a multi-tenant runtime for adaptive model execution with integrated MLOps on edge devices, e.g., a camera, a microphone, or IoT sensors.
SensiX++ operates on two fundamental principles - highly modular componentisation to externalise data operations with clear abstractions and document-centric manifestation for system-wide orchestration.
We report on the overall throughput and quantified benefits of various automation components of SensiX++ and demonstrate its efficacy to significantly reduce operational complexity and lower the effort to deploy, upgrade, reconfigure and serve embedded models on edge devices.
arXiv Detail & Related papers (2021-09-08T22:06:16Z) - Object Tracking through Residual and Dense LSTMs [67.98948222599849]
Deep learning-based trackers based on LSTMs (Long Short-Term Memory) recurrent neural networks have emerged as a powerful alternative.
DenseLSTMs outperform Residual and regular LSTM, and offer a higher resilience to nuisances.
Our case study supports the adoption of residual-based RNNs for enhancing the robustness of other trackers.
arXiv Detail & Related papers (2020-06-22T08:20:17Z) - ASFD: Automatic and Scalable Face Detector [129.82350993748258]
We propose a novel Automatic and Scalable Face Detector (ASFD).
ASFD is based on a combination of neural architecture search techniques as well as a new loss design.
Our ASFD-D6 outperforms the prior strong competitors, and our lightweight ASFD-D0 runs at more than 120 FPS with Mobilenet for VGA-resolution images.
arXiv Detail & Related papers (2020-03-25T06:00:47Z)