Deep Convolutional Neural Network for Roadway Incident Surveillance
Using Audio Data
- URL: http://arxiv.org/abs/2203.06059v1
- Date: Wed, 9 Mar 2022 13:42:56 GMT
- Title: Deep Convolutional Neural Network for Roadway Incident Surveillance
Using Audio Data
- Authors: Zubayer Islam, Mohamed Abdel-Aty
- Abstract summary: Crash event identification and prediction play a vital role in understanding safety conditions for transportation systems.
We propose the use of a novel sensory unit that can also accurately identify crash events: the microphone.
Four event types (crash, tire skid, horn, and siren sounds) can be accurately identified, giving an indication of a road hazard.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Crash event identification and prediction play a vital role in
understanding safety conditions for transportation systems. While existing
systems train classification models on traffic parameters correlated with crash
data, we propose the use of a novel sensory unit that can also
accurately identify crash events: the microphone. Audio events can be collected and
analyzed to classify events such as crashes. In this paper, we demonstrate
the use of a deep Convolutional Neural Network (CNN) for road event
classification. Important audio parameters such as Mel Frequency Cepstral
Coefficients (MFCC), log Mel-filterbank energy spectrum and Fourier Spectrum
were used as the feature set. Additionally, the dataset was augmented with more
sample data by the use of audio augmentation techniques such as time and pitch
shifting. Combined with the feature extraction, this data augmentation can
achieve reasonable accuracy. Four event types (crash, tire skid, horn, and
siren sounds) can be accurately identified, giving an indication of a road hazard
that can be useful for traffic operators or paramedics. The proposed
methodology can reach an accuracy of up to 94%. Such audio systems can be implemented
as part of an Internet of Things (IoT) platform that can complement
video-based sensors that lack complete coverage.
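The feature set and the time-shift augmentation named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the frame length, hop size, number of mel filters, number of cepstral coefficients, and every function name below are assumptions chosen for illustration. Pitch shifting is left out because it typically needs a resampling or phase-vocoder step.

```python
import numpy as np


def time_shift(signal, shift):
    """Circularly shift the waveform by `shift` samples (simple time-shift augmentation)."""
    return np.roll(signal, shift)


def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters evenly spaced on the mel scale, shape (n_filters, n_fft // 2 + 1)."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising edge of the triangle
            fbank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):     # falling edge of the triangle
            fbank[i - 1, k] = (right - k) / max(right - center, 1)
    return fbank


def extract_features(signal, sample_rate, n_fft=512, hop=256, n_filters=26, n_mfcc=13):
    """Return (Fourier power spectrum, log mel-filterbank energies, MFCCs) per frame."""
    # Slice the waveform into overlapping frames and apply a Hann window.
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)
    # Fourier power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    # Log mel-filterbank energies (small constant avoids log(0)).
    fbank = mel_filterbank(n_filters, n_fft, sample_rate)
    log_mel = np.log(power @ fbank.T + 1e-10)
    # MFCCs: unnormalized DCT-II of the log-mel energies, keeping the first n_mfcc.
    n = np.arange(n_filters)
    dct = np.cos(np.pi * np.arange(n_mfcc)[:, None] * (2 * n[None, :] + 1) / (2 * n_filters))
    mfcc = log_mel @ dct.T
    return power, log_mel, mfcc


# Hypothetical input: one second of a 440 Hz tone standing in for a recorded clip.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
clip = np.sin(2 * np.pi * 440.0 * t)
augmented = time_shift(clip, 1600)  # shift by 0.1 s
power, log_mel, mfcc = extract_features(augmented, sample_rate)
print(power.shape, log_mel.shape, mfcc.shape)
```

Each of the three arrays has one row per frame, so any of them (or a stack of them) could be treated as an image-like input to a CNN classifier. In practice an audio library would usually replace the hand-written filterbank and DCT.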
Related papers
- DiffSED: Sound Event Detection with Denoising Diffusion [70.18051526555512]
We reformulate the SED problem by taking a generative learning perspective.
Specifically, we aim to generate sound temporal boundaries from noisy proposals in a denoising diffusion process.
During training, our model learns to reverse the noising process by converting noisy latent queries to the groundtruth versions.
arXiv Detail & Related papers (2023-08-14T17:29:41Z)
- A Semi-Automated Corner Case Detection and Evaluation Pipeline [0.0]
Perception systems require large datasets for training their deep neural network.
Knowing which parts of the data in these datasets describe a corner case is an advantage during training or testing of the network.
We propose a pipeline that converts collective expert knowledge descriptions into the extended KI Absicherung ontology.
arXiv Detail & Related papers (2023-05-25T12:06:43Z)
- Driver Maneuver Detection and Analysis using Time Series Segmentation
and Classification [7.413735713939367]
This paper implements a methodology for automatically detecting vehicle maneuvers from vehicle telemetry data under naturalistic driving settings.
Our objective is to develop an end-to-end pipeline for frame-by-frame annotation of naturalistic driving studies videos.
arXiv Detail & Related papers (2022-11-10T03:38:50Z)
- Deep Spectro-temporal Artifacts for Detecting Synthesized Speech [57.42110898920759]
This paper provides an overall assessment of track 1 (Low-quality Fake Audio Detection) and track 2 (Partially Fake Audio Detection).
In this paper, spectro-temporal artifacts were detected using raw temporal signals, spectral features, as well as deep embedding features.
We ranked 4th and 5th in track 1 and track 2, respectively.
arXiv Detail & Related papers (2022-10-11T08:31:30Z)
- Disentangled Representation Learning for RF Fingerprint Extraction under
Unknown Channel Statistics [77.13542705329328]
We propose a framework of disentangled representation learning (DRL) that first learns to factor the input signals into a device-relevant component and a device-irrelevant component via adversarial learning.
The implicit data augmentation in the proposed framework imposes a regularization on the RFF extractor to avoid the possible overfitting of device-irrelevant channel statistics.
Experiments validate that the proposed approach, referred to as DR-RFF, outperforms conventional methods in terms of generalizability to unknown complicated propagation environments.
arXiv Detail & Related papers (2022-08-04T15:46:48Z)
- Audio-visual Representation Learning for Anomaly Events Detection in
Crowds [119.72951028190586]
This paper attempts to exploit multi-modal learning for modeling the audio and visual signals simultaneously.
We conduct the experiments on the SHADE dataset, a synthetic audio-visual dataset in surveillance scenes.
We find that introducing audio signals effectively improves the performance of anomaly event detection and outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2021-10-28T02:42:48Z)
- Robust Feature Learning on Long-Duration Sounds for Acoustic Scene
Classification [54.57150493905063]
Acoustic scene classification (ASC) aims to identify the type of scene (environment) in which a given audio signal is recorded.
We propose a robust feature learning (RFL) framework to train the CNN.
arXiv Detail & Related papers (2021-08-11T03:33:05Z)
- PILOT: Introducing Transformers for Probabilistic Sound Event
Localization [107.78964411642401]
This paper introduces a novel transformer-based sound event localization framework, where temporal dependencies in the received multi-channel audio signals are captured via self-attention mechanisms.
The framework is evaluated on three publicly available multi-source sound event localization datasets and compared against state-of-the-art methods in terms of localization error and event detection accuracy.
arXiv Detail & Related papers (2021-06-07T18:29:19Z)
- Automatic Detection of Major Freeway Congestion Events Using Wireless
Traffic Sensor Data: A Machine Learning Approach [0.0]
This paper introduces a machine learning based approach for reliable detection and characterization of highway traffic congestion events.
The speed data is initially time-windowed by a ten-hour long sliding window and fed into three Neural Networks.
The sliding window captures each slowdown event multiple times and results in increased confidence in congestion detection.
arXiv Detail & Related papers (2020-07-09T21:38:45Z)
- CURE Dataset: Ladder Networks for Audio Event Classification [15.850545634216484]
There are approximately 3M people with hearing loss who can't perceive events happening around them.
This paper establishes the CURE dataset, which contains a curated set of specific audio events most relevant for people with hearing loss.
arXiv Detail & Related papers (2020-01-12T09:35:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.