The AI Mechanic: Acoustic Vehicle Characterization Neural Networks
- URL: http://arxiv.org/abs/2205.09667v1
- Date: Thu, 19 May 2022 16:29:26 GMT
- Title: The AI Mechanic: Acoustic Vehicle Characterization Neural Networks
- Authors: Adam M. Terwilliger, Joshua E. Siegel
- Abstract summary: We introduce the AI mechanic, an acoustic vehicle characterization deep learning system, using sound captured from mobile devices.
We build a convolutional neural network that predicts and cascades vehicle attributes to enhance fault detection.
Our cascading architecture additionally achieved 93.6% validation and 86.8% test set accuracy on misfire fault prediction, demonstrating margins of 16.4% / 7.8% and 4.2% / 1.5% improvement over naïve and parallel baselines.
- Score: 1.8275108630751837
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In a world increasingly dependent on road-based transportation, it is
essential to understand vehicles. We introduce the AI mechanic, an acoustic
vehicle characterization deep learning system, as an integrated approach using
sound captured from mobile devices to enhance transparency and understanding of
vehicles and their condition for non-expert users. We develop and implement
novel cascading architectures for vehicle understanding, which we define as
sequential, conditional, multi-level networks that process raw audio to extract
highly-granular insights. To showcase the viability of cascading architectures,
we build a multi-task convolutional neural network that predicts and cascades
vehicle attributes to enhance fault detection. We train and test these models
on a synthesized dataset reflecting more than 40 hours of augmented audio and
achieve >92% validation set accuracy on attributes (fuel type, engine
configuration, cylinder count and aspiration type). Our cascading architecture
additionally achieved 93.6% validation and 86.8% test set accuracy on misfire
fault prediction, demonstrating margins of 16.4% / 7.8% and 4.2% / 1.5%
improvement over naïve and parallel baselines. We explore experimental
studies focused on acoustic features, data augmentation, feature fusion, and
data reliability. Finally, we conclude with a discussion of broader
implications, future directions, and application areas for this work.
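The cascading idea described in the abstract (predict vehicle attributes first, then condition fault detection on those predictions) can be illustrated with a minimal NumPy sketch. This is not the authors' network: the linear heads, the random weights, the feature size `FEATURE_DIM`, the per-attribute class counts, and the function names `init_cascade` and `cascade_forward` are all assumptions made for illustration, and the random `features` array stands in for the output of a convolutional audio encoder.

```python
import numpy as np

# Attribute heads named in the abstract; the class counts are illustrative assumptions.
ATTRIBUTE_CLASSES = {
    "fuel_type": 2,
    "engine_configuration": 3,
    "cylinder_count": 4,
    "aspiration_type": 2,
}
FEATURE_DIM = 16   # hypothetical size of the shared audio embedding
FAULT_CLASSES = 2  # misfire vs. normal (assumed binary)


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def init_cascade(rng):
    """Randomly initialise one linear head per attribute plus a fault head."""
    attr_heads = {
        name: rng.normal(scale=0.1, size=(FEATURE_DIM, n_classes))
        for name, n_classes in ATTRIBUTE_CLASSES.items()
    }
    # The fault head sees the shared features plus every attribute distribution.
    cascade_dim = FEATURE_DIM + sum(ATTRIBUTE_CLASSES.values())
    fault_head = rng.normal(scale=0.1, size=(cascade_dim, FAULT_CLASSES))
    return attr_heads, fault_head


def cascade_forward(features, attr_heads, fault_head):
    """Predict attributes first, then condition the fault head on them."""
    attr_probs = {name: softmax(features @ w) for name, w in attr_heads.items()}
    # Cascade step: concatenate attribute predictions with the shared features.
    cascaded = np.concatenate(
        [features] + [attr_probs[name] for name in ATTRIBUTE_CLASSES], axis=-1
    )
    fault_probs = softmax(cascaded @ fault_head)
    return attr_probs, fault_probs


rng = np.random.default_rng(0)
attr_heads, fault_head = init_cascade(rng)
features = rng.normal(size=(4, FEATURE_DIM))  # stand-in for CNN features of 4 clips
attr_probs, fault_probs = cascade_forward(features, attr_heads, fault_head)
```

A parallel baseline, by contrast, would compute `fault_probs` from `features` alone, so the only difference in this sketch is the concatenation step that feeds attribute predictions into the fault head.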
Related papers
- Contrastive Learning-Driven Traffic Sign Perception: Multi-Modal Fusion of Text and Vision [2.0720154517628417]
We propose a novel framework combining open-vocabulary detection and cross-modal learning.
For traffic sign detection, our NanoVerse YOLO model integrates a vision-language path aggregation network (RepVL-PAN) and an SPD-Conv module.
For traffic sign classification, we designed a Traffic Sign Recognition Multimodal Contrastive Learning model (TSR-MCL).
On the TT100K dataset, our method achieves a state-of-the-art 78.4% mAP in the long-tail detection task for all-class recognition.
arXiv Detail & Related papers (2025-07-31T08:23:30Z)
- Learning to Drive by Imitating Surrounding Vehicles [0.8902959815221526]
We study a data augmentation strategy that leverages the observed trajectories of nearby vehicles as additional demonstrations.
We introduce a simple vehicle-selection sampling and filtering strategy that prioritizes informative and diverse driving behaviors.
Specifically, the approach reduces collision rates and improves safety metrics compared to the baseline.
arXiv Detail & Related papers (2025-03-08T00:40:47Z)
- FollowGen: A Scaled Noise Conditional Diffusion Model for Car-Following Trajectory Prediction [9.2729178775419]
This study introduces a scaled noise conditional diffusion model for car-following trajectory prediction.
It integrates detailed inter-vehicular interactions and car-following dynamics into a generative framework, improving the accuracy and plausibility of predicted trajectories.
Experimental results on diverse real-world driving scenarios demonstrate the state-of-the-art performance and robustness of the proposed method.
arXiv Detail & Related papers (2024-11-23T23:13:45Z)
- Transforming In-Vehicle Network Intrusion Detection: VAE-based Knowledge Distillation Meets Explainable AI [0.0]
This paper introduces an advanced intrusion detection system (IDS) called KD-XVAE that uses a Variational Autoencoder (VAE)-based knowledge distillation approach.
Our model significantly reduces complexity, operating with just 1669 parameters and achieving an inference time of 0.3 ms per batch.
arXiv Detail & Related papers (2024-10-11T17:57:16Z)
- Advancements in Road Lane Mapping: Comparative Fine-Tuning Analysis of Deep Learning-based Semantic Segmentation Methods Using Aerial Imagery [16.522544814241495]
This research addresses the need for high-definition (HD) maps for autonomous vehicles (AVs).
Earth observation data offers valuable resources for map creation, but specialized models for road lane extraction are still underdeveloped in remote sensing.
In this study, we compare twelve foundational deep learning-based semantic segmentation models for road lane marking extraction from high-definition remote sensing images.
arXiv Detail & Related papers (2024-10-08T06:24:15Z)
- Optimizing LaneSegNet for Real-Time Lane Topology Prediction in Autonomous Vehicles [0.41942958779358663]
LaneSegNet is a new approach to lane topology prediction which integrates topological information with lane-line data.
This study explores optimizations to the LaneSegNet architecture through feature extractor modification and transformer encoder-decoder stack modification.
Our implementation, trained on a single NVIDIA Tesla A100 GPU, found that a 2:4 ratio reduced training time by 22.3% with only a 7.1% drop in mean average precision.
A 4:8 ratio increased training time by only 11.1% but improved mean average precision by a significant 23.7%.
arXiv Detail & Related papers (2024-06-22T21:49:12Z)
- Exploring Contextual Representation and Multi-Modality for End-to-End Autonomous Driving [58.879758550901364]
Recent perception systems enhance spatial understanding with sensor fusion but often lack full environmental context.
We introduce a framework that integrates three cameras to emulate the human field of view, coupled with top-down bird-eye-view semantic data to enhance contextual representation.
Our method achieves a displacement error of 0.67 m in open-loop settings, surpassing current methods by 6.9% on the nuScenes dataset.
arXiv Detail & Related papers (2022-10-13T05:56:20Z)
- Fully Automated End-to-End Fake Audio Detection [57.78459588263812]
This paper proposes a fully automated end-to-end fake audio detection method.
We first use a pre-trained wav2vec model to obtain a high-level representation of the speech.
For the network structure, we use a modified version of the differentiable architecture search (DARTS) named light-DARTS.
arXiv Detail & Related papers (2022-08-20T06:46:55Z)
- Attention-based Neural Network for Driving Environment Complexity Perception [123.93460670568554]
This paper proposes a novel attention-based neural network model to predict the complexity level of the surrounding driving environment.
It consists of a Yolo-v3 object detection algorithm, a heat map generation algorithm, CNN-based feature extractors, and attention-based feature extractors.
The proposed attention-based network achieves 91.22% average accuracy in classifying the complexity of the surrounding environment.
arXiv Detail & Related papers (2021-06-21T17:27:11Z)
- Fine-Grained Vehicle Perception via 3D Part-Guided Visual Data Augmentation [77.60050239225086]
We propose an effective training data generation process by fitting a 3D car model with dynamic parts to vehicles in real images.
Our approach is fully automatic without any human interaction.
We present a multi-task network for VUS parsing and a multi-stream network for VHI parsing.
arXiv Detail & Related papers (2020-12-15T03:03:38Z)
- Detecting 32 Pedestrian Attributes for Autonomous Vehicles [103.87351701138554]
In this paper, we address the problem of jointly detecting pedestrians and recognizing 32 pedestrian attributes.
We introduce a Multi-Task Learning (MTL) model relying on a composite field framework, which achieves both goals in an efficient way.
We show competitive detection and attribute recognition results, as well as a more stable MTL training.
arXiv Detail & Related papers (2020-12-04T15:10:12Z)
- Scene-Graph Augmented Data-Driven Risk Assessment of Autonomous Vehicle Decisions [1.4086978333609153]
We propose a novel data-driven approach that uses scene-graphs as intermediate representations.
Our approach includes a Multi-Relation Graph Convolution Network, a Long-Short Term Memory Network, and attention layers for modeling the subjective risk of driving maneuvers.
We show that our approach achieves a higher classification accuracy than the state-of-the-art approach on both large (96.4% vs. 91.2%) and small (91.8% vs. 71.2%)
We also show that our model trained on a synthesized dataset achieves an average accuracy of 87.8% when tested on a real-world dataset.
arXiv Detail & Related papers (2020-08-31T07:41:27Z)
- Towards Automated Neural Interaction Discovery for Click-Through Rate Prediction [64.03526633651218]
Click-Through Rate (CTR) prediction is one of the most important machine learning tasks in recommender systems.
We propose an automated interaction architecture discovering framework for CTR prediction named AutoCTR.
arXiv Detail & Related papers (2020-06-29T04:33:01Z)