EchoVest: Real-Time Sound Classification and Depth Perception Expressed
  through Transcutaneous Electrical Nerve Stimulation
- URL: http://arxiv.org/abs/2307.04604v1
- Date: Mon, 10 Jul 2023 14:43:32 GMT
- Title: EchoVest: Real-Time Sound Classification and Depth Perception Expressed
  through Transcutaneous Electrical Nerve Stimulation
- Authors: Jesse Choe, Siddhant Sood, Ryan Park
- Abstract summary: We have developed a new assistive device, EchoVest, for blind/deaf people to intuitively become more aware of their environment.
EchoVest transmits vibrations to the user's body by utilizing transcutaneous electric nerve stimulation (TENS) based on the source of the sounds.
We aimed to outperform CNN-based models, the most commonly used machine-learning models for classification tasks, in both accuracy and computational cost.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract:   Over 1.5 billion people worldwide live with hearing impairment. Despite
various technologies that have been created for individuals with such
disabilities, most of these technologies are either extremely expensive or
inaccessible for everyday use in low- and middle-income countries. In order to
combat this issue, we have developed a new assistive device, EchoVest, for
blind/deaf people to intuitively become more aware of their environment.
EchoVest transmits vibrations to the user's body by utilizing transcutaneous
electric nerve stimulation (TENS) based on the source of the sounds. EchoVest
also provides various features, including sound localization, sound
classification, noise reduction, and depth perception. We aimed to outperform
CNN-based models, the most commonly used machine-learning models for
classification tasks, in both accuracy and computational cost. To do so,
we developed and employed a novel audio pipeline that adapts the Audio
Spectrogram Transformer (AST) model, an attention-based model, for our sound
classification purposes, and Fast Fourier Transforms for noise reduction. The
application of Otsu's Method helped us find the optimal thresholds for
background noise sound filtering and gave us much greater accuracy. In order to
calculate direction and depth accurately, we applied Complex Time Difference of
Arrival algorithms and SOTA localization. Our last improvement was to use blind
source separation to make our algorithms applicable to multiple microphone
inputs. The final algorithm achieved state-of-the-art results on numerous
checkpoints, including 95.7% accuracy on the ESC-50 dataset for
environmental sound classification.
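
The abstract mentions Otsu's Method for choosing a background-noise threshold but does not say which quantity is thresholded. As a minimal sketch, assuming the input is a list of per-frame energy values (the function name `otsu_threshold`, the `bins` parameter, and the sample energies are all hypothetical and not taken from the paper), the threshold that maximizes between-class variance could be found like this:

```python
def otsu_threshold(values, bins=64):
    """Return the split point that maximizes between-class variance.

    Illustrative only: assumes `values` are per-frame energies, which
    the paper's abstract does not confirm.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo  # all values identical, so no meaningful split exists
    width = (hi - lo) / bins

    # Build a histogram of the values.
    hist = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)
        hist[idx] += 1

    total = len(values)
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    sum_all = sum(c * h for c, h in zip(centers, hist))

    best_var, best_t = -1.0, lo
    w0 = 0      # number of samples in the lower (noise) class
    sum0 = 0.0  # sum of bin centers weighted by count in the lower class
    for i in range(bins - 1):
        w0 += hist[i]
        sum0 += centers[i] * hist[i]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_var = between
            best_t = lo + (i + 1) * width  # upper edge of bin i
    return best_t


# Hypothetical frame energies: five quiet frames, four loud frames.
energies = [0.01, 0.02, 0.015, 0.012, 0.018, 0.9, 1.1, 1.0, 0.95]
threshold = otsu_threshold(energies)
loud_frames = [e for e in energies if e > threshold]
```

Frames whose energy falls below the returned threshold would be treated as background noise in this sketch. How EchoVest itself applies the threshold is not described in the abstract.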
 
      
Related papers
- Quantized Approximate Signal Processing (QASP): Towards Homomorphic Encryption for audio [1.3584036432145363]
 Fully homomorphic encryption (FHE) offers a promising solution by enabling computations on encrypted data and preserving user privacy.
Here, we introduce a fully secure pipeline that computes, with FHE and quantized neural network operations.
Our methods also support the private computation of audio descriptors and convolutional neural network (CNN) classifiers.
 arXiv  Detail & Related papers  (2025-05-15T17:01:52Z)
- AADNet: Exploring EEG Spatiotemporal Information for Fast and Accurate Orientation and Timbre Detection of Auditory Attention Based on A Cue-Masked Paradigm [4.479495549911642]
 Auditory attention decoding from electroencephalogram (EEG) could infer to which source the user is attending in noisy environments.
This study proposed a cue-masked auditory attention paradigm to avoid information leakage before the experiment.
An end-to-end deep learning model, AADNet, was proposed to exploit the temporal information from the short time window EEG signals.
 arXiv  Detail & Related papers  (2025-01-07T06:51:17Z)
- DeepSpeech models show Human-like Performance and Processing of Cochlear Implant Inputs [12.234206036041218]
 We use the deep neural network (DNN) DeepSpeech2 as a paradigm to investigate how natural input and cochlear implant-based inputs are processed over time.
We generate naturalistic and cochlear implant-like inputs from spoken sentences and test the similarity of model performance to human performance.
We find that dynamics over time in each layer are affected by context as well as input type.
 arXiv  Detail & Related papers  (2024-07-30T04:32:27Z)
- What to Remember: Self-Adaptive Continual Learning for Audio Deepfake
  Detection [53.063161380423715]
 Existing detection models have shown remarkable success in discriminating known deepfake audio, but struggle when encountering new attack types.
We propose a continual learning approach called Radian Weight Modification (RWM) for audio deepfake detection.
 arXiv  Detail & Related papers  (2023-12-15T09:52:17Z)
- Utilizing synthetic training data for the supervised classification of
  rat ultrasonic vocalizations [0.0]
 Murine rodents generate ultrasonic vocalizations (USVs) with frequencies that extend to around 120kHz.
These calls are important in social behaviour, and so their analysis can provide insights into the function of vocal communication, and its dysfunction.
We compare the detection and classification performance of a trained human against two convolutional neural networks (CNNs), DeepSqueak and VocalMat, on audio containing rat USVs.
 arXiv  Detail & Related papers  (2023-03-03T03:17:45Z)
- Walking Noise: On Layer-Specific Robustness of Neural Architectures against Noisy Computations and Associated Characteristic Learning Dynamics [1.5184189132709105]
 We discuss the implications of additive, multiplicative and mixed noise for different classification tasks and model architectures.
We propose a methodology called Walking Noise which injects layer-specific noise to measure the robustness.
We conclude with a discussion of the use of this methodology in practice, among others, discussing its use for tailored multi-execution in noisy environments.
 arXiv  Detail & Related papers  (2022-12-20T17:09:08Z)
- High Fidelity Neural Audio Compression [92.4812002532009]
 We introduce a state-of-the-art real-time, high-fidelity audio codec leveraging neural networks.
It consists of a streaming encoder-decoder architecture with a quantized latent space trained in an end-to-end fashion.
We simplify and speed up the training by using a single multiscale spectrogram adversary.
 arXiv  Detail & Related papers  (2022-10-24T17:52:02Z)
- Fully Automated End-to-End Fake Audio Detection [57.78459588263812]
 This paper proposes a fully automated end-to-end fake audio detection method.
We first use the wav2vec pre-trained model to obtain a high-level representation of the speech.
For the network structure, we use a modified version of the differentiable architecture search (DARTS) named light-DARTS.
 arXiv  Detail & Related papers  (2022-08-20T06:46:55Z)
- Deep Feature Learning for Medical Acoustics [78.56998585396421]
 The purpose of this paper is to compare different learnable frontends in medical acoustics tasks.
A framework has been implemented to classify human respiratory sounds and heartbeats in two categories, i.e. healthy or affected by pathologies.
 arXiv  Detail & Related papers  (2022-08-05T10:39:37Z)
- End-to-End Binaural Speech Synthesis [71.1869877389535]
 We present an end-to-end speech synthesis system that combines a low-bitrate audio system with a powerful decoder.
We demonstrate the capability of the adversarial loss in capturing environment effects needed to create an authentic auditory scene.
 arXiv  Detail & Related papers  (2022-07-08T05:18:36Z)
- SpecGrad: Diffusion Probabilistic Model based Neural Vocoder with
  Adaptive Noise Spectral Shaping [51.698273019061645]
 SpecGrad adapts the diffusion noise so that its time-varying spectral envelope becomes close to the conditioning log-mel spectrogram.
It is processed in the time-frequency domain to keep the computational cost almost the same as the conventional DDPM-based neural vocoders.
 arXiv  Detail & Related papers  (2022-03-31T02:08:27Z)
- Neural Architecture Search for Energy Efficient Always-on Audio Models [1.3846912186423144]
 We present several changes to neural architecture searches (NAS) that improve the chance of success in practical situations.
We benchmark the performance of our search on real hardware, but since running thousands of tests with real hardware is difficult we use a random forest model to roughly predict the energy usage of a candidate network.
Our search, evaluated on a sound-event classification dataset based upon AudioSet, results in an order of magnitude less energy per inference and a much smaller memory footprint.
 arXiv  Detail & Related papers  (2022-02-09T06:10:18Z)
- Deep Neural Networks on EEG Signals to Predict Auditory Attention Score
  Using Gramian Angular Difference Field [1.9899603776429056]
 In some sense, the auditory attention score of an individual shows the focus the person can have in auditory tasks.
The recent advancements in deep learning and in the non-invasive technologies recording neural activity beg the question, can deep learning along with technologies such as electroencephalography (EEG) be used to predict the auditory attention score of an individual?
In this paper, we focus on this very problem of estimating a person's auditory attention level based on their brain's electrical activity captured using 14-channeled EEG signals.
 arXiv  Detail & Related papers  (2021-10-24T17:58:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
       
     