ElectrodeNet -- A Deep Learning Based Sound Coding Strategy for Cochlear
Implants
- URL: http://arxiv.org/abs/2305.16753v1
- Date: Fri, 26 May 2023 09:06:04 GMT
- Authors: Enoch Hsin-Ho Huang, Rong Chao, Yu Tsao, Chao-Min Wu
- Abstract summary: ElectrodeNet is a deep learning based sound coding strategy for the cochlear implant (CI).
The extended ElectrodeNet-CS strategy further incorporates channel selection (CS).
Network models of deep neural network (DNN), convolutional neural network (CNN), and long short-term memory (LSTM) were trained using the fast Fourier transform (FFT) bins and channel envelopes obtained from the processing of clean speech by the ACE strategy.
- Score: 9.468136300919062
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: ElectrodeNet, a deep learning based sound coding strategy for the cochlear
implant (CI), is proposed to emulate the advanced combination encoder (ACE)
strategy by replacing the conventional envelope detection using various
artificial neural networks. The extended ElectrodeNet-CS strategy further
incorporates channel selection (CS). Network models of deep neural network
(DNN), convolutional neural network (CNN), and long short-term memory (LSTM)
were trained using the fast Fourier transform (FFT) bins and channel envelopes
obtained from the processing of clean speech by the ACE strategy. Objective
speech understanding using short-time objective intelligibility (STOI) and
normalized covariance metric (NCM) was estimated for ElectrodeNet using CI
simulations. Sentence recognition tests for vocoded Mandarin speech were
conducted with normal-hearing listeners. DNN, CNN, and LSTM based ElectrodeNets
exhibited strong correlations with ACE in objective and subjective scores
using mean squared error (MSE), linear correlation coefficient (LCC), and
Spearman's
rank correlation coefficient (SRCC). The ElectrodeNet-CS strategy was capable
of producing N-of-M compatible electrode patterns using a modified DNN network
to embed maxima selection, and performed on par with or even slightly above
ACE in average STOI and sentence recognition scores. The methods and
findings demonstrated the feasibility and potential of using deep learning in
CI coding strategies.
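The two core ideas in the abstract, N-of-M maxima selection and the correlation metrics used to compare strategies, can be sketched in plain Python. This is an illustrative sketch only: the paper embeds maxima selection inside a modified DNN (ElectrodeNet-CS), whereas here it is shown as a simple post-hoc step on one envelope frame, and the helper names (`select_maxima`, `lcc`) are hypothetical, not from the paper.

```python
def select_maxima(envelopes, n=8):
    """N-of-M maxima selection: keep the n largest channel envelopes in
    a frame and zero the rest, yielding an N-of-M compatible electrode
    pattern (hypothetical post-hoc helper, not the paper's in-network
    implementation)."""
    if n >= len(envelopes):
        return list(envelopes)
    # Indices of the n largest envelope values in this frame.
    keep = set(sorted(range(len(envelopes)),
                      key=lambda i: envelopes[i], reverse=True)[:n])
    return [e if i in keep else 0.0 for i, e in enumerate(envelopes)]

def lcc(x, y):
    """Linear correlation coefficient (Pearson r), one of the metrics
    (alongside MSE and SRCC) used to compare ElectrodeNet with ACE."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# One 22-channel envelope frame; keep the 8 largest (8-of-22, ACE-style).
frame = [0.02, 0.40, 0.11, 0.90, 0.05, 0.33, 0.70, 0.01,
         0.25, 0.60, 0.08, 0.15, 0.45, 0.03, 0.80, 0.12,
         0.55, 0.07, 0.20, 0.65, 0.09, 0.30]
pattern = select_maxima(frame, n=8)
print(sum(1 for v in pattern if v > 0.0))  # 8 active electrodes
```

In the actual ACE strategy the selected envelopes are additionally compressed and mapped to stimulation current levels; the sketch stops at channel selection, which is the part ElectrodeNet-CS learns to emulate.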
Related papers
- Multiscale fusion enhanced spiking neural network for invasive BCI neural signal decoding [13.108613110379961]
This paper presents a novel approach utilizing a Multiscale Fusion Spiking Neural Network (MFSNN).
MFSNN emulates the parallel processing and multiscale feature fusion seen in human visual perception to enable real-time, efficient, and energy-conserving neural signal decoding.
MFSNN surpasses traditional artificial neural network methods, such as enhanced GRU, in both accuracy and computational efficiency.
arXiv Detail & Related papers (2024-09-14T09:53:30Z)
- Spiking Neural Network Decision Feedback Equalization [70.3497683558609]
We propose an SNN-based equalizer with a feedback structure akin to the decision feedback equalizer (DFE).
We show that our approach clearly outperforms conventional linear equalizers for three different exemplary channels.
The proposed SNN with a decision feedback structure enables the path to competitive energy-efficient transceivers.
arXiv Detail & Related papers (2022-11-09T09:19:15Z)
- Bayesian Neural Network Language Modeling for Speech Recognition [59.681758762712754]
State-of-the-art neural network language models (NNLMs), represented by long short-term memory recurrent neural networks (LSTM-RNNs) and Transformers, are becoming highly complex.
In this paper, an overarching full Bayesian learning framework is proposed to account for the underlying uncertainty in LSTM-RNN and Transformer LMs.
arXiv Detail & Related papers (2022-08-28T17:50:19Z)
- Convolutional Spiking Neural Networks for Detecting Anticipatory Brain Potentials Using Electroencephalogram [0.21847754147782888]
Spiking neural networks (SNNs) are receiving increased attention because they mimic synaptic connections in biological systems and produce spike trains.
Recently, convolutional layers have been added to combine the feature extraction power of convolutional networks with the computational efficiency of SNNs.
This paper studies the feasibility of using a convolutional spiking neural network (CSNN) to detect anticipatory slow cortical potentials.
arXiv Detail & Related papers (2022-08-14T19:04:15Z)
- Low-bit Quantization of Recurrent Neural Network Language Models Using Alternating Direction Methods of Multipliers [67.688697838109]
This paper presents a novel method to train quantized RNNLMs from scratch using alternating direction methods of multipliers (ADMM).
Experiments on two tasks suggest the proposed ADMM quantization achieved a model size compression factor of up to 31 times over the full precision baseline RNNLMs.
arXiv Detail & Related papers (2021-11-29T09:30:06Z)
- BioLCNet: Reward-modulated Locally Connected Spiking Neural Networks [0.6193838300896449]
We propose a spiking neural network (SNN) trained using spike-timing-dependent plasticity (STDP) and its reward-modulated variant (R-STDP) learning rules.
Our network consists of a rate-coded input layer followed by a locally connected hidden layer and a decoding output layer.
We used the MNIST dataset to obtain image classification accuracy and to assess the robustness of our rewarding system to varying target responses.
arXiv Detail & Related papers (2021-09-12T15:28:48Z)
- Decentralizing Feature Extraction with Quantum Convolutional Neural Network for Automatic Speech Recognition [101.69873988328808]
We build upon a quantum convolutional neural network (QCNN) composed of a quantum circuit encoder for feature extraction.
The input speech is first up-streamed to a quantum computing server to extract a Mel-spectrogram.
The corresponding convolutional features are encoded using a quantum circuit algorithm with random parameters.
The encoded features are then down-streamed to the local RNN model for the final recognition.
arXiv Detail & Related papers (2020-10-26T03:36:01Z)
- Attention Driven Fusion for Multi-Modal Emotion Recognition [39.295892047505816]
We present a deep learning-based approach to exploit and fuse text and acoustic data for emotion classification.
We use a SincNet layer, based on parameterized sinc functions with band-pass filters, to extract acoustic features from raw audio followed by a DCNN.
For text processing, we use two branches in parallel (a DCNN, and a bidirectional RNN followed by a DCNN), where cross attention is introduced to infer the N-gram level correlations.
arXiv Detail & Related papers (2020-09-23T08:07:58Z)
- Multi-Tones' Phase Coding (MTPC) of Interaural Time Difference by Spiking Neural Network [68.43026108936029]
We propose a pure spiking neural network (SNN) based computational model for precise sound localization in the noisy real-world environment.
We implement this algorithm in a real-time robotic system with a microphone array.
The experimental results show a mean azimuth error of 13 degrees, surpassing the accuracy of other biologically plausible neuromorphic approaches for sound source localization.
arXiv Detail & Related papers (2020-07-07T08:22:56Z)
- Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
- Exploring Pre-training with Alignments for RNN Transducer based End-to-End Speech Recognition [39.497407288772386]
The recurrent neural network transducer (RNN-T) architecture has become an emerging trend in end-to-end automatic speech recognition research.
In this work, we leverage external alignments to seed the RNN-T model.
Two different pre-training solutions are explored, referred to as encoder pre-training and whole-network pre-training, respectively.
arXiv Detail & Related papers (2020-05-01T19:00:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.