Artificial Intelligence for Cochlear Implants: Review of Strategies, Challenges, and Perspectives
- URL: http://arxiv.org/abs/2403.15442v2
- Date: Sun, 21 Jul 2024 21:33:33 GMT
- Title: Artificial Intelligence for Cochlear Implants: Review of Strategies, Challenges, and Perspectives
- Authors: Billel Essaid, Hamza Kheddar, Noureddine Batel, Muhammad E. H. Chowdhury, Abderrahmane Lakas,
- Abstract summary: This review aims to comprehensively cover advancements in CI-based ASR and speech enhancement, among other related aspects.
The review will delve into potential applications and suggest future directions to bridge existing research gaps in this domain.
- Score: 2.608119698700597
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automatic speech recognition (ASR) plays a pivotal role in our daily lives, offering utility not only for interacting with machines but also for facilitating communication for individuals with partial or profound hearing impairments. The process involves receiving the speech signal in analog form, followed by various signal processing algorithms to make it compatible with devices of limited capacities, such as cochlear implants (CIs). Unfortunately, these implants, equipped with a finite number of electrodes, often result in speech distortion during synthesis. Despite efforts by researchers to enhance received speech quality using various state-of-the-art (SOTA) signal processing techniques, challenges persist, especially in scenarios involving multiple sources of speech, environmental noise, and other adverse conditions. The advent of new artificial intelligence (AI) methods has ushered in cutting-edge strategies to address the limitations and difficulties associated with traditional signal processing techniques dedicated to CIs. This review aims to comprehensively cover advancements in CI-based ASR and speech enhancement, among other related aspects. The primary objective is to provide a thorough overview of metrics and datasets, exploring the capabilities of AI algorithms in this biomedical field, and summarizing and commenting on the best results obtained. Additionally, the review will delve into potential applications and suggest future directions to bridge existing research gaps in this domain.
Related papers
- SONAR: A Synthetic AI-Audio Detection Framework and Benchmark [59.09338266364506]
SONAR is a synthetic AI-Audio Detection Framework and Benchmark.
It aims to provide a comprehensive evaluation for distinguishing cutting-edge AI-synthesized auditory content.
It is the first framework to uniformly benchmark AI-audio detection across both traditional and foundation model-based deepfake detection systems.
arXiv Detail & Related papers (2024-10-06T01:03:42Z) - Advanced Clustering Techniques for Speech Signal Enhancement: A Review and Metanalysis of Fuzzy C-Means, K-Means, and Kernel Fuzzy C-Means Methods [0.6530047924748276]
Speech signal processing is tasked with improving the clarity and comprehensibility of audio data in noisy environments.
The quality of speech recognition directly impacts user experience and accessibility in technology-driven communication.
This review paper explores advanced clustering techniques, particularly focusing on the Kernel Fuzzy C-Means (KFCM) method.
arXiv Detail & Related papers (2024-09-28T20:21:05Z) - DeepSpeech models show Human-like Performance and Processing of Cochlear Implant Inputs [12.234206036041218]
We use the deep neural network (DNN) DeepSpeech2 as a paradigm to investigate how natural input and cochlear implant-based inputs are processed over time.
We generate naturalistic and cochlear implant-like inputs from spoken sentences and test the similarity of model performance to human performance.
We find that dynamics over time in each layer are affected by context as well as input type.
arXiv Detail & Related papers (2024-07-30T04:32:27Z) - A Comparative Study of Perceptual Quality Metrics for Audio-driven
Talking Head Videos [81.54357891748087]
We collect talking head videos generated from four generative methods.
We conduct controlled psychophysical experiments on visual quality, lip-audio synchronization, and head movement naturalness.
Our experiments validate consistency between model predictions and human annotations, identifying metrics that align better with human opinions than widely-used measures.
arXiv Detail & Related papers (2024-03-11T04:13:38Z) - ML-ASPA: A Contemplation of Machine Learning-based Acoustic Signal
Processing Analysis for Sounds, & Strains Emerging Technology [0.0]
This inquiry explores recent advancements and transformative potential within the domain of acoustics, specifically focusing on machine learning (ML) and deep learning.
ML adopts a data-driven approach, unveiling intricate relationships between features and desired labels or actions, as well as among features themselves.
The application of ML to expansive sets of training data facilitates the discovery of models elucidating complex acoustic phenomena such as human speech and reverberation.
arXiv Detail & Related papers (2023-12-18T03:04:42Z) - Generative AI for Physical Layer Communications: A Survey [76.61956357178295]
generative artificial intelligence (GAI) has the potential to enhance the efficiency of digital content production.
GAI's capability in analyzing complex data distributions offers great potential for wireless communications.
This paper presents a comprehensive investigation of GAI's applications for communications at the physical layer, ranging from traditional issues, including signal classification, channel estimation, and equalization, to emerging topics, such as intelligent reflecting surfaces and joint source channel coding.
arXiv Detail & Related papers (2023-12-09T15:20:56Z) - A Comprehensive Study on Artificial Intelligence Algorithms to Implement
Safety Using Communication Technologies [1.2710179245406195]
The study aims at providing a comprehensive picture of the state of the art AI based safety solutions that uses different communication technologies.
The results demonstrate that automotive domain is the one applying AI and communication the most to implement safety.
The use of non-cellular communication technologies is dominant however a clear trend of a rapid increase in the use of cellular communication is observed specially from 2020 with the roll-out of 5G technology.
arXiv Detail & Related papers (2022-05-17T14:38:38Z) - Signal Processing and Machine Learning Techniques for Terahertz Sensing:
An Overview [89.09270073549182]
Terahertz (THz) signal generation and radiation methods are shaping the future of wireless systems.
THz-specific signal processing techniques should complement this re-surged interest in THz sensing for efficient utilization of the THz band.
We present an overview of these techniques, with an emphasis on signal pre-processing.
We also address the effectiveness of deep learning techniques by exploring their promising sensing capabilities at the THz band.
arXiv Detail & Related papers (2021-04-09T01:38:34Z) - Silent Speech Interfaces for Speech Restoration: A Review [59.68902463890532]
Silent speech interface (SSI) research aims to provide alternative and augmentative communication methods for persons with severe speech disorders.
SSIs rely on non-acoustic biosignals generated by the human body during speech production to enable communication.
Most present-day SSIs have only been validated in laboratory settings for healthy users.
arXiv Detail & Related papers (2020-09-04T11:05:50Z) - Machine Learning in Nano-Scale Biomedical Engineering [77.75587007080894]
We review the existing research regarding the use of machine learning in nano-scale biomedical engineering.
The main challenges that can be formulated as ML problems are classified into the three main categories.
For each of the presented methodologies, special emphasis is given to its principles, applications, and limitations.
arXiv Detail & Related papers (2020-08-05T15:45:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.