DHASP: Differentiable Hearing Aid Speech Processing
- URL: http://arxiv.org/abs/2103.08569v1
- Date: Mon, 15 Mar 2021 17:34:22 GMT
- Title: DHASP: Differentiable Hearing Aid Speech Processing
- Authors: Zehai Tu, Ning Ma, Jon Barker
- Abstract summary: An appropriate amplification fitting tuned for the listener's hearing disability is critical for good performance.
In this paper, we explore an alternative approach to finding the optimal fitting by introducing a hearing aid speech processing framework.
The framework is fully differentiable and can therefore employ the back-propagation algorithm for efficient, data-driven optimisation.
Our initial objective experiments show promising results for noise-free speech amplification, where the automatically optimised processors outperform a well-recognised hearing aid prescription.
- Score: 23.101074347473904
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Hearing aids are expected to improve speech intelligibility for listeners
with hearing impairment. An appropriate amplification fitting tuned for the
listener's hearing disability is critical for good performance. The
developments of most prescriptive fittings are based on data collected in
subjective listening experiments, which are usually expensive and
time-consuming. In this paper, we explore an alternative approach to finding
the optimal fitting by introducing a hearing aid speech processing framework,
in which the fitting is optimised in an automated way using an intelligibility
objective function based on the HASPI physiological auditory model. The
framework is fully differentiable, thus can employ the back-propagation
algorithm for efficient, data-driven optimisation. Our initial objective
experiments show promising results for noise-free speech amplification, where
the automatically optimised processors outperform one of the well recognised
hearing aid prescriptions.
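The core idea of the paper — tuning per-band amplification gains by gradient descent against a differentiable objective — can be sketched with a toy example. Everything below is illustrative: the band levels are made-up numbers, and the quadratic "audibility" loss is a hypothetical stand-in for the paper's HASPI-based intelligibility function, not its actual objective.

```python
import numpy as np

# Hypothetical per-frequency-band speech levels and desired aided levels (dB).
band_levels = np.array([60.0, 55.0, 40.0, 30.0])    # unaided input levels
target_levels = np.array([65.0, 65.0, 60.0, 55.0])  # desired aided levels

def loss(gains):
    """Quadratic penalty on the gap between aided and target levels
    (a stand-in for a differentiable intelligibility objective)."""
    return np.sum((band_levels + gains - target_levels) ** 2)

def grad(gains):
    """Analytic gradient of the loss -- what back-propagation would supply
    automatically in a fully differentiable framework."""
    return 2.0 * (band_levels + gains - target_levels)

gains = np.zeros_like(band_levels)  # start from a flat (0 dB) fitting
lr = 0.1
for _ in range(200):
    gains -= lr * grad(gains)       # plain gradient-descent update
```

For this toy loss the optimum is simply `target_levels - band_levels` per band; the point is only to show the optimisation loop that, in the paper, runs through a differentiable hearing aid processor and auditory model instead of a closed-form objective.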
Related papers
- Lightly Weighted Automatic Audio Parameter Extraction for the Quality
Assessment of Consensus Auditory-Perceptual Evaluation of Voice [18.8222742272435]
The proposed method utilizes age, sex, and five audio parameters: jitter, absolute jitter, shimmer, harmonic-to-noise ratio (HNR), and zero-crossing rate.
The result reveals that our approach performs similarly to state-of-the-art (SOTA) methods, and outperforms the latent representation obtained by using popular audio pre-trained models.
arXiv Detail & Related papers (2023-11-27T07:19:22Z)
- DASA: Difficulty-Aware Semantic Augmentation for Speaker Verification [55.306583814017046]
We present a novel difficulty-aware semantic augmentation (DASA) approach for speaker verification.
DASA generates diversified training samples in speaker embedding space with negligible extra computing cost.
The best result achieves a 14.6% relative reduction in EER on the CN-Celeb evaluation set.
arXiv Detail & Related papers (2023-10-18T17:07:05Z)
- Deep Feature Learning for Medical Acoustics [78.56998585396421]
The purpose of this paper is to compare different learnables in medical acoustics tasks.
A framework has been implemented to classify human respiratory sounds and heartbeats into two categories: healthy or affected by pathologies.
arXiv Detail & Related papers (2022-08-05T10:39:37Z)
- End-to-End Binaural Speech Synthesis [71.1869877389535]
We present an end-to-end speech synthesis system that combines a low-bitrate audio system with a powerful decoder.
We demonstrate the capability of the adversarial loss in capturing environment effects needed to create an authentic auditory scene.
arXiv Detail & Related papers (2022-07-08T05:18:36Z)
- MBI-Net: A Non-Intrusive Multi-Branched Speech Intelligibility Prediction Model for Hearing Aids [22.736703635666164]
We propose a multi-branched speech intelligibility prediction model (MBI-Net) for predicting subjective intelligibility scores of hearing aid (HA) users.
The outputs of the two branches are fused through a linear layer to obtain predicted speech intelligibility scores.
arXiv Detail & Related papers (2022-04-07T09:13:44Z)
- Towards Intelligibility-Oriented Audio-Visual Speech Enhancement [8.19144665585397]
We present a fully convolutional AV SE model that uses a modified short-time objective intelligibility (STOI) metric as a training cost function.
Our proposed I-O AV SE framework outperforms audio-only (AO) and AV models trained with conventional distance-based loss functions.
arXiv Detail & Related papers (2021-11-18T11:47:37Z)
- HASA-net: A non-intrusive hearing-aid speech assessment network [52.83357278948373]
We propose a DNN-based hearing aid speech assessment network (HASA-Net) to predict speech quality and intelligibility scores simultaneously.
To the best of our knowledge, HASA-Net is the first work to incorporate quality and intelligibility assessments utilizing a unified DNN-based non-intrusive model for hearing aids.
Experimental results show that the predicted speech quality and intelligibility scores of HASA-Net are highly correlated to two well-known intrusive hearing-aid evaluation metrics.
arXiv Detail & Related papers (2021-11-10T14:10:13Z)
- PL-EESR: Perceptual Loss Based End-to-End Robust Speaker Representation Extraction [90.55375210094995]
Speech enhancement aims to improve the perceptual quality of the speech signal by suppressing the background noise.
We propose an end-to-end deep learning framework, dubbed PL-EESR, for robust speaker representation extraction.
arXiv Detail & Related papers (2021-10-03T07:05:29Z)
- Personalization of Hearing Aid Compression by Human-In-Loop Deep Reinforcement Learning [3.402787708517184]
Existing prescriptive compression strategies used in hearing aid fitting are designed based on gain averages from a group of users, which are not necessarily optimal for a specific user.
This paper presents a human-in-loop deep reinforcement learning approach that personalizes hearing aid compression to achieve improved hearing perception.
arXiv Detail & Related papers (2020-07-01T02:50:33Z)
- Speech Enhancement using Self-Adaptation and Multi-Head Self-Attention [70.82604384963679]
This paper investigates a self-adaptation method for speech enhancement using auxiliary speaker-aware features.
We extract a speaker representation used for adaptation directly from the test utterance.
arXiv Detail & Related papers (2020-02-14T05:05:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.