Successes and critical failures of neural networks in capturing
human-like speech recognition
- URL: http://arxiv.org/abs/2204.03740v4
- Date: Wed, 19 Apr 2023 12:12:17 GMT
- Title: Successes and critical failures of neural networks in capturing
human-like speech recognition
- Authors: Federico Adolfi, Jeffrey S. Bowers, David Poeppel
- Abstract summary: Speech recognition is inherently robust in humans to a number of transformations at various spectrotemporal granularities.
We evaluate state-of-the-art neural networks as stimulus-computable, optimized observers.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Natural and artificial audition can in principle acquire different solutions
to a given problem. The constraints of the task, however, can nudge the
cognitive science and engineering of audition to qualitatively converge,
suggesting that a closer mutual examination would potentially enrich artificial
hearing systems and process models of the mind and brain. Speech recognition -
an area ripe for such exploration - is inherently robust in humans to a number of
transformations at various spectrotemporal granularities. To what extent are
these robustness profiles accounted for by high-performing neural network
systems? We bring together experiments in speech recognition under a single
synthesis framework to evaluate state-of-the-art neural networks as
stimulus-computable, optimized observers. In a series of experiments, we (1)
clarify how influential speech manipulations in the literature relate to each
other and to natural speech, (2) show the granularities at which machines
exhibit out-of-distribution robustness, reproducing classical perceptual
phenomena in humans, (3) identify the specific conditions where model
predictions of human performance differ, and (4) demonstrate a crucial failure
of all artificial systems to perceptually recover where humans do, suggesting
alternative directions for theory and model building. These findings encourage
a tighter synergy between the cognitive science and engineering of audition.
Related papers
- Exploring neural oscillations during speech perception via surrogate gradient spiking neural networks [59.38765771221084]
We present a physiologically inspired speech recognition architecture compatible and scalable with deep learning frameworks.
We show end-to-end gradient descent training leads to the emergence of neural oscillations in the central spiking neural network.
Our findings highlight the crucial inhibitory role of feedback mechanisms, such as spike frequency adaptation and recurrent connections, in regulating and synchronising neural activity to improve recognition performance.
arXiv Detail & Related papers (2024-04-22T09:40:07Z)
- Exploring mechanisms of Neural Robustness: probing the bridge between geometry and spectrum [0.0]
We study the link between representation smoothness and spectrum by using weight, Jacobian and spectral regularization.
Our research aims to understand the interplay between geometry, spectral properties, robustness, and expressivity in neural representations.
arXiv Detail & Related papers (2024-02-05T12:06:00Z)
- Brain-Inspired Machine Intelligence: A Survey of
Neurobiologically-Plausible Credit Assignment [65.268245109828]
We examine algorithms for conducting credit assignment in artificial neural networks that are inspired or motivated by neurobiology.
We organize the ever-growing set of brain-inspired learning schemes into six general families and consider these in the context of backpropagation of errors.
The results of this review are meant to encourage future developments in neuro-mimetic systems and their constituent learning processes.
arXiv Detail & Related papers (2023-12-01T05:20:57Z)
- A Neuro-mimetic Realization of the Common Model of Cognition via Hebbian
Learning and Free Energy Minimization [55.11642177631929]
Large neural generative models are capable of synthesizing semantically rich passages of text or producing complex images.
We discuss the COGnitive Neural GENerative system, an architecture that casts the Common Model of Cognition in terms of Hebbian learning and free energy minimization.
arXiv Detail & Related papers (2023-10-14T23:28:48Z)
- Data-driven emotional body language generation for social robotics [58.88028813371423]
In social robotics, endowing humanoid robots with the ability to generate bodily expressions of affect can improve human-robot interaction and collaboration.
We implement a deep learning data-driven framework that learns from a few hand-designed robotic bodily expressions.
The evaluation study found that the anthropomorphism and animacy of the generated expressions are not perceived differently from the hand-designed ones.
arXiv Detail & Related papers (2022-05-02T09:21:39Z)
- Predictive Coding and Stochastic Resonance: Towards a Unified Theory of
Auditory (Phantom) Perception [6.416574036611064]
To gain a mechanistic understanding of brain function, hypothesis-driven experiments should be accompanied by biologically plausible computational models.
With a special focus on tinnitus, we review recent work at the intersection of artificial intelligence, psychology, and neuroscience.
We conclude that two fundamental processing principles, both ubiquitous in the brain, best fit a vast number of experimental results.
arXiv Detail & Related papers (2022-04-07T10:47:58Z)
- The world seems different in a social context: a neural network analysis
of human experimental data [57.729312306803955]
We show that it is possible to replicate human behavioral data in both individual and social task settings by modifying the precision of prior and sensory signals.
An analysis of the neural activation traces of the trained networks provides evidence that information is coded in fundamentally different ways in the network in the individual and in the social conditions.
arXiv Detail & Related papers (2022-03-03T17:19:12Z)
- Deep Interpretable Models of Theory of Mind For Human-Agent Teaming [0.7734726150561086]
We develop an interpretable modular neural framework for modeling the intentions of other observed entities.
We demonstrate the efficacy of our approach with experiments on data from human participants on a search and rescue task in Minecraft.
arXiv Detail & Related papers (2021-04-07T06:18:58Z)
- Understanding Information Processing in Human Brain by Interpreting
Machine Learning Models [1.14219428942199]
The thesis explores the role machine learning methods play in creating intuitive computational models of neural processing.
This perspective makes the case for the larger role that an exploratory, data-driven approach to computational neuroscience could play.
arXiv Detail & Related papers (2020-10-17T04:37:26Z)
- Machine Common Sense [77.34726150561087]
Machine common sense remains a broad, potentially unbounded problem in artificial intelligence (AI).
This article deals with aspects of modeling commonsense reasoning, focusing on domains such as interpersonal interactions.
arXiv Detail & Related papers (2020-06-15T13:59:47Z)
- Bio-Inspired Modality Fusion for Active Speaker Detection [1.0644456464343592]
This paper presents a methodology for fusing correlated auditory and visual information for active speaker detection.
This ability can have a wide range of applications, from teleconferencing systems to social robotics.
arXiv Detail & Related papers (2020-02-28T20:56:24Z)