A Comprehensive Comparison of Neural Networks as Cognitive Models of
Inflection
- URL: http://arxiv.org/abs/2210.12321v1
- Date: Sat, 22 Oct 2022 00:59:40 GMT
- Title: A Comprehensive Comparison of Neural Networks as Cognitive Models of
Inflection
- Authors: Adam Wiemerslage and Shiran Dudy and Katharina Kann
- Abstract summary: We study the correlation between human judgments and neural network probabilities for unknown word inflections.
We find evidence that the Transformer may be a better account of human behavior than LSTMs.
- Score: 20.977461918631928
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural networks have long been at the center of a debate around the cognitive
mechanism by which humans process inflectional morphology. This debate has
gravitated into NLP by way of the question: Are neural networks a feasible
account for human behavior in morphological inflection? We address that
question by measuring the correlation between human judgments and neural
network probabilities for unknown word inflections. We test a larger range of
architectures than previously studied on two important tasks for the cognitive
processing debate: English past tense, and German number inflection. We find
evidence that the Transformer may be a better account of human behavior than
LSTMs on these datasets, and that LSTM features known to increase inflection
accuracy do not always result in more human-like behavior.
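To make the evaluation concrete, here is a minimal sketch of correlating human judgments with model probabilities; the ratings and log-probabilities below are hypothetical placeholders, not the paper's data.

```python
# Minimal sketch of the evaluation the abstract describes: correlate human
# acceptability ratings for nonce-word inflections with a model's
# log-probabilities. All numbers below are hypothetical placeholders.
from scipy.stats import spearmanr

# Hypothetical mean human ratings for candidate past-tense forms
# of nonce verbs (e.g., "splinged" vs. "splung").
human_ratings = [6.1, 4.8, 2.3, 5.5, 1.9, 3.7]

# Hypothetical model log-probabilities for the same candidate forms,
# e.g., summed token log-probs from a Transformer inflection model.
model_logprobs = [-2.1, -3.4, -7.8, -2.9, -8.2, -5.0]

rho, p = spearmanr(human_ratings, model_logprobs)
print(f"Spearman rho = {rho:.3f} (p = {p:.3g})")
```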
Related papers
- Exploring Behavior-Relevant and Disentangled Neural Dynamics with Generative Diffusion Models [2.600709013150986]
Understanding the neural basis of behavior is a fundamental goal in neuroscience.
Our approach, named BeNeDiff, first identifies a fine-grained and disentangled neural subspace.
It then employs state-of-the-art generative diffusion models to synthesize behavior videos that interpret the neural dynamics of each latent factor.
arXiv Detail & Related papers (2024-10-12T18:28:56Z)
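As a loose stand-in for BeNeDiff's subspace-identification step, the sketch below projects synthetic population activity onto a low-dimensional subspace with plain PCA; the actual method learns a disentangled subspace, and the diffusion-based video synthesis is not reproduced.

```python
# Loose stand-in for the subspace-identification step: project neural
# population activity onto a low-dimensional subspace. BeNeDiff itself
# learns a *disentangled* subspace; plain PCA here is only a placeholder.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
activity = rng.standard_normal((1000, 128))  # (timepoints, neurons), synthetic

pca = PCA(n_components=8)
latents = pca.fit_transform(activity)        # (timepoints, 8) latent factors
print(latents.shape, pca.explained_variance_ratio_.round(3))
# Each latent factor's trajectory would then condition a generative
# diffusion model to synthesize behavior videos (not sketched here).
```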
- Cognitive Networks and Performance Drive fMRI-Based State Classification Using DNN Models [0.0]
We employ two structurally different and complementary DNN-based models to classify individual cognitive states.
We show that despite the architectural differences, both models consistently produce a robust relationship between prediction accuracy and individual cognitive performance.
arXiv Detail & Related papers (2024-08-14T15:25:51Z)
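A hedged sketch of the accuracy-performance relationship on fully synthetic data; the paper's DNN architectures and fMRI features are not reproduced.

```python
# Hypothetical sketch: relate per-subject decoding accuracy to cognitive
# performance. Data are synthetic; better "performers" get more separable
# brain states by construction, so accuracy should track performance.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from scipy.stats import pearsonr

rng = np.random.default_rng(1)
performance = rng.uniform(0.4, 1.0, size=20)
accuracies = []

for subj_perf in performance:
    X = rng.standard_normal((80, 50))        # (trials, synthetic features)
    y = rng.integers(0, 2, size=80)          # two cognitive states
    X[y == 1] += subj_perf                   # separation scales with performance
    clf = LogisticRegression(max_iter=1000)
    accuracies.append(cross_val_score(clf, X, y, cv=5).mean())

r, p = pearsonr(performance, accuracies)
print(f"accuracy~performance correlation r = {r:.2f}")
```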
- Too Big to Fail: Larger Language Models are Disproportionately Resilient to Induction of Dementia-Related Linguistic Anomalies [7.21603206617401]
We show that larger GPT-2 models require a disproportionately larger share of attention heads to be masked/ablated to display a similar magnitude of degradation.
These results suggest that the attention mechanism in transformer models may present an analogue to the notions of cognitive and brain reserve.
arXiv Detail & Related papers (2024-06-05T00:31:50Z)
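A hedged sketch of the masking protocol using Hugging Face's `head_mask` argument for GPT-2; the paper's model sizes and head-selection strategy may differ.

```python
# Zero out a growing fraction of GPT-2 attention heads via the `head_mask`
# argument and track perplexity degradation. Random head selection here is
# a simplification of whatever strategy the paper uses.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

model = GPT2LMHeadModel.from_pretrained("gpt2").eval()
tok = GPT2TokenizerFast.from_pretrained("gpt2")
ids = tok("The patient repeated the same story twice.", return_tensors="pt").input_ids

for frac in (0.0, 0.25, 0.5):
    mask = torch.ones(model.config.n_layer, model.config.n_head)
    n_off = int(frac * mask.numel())
    mask.view(-1)[torch.randperm(mask.numel())[:n_off]] = 0.0  # ablate heads
    with torch.no_grad():
        loss = model(ids, labels=ids, head_mask=mask).loss
    print(f"masked {frac:.0%} of heads -> perplexity {loss.exp():.1f}")
```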
- Automated Natural Language Explanation of Deep Visual Neurons with Large Models [43.178568768100305]
This paper proposes a novel post-hoc framework for generating semantic explanations of neurons with large foundation models.
Our framework is designed to be compatible with various model architectures and datasets, enabling automated and scalable neuron interpretation.
arXiv Detail & Related papers (2023-10-16T17:04:51Z)
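A minimal sketch of one ingredient such frameworks rely on: collecting a target neuron's top-activating inputs, which a large model would then describe. The model, hook target, and inputs below are illustrative stand-ins.

```python
# Record a target channel's activations over a batch of inputs and keep
# the top-activating ones; a foundation model would then generate a
# natural-language explanation from these exemplars (omitted here).
import torch
from torchvision.models import resnet18

model = resnet18(weights=None).eval()
acts = []

def hook(_module, _inputs, out):
    acts.append(out[:, 7].mean(dim=(1, 2)))   # channel 7 of layer4's output

model.layer4.register_forward_hook(hook)
images = torch.randn(32, 3, 224, 224)          # stand-in for a probing dataset
with torch.no_grad():
    model(images)

top = acts[0].topk(5).indices.tolist()
print("indices of top-activating images:", top)
```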
- Fixed Inter-Neuron Covariability Induces Adversarial Robustness [26.878913741674058]
The vulnerability to adversarial perturbations is a major flaw of deep neural networks (DNNs).
We have developed the Self-Consistent Activation (SCA) layer, which comprises neurons whose activations are consistent with each other, as they conform to a fixed, but learned, covariability pattern.
Models with an SCA layer achieved high accuracy and exhibited significantly greater robustness than multi-layer perceptron models to state-of-the-art Auto-PGD adversarial attacks, without being trained on adversarially perturbed data.
arXiv Detail & Related papers (2023-08-07T23:46:14Z)
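A speculative sketch of the idea as summarized: neuron activations generated through a fixed, learned mixing matrix so that their covariability pattern is pinned. This is a guess at the mechanism, not the paper's SCA definition.

```python
# Speculative sketch of fixed inter-neuron covariability: activations are
# a fixed (learned, then frozen) mixing of a smaller set of latent
# responses, so the neurons' covariance structure is pinned to mix @ mix.T.
import torch
import torch.nn as nn

class FixedCovariabilityLayer(nn.Module):
    def __init__(self, in_dim, n_latent, n_neurons):
        super().__init__()
        self.encode = nn.Linear(in_dim, n_latent)
        mix = torch.randn(n_neurons, n_latent) / n_latent ** 0.5
        self.register_buffer("mix", mix)     # fixed covariability pattern

    def forward(self, x):
        z = torch.relu(self.encode(x))       # free, input-dependent latents
        return z @ self.mix.T                # neurons co-vary per `mix`

layer = FixedCovariabilityLayer(in_dim=64, n_latent=8, n_neurons=256)
print(layer(torch.randn(5, 64)).shape)       # torch.Size([5, 256])
```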
- Addressing caveats of neural persistence with deep graph persistence [54.424983583720675]
We find that the variance of network weights and spatial concentration of large weights are the main factors that impact neural persistence.
We propose an extension of the filtration underlying neural persistence to the whole neural network instead of single layers.
This yields our deep graph persistence measure, which implicitly incorporates persistent paths through the network and alleviates variance-related issues.
arXiv Detail & Related papers (2023-07-20T13:34:11Z)
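A simplified sketch of the single-layer neural persistence measure this paper starts from, computed via union-find over a bipartite weight filtration; the deep graph persistence extension and exact normalization details are omitted.

```python
# Simplified neural persistence for one layer: 0-dimensional persistence
# of the bipartite weight graph, from its maximum spanning forest.
import numpy as np

def neural_persistence(W):
    n_out, n_in = W.shape
    w = np.abs(W) / np.abs(W).max()          # normalize weights to [0, 1]
    parent = list(range(n_out + n_in))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]    # path halving
            a = parent[a]
        return a

    # Add edges in decreasing weight; each component merge yields a
    # persistence pair with lifetime (1 - w_e).
    edges = sorted(
        ((w[i, j], i, n_out + j) for i in range(n_out) for j in range(n_in)),
        reverse=True,
    )
    pers = []
    for weight, a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            pers.append(1.0 - weight)
    return np.linalg.norm(pers)              # 2-norm of the diagram

print(neural_persistence(np.random.randn(16, 32)))
```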
- Spiking neural network for nonlinear regression [68.8204255655161]
Spiking neural networks carry the potential for a massive reduction in memory and energy consumption.
They introduce temporal and neuronal sparsity, which can be exploited by next-generation neuromorphic hardware.
A framework for regression using spiking neural networks is proposed.
arXiv Detail & Related papers (2022-10-06T13:04:45Z)
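A minimal leaky integrate-and-fire simulation to give a flavor of the spiking dynamics involved; the paper's regression framework itself (e.g., training with surrogate gradients) is not reproduced.

```python
# Single leaky integrate-and-fire (LIF) neuron, the building block of
# spiking networks: the membrane potential integrates input current,
# leaks toward rest, and resets after crossing threshold.
import numpy as np

def lif(current, dt=1e-3, tau=0.02, v_th=1.0, v_reset=0.0):
    """Simulate one LIF neuron; returns membrane trace and spike times."""
    v, trace, spikes = v_reset, [], []
    for t, i_t in enumerate(current):
        v += dt / tau * (-v + i_t)           # leaky integration (Euler step)
        if v >= v_th:                        # threshold crossing -> spike
            spikes.append(t * dt)
            v = v_reset
        trace.append(v)
    return np.array(trace), spikes

trace, spike_times = lif(np.full(500, 1.5))
print(f"{len(spike_times)} spikes in 0.5 s of constant drive")
```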
- Neural Language Models are not Born Equal to Fit Brain Data, but Training Helps [75.84770193489639]
We examine the impact of test loss, training corpus and model architecture on the prediction of functional Magnetic Resonance Imaging timecourses of participants listening to an audiobook.
We find that untrained versions of each model already explain a significant amount of signal in the brain by capturing similarity in brain responses across identical words.
We suggest good practices for future studies aiming at explaining the human language system using neural language models.
arXiv Detail & Related papers (2022-07-07T15:37:17Z)
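A standard encoding-model sketch in this spirit: ridge regression from language-model features to fMRI timecourses, scored per voxel. All arrays are synthetic stand-ins for real features and BOLD data.

```python
# Ridge-regress synthetic "fMRI" timecourses onto synthetic language-model
# features and report held-out per-voxel correlation, the usual encoding
# metric. Real studies would block-split timecourses rather than shuffle.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
features = rng.standard_normal((600, 256))   # (timepoints, model dims)
W_true = rng.standard_normal((256, 40)) * 0.1
bold = features @ W_true + rng.standard_normal((600, 40))  # (timepoints, voxels)

X_tr, X_te, y_tr, y_te = train_test_split(features, bold, test_size=0.2,
                                          random_state=0)
enc = Ridge(alpha=10.0).fit(X_tr, y_tr)
pred = enc.predict(X_te)

r = [np.corrcoef(pred[:, v], y_te[:, v])[0, 1] for v in range(bold.shape[1])]
print(f"median voxel correlation: {np.median(r):.2f}")
```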
- Searching for the Essence of Adversarial Perturbations [73.96215665913797]
We show that adversarial perturbations contain human-recognizable information, which is the key conspirator responsible for a neural network's erroneous prediction.
This concept of human-recognizable information allows us to explain key features related to adversarial perturbations.
arXiv Detail & Related papers (2022-05-30T18:04:57Z)
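A basic FGSM sketch that exposes the perturbation itself as an object one could inspect; the untrained model and random input are placeholders for the paper's actual setup.

```python
# Fast Gradient Sign Method (FGSM): the perturbation is the sign of the
# loss gradient w.r.t. the input, scaled by a budget eps. The resulting
# `delta` is the kind of artifact whose content the entry investigates.
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

model = resnet18(weights=None).eval()
x = torch.rand(1, 3, 224, 224, requires_grad=True)
loss = F.cross_entropy(model(x), torch.tensor([0]))
loss.backward()

eps = 8 / 255
delta = eps * x.grad.sign()          # the perturbation itself
x_adv = (x + delta).clamp(0, 1)      # perturbed input stays in [0, 1]
print(delta.abs().max().item())      # one could visualize `delta` directly
```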
- The world seems different in a social context: a neural network analysis of human experimental data [57.729312306803955]
We show that it is possible to replicate human behavioral data in both individual and social task settings by modifying the precision of prior and sensory signals.
An analysis of the neural activation traces of the trained networks provides evidence that information is coded in fundamentally different ways in the network in the individual and in the social conditions.
arXiv Detail & Related papers (2022-03-03T17:19:12Z)
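A one-function illustration of the precision manipulation described: in a Gaussian model, the posterior weights prior and sensory evidence by their precisions (inverse variances), so changing either precision shifts the estimate and hence behavior.

```python
# Precision-weighted combination of a Gaussian prior and observation:
# posterior mean = precision-weighted average of prior mean and evidence.
def posterior_mean(prior_mu, prior_prec, obs, obs_prec):
    return (prior_prec * prior_mu + obs_prec * obs) / (prior_prec + obs_prec)

# Higher sensory precision pulls the estimate toward the observation;
# higher prior precision pulls it back toward the prior.
print(posterior_mean(0.0, 1.0, 1.0, 4.0))  # 0.8
print(posterior_mean(0.0, 4.0, 1.0, 1.0))  # 0.2
```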
- Neural Networks with Recurrent Generative Feedback [61.90658210112138]
We instantiate this design of recurrent generative feedback on convolutional neural networks (CNNs).
In the experiments, CNN-F shows considerably improved adversarial robustness over conventional feedforward CNNs on standard benchmarks.
arXiv Detail & Related papers (2020-07-17T19:32:48Z)
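A schematic sketch of a feedforward/feedback loop in the spirit of CNN-F: a generative feedback pass reconstructs the input from the current prediction, and the refined input is re-encoded for a few iterations. The toy encoder and decoder are stand-ins, not the paper's architecture.

```python
# Toy iterative inference with generative feedback: encode, decode a
# reconstruction from the prediction, reconcile it with the input, repeat.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))          # x -> logits
decoder = nn.Sequential(nn.Linear(10, 784), nn.Unflatten(1, (1, 28, 28)))

x = torch.rand(4, 1, 28, 28)
for step in range(3):                     # a few feedback iterations
    logits = encoder(x)                   # feedforward pass
    x_hat = decoder(logits.softmax(-1))   # generative feedback pass
    x = 0.5 * (x + x_hat)                 # reconcile input with feedback
print(logits.shape)                       # torch.Size([4, 10])
```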