Using Human-like Mechanism to Weaken Effect of Pre-training Weight Bias
in Face-Recognition Convolutional Neural Network
- URL: http://arxiv.org/abs/2310.13674v1
- Date: Fri, 20 Oct 2023 17:22:57 GMT
- Title: Using Human-like Mechanism to Weaken Effect of Pre-training Weight Bias
in Face-Recognition Convolutional Neural Network
- Authors: Haojiang Ying, Yi-Fan Li, Yiyang Chen
- Abstract summary: We focus on 4 extensively studied CNNs (AlexNet, VGG11, VGG13, and VGG16) which have been analyzed as human-like models by neuroscientists.
We trained these CNNs on an emotion valence classification task via transfer learning.
We then updated the object-based AlexNet using a self-attention mechanism based on neuroscience and behavioral data.
- Score: 6.0950431324191845
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Convolutional neural network (CNN), as an important model in artificial
intelligence, has been widely used and studied in different disciplines. The
computational mechanisms of CNNs are still not fully revealed due to their
complex nature. In this study, we focused on 4 extensively studied CNNs
(AlexNet, VGG11, VGG13, and VGG16), which have been analyzed as human-like
models by neuroscientists with ample evidence. We trained these CNNs on an
emotion valence classification task via transfer learning. Comparing their
performance with human data, we found that these CNNs partly perform as humans
do. We then updated the object-based AlexNet using a self-attention mechanism
informed by neuroscience and behavioral data. The updated FE-AlexNet
outperformed all the other tested CNNs and closely resembled human perception.
The results further unveil the computational mechanisms of these CNNs.
Moreover, this study offers a new paradigm to better understand and improve CNN
performance via human data.
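As a rough illustration of the approach described in the abstract, the sketch below shows how a pretrained AlexNet might be transfer-learned to a valence classification task and augmented with a simple spatial self-attention block over its convolutional features. This is a minimal sketch, not the authors' implementation: the class names (FEAlexNetSketch, SpatialSelfAttention), the number of valence classes, and the placement of the attention block are assumptions, and the actual FE-AlexNet update is derived from neuroscience and behavioral data in ways not specified here. It assumes PyTorch and torchvision are available.

```python
# Minimal sketch (assumed, not the paper's code): transfer-learn a pretrained
# AlexNet to an emotion-valence classification task and add a simple
# self-attention block over its convolutional feature maps.
import torch
import torch.nn as nn
from torchvision import models


class SpatialSelfAttention(nn.Module):
    """Single-head self-attention over the spatial positions of a feature map."""

    def __init__(self, channels: int):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learned blend with the input

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)               # (b, hw, c//8)
        k = self.key(x).flatten(2)                                  # (b, c//8, hw)
        v = self.value(x).flatten(2)                                # (b, c, hw)
        attn = torch.softmax(q @ k / (k.shape[1] ** 0.5), dim=-1)   # (b, hw, hw)
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        return x + self.gamma * out


class FEAlexNetSketch(nn.Module):
    """Hypothetical FE-AlexNet-style model: frozen pretrained AlexNet features,
    a self-attention block, and a new valence classification head."""

    def __init__(self, num_valence_classes: int = 2):
        super().__init__()
        backbone = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)
        self.features = backbone.features            # pretrained conv layers
        for p in self.features.parameters():
            p.requires_grad = False                  # transfer learning: freeze backbone
        self.attention = SpatialSelfAttention(256)   # AlexNet's last conv stage has 256 channels
        self.avgpool = backbone.avgpool
        self.classifier = nn.Sequential(             # new task-specific head
            nn.Flatten(),
            nn.Linear(256 * 6 * 6, 512),
            nn.ReLU(inplace=True),
            nn.Linear(512, num_valence_classes),
        )

    def forward(self, x):
        x = self.features(x)
        x = self.attention(x)
        x = self.avgpool(x)
        return self.classifier(x)


model = FEAlexNetSketch(num_valence_classes=2)
logits = model(torch.randn(4, 3, 224, 224))  # e.g. a batch of 4 face images
print(logits.shape)                          # torch.Size([4, 2])
```

In this sketch only the attention block and the new classification head are trainable, which is one common way to fine-tune a pretrained backbone on a small behavioral dataset.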
Related papers
- Transferability of Convolutional Neural Networks in Stationary Learning
Tasks [96.00428692404354]
We introduce a novel framework for efficient training of convolutional neural networks (CNNs) for large-scale spatial problems.
We show that a CNN trained on small windows of such signals achieves nearly the same performance on much larger windows without retraining.
Our results show that the CNN is able to tackle problems with many hundreds of agents after being trained with fewer than ten.
arXiv Detail & Related papers (2023-07-21T13:51:45Z) - Improving the Accuracy and Robustness of CNNs Using a Deep CCA Neural
Data Regularizer [2.026424957803652]
As convolutional neural networks (CNNs) become more accurate at object recognition, their representations become more similar to the primate visual system.
Previous attempts to make CNN representations more brain-like showed very modest gains in accuracy, owing in part to limitations of the regularization method.
We develop a new neural data regularizer for CNNs that uses Deep Canonical Correlation Analysis (DCCA) to optimize the resemblance of the CNN's image representations to those of the monkey visual cortex.
arXiv Detail & Related papers (2022-09-06T15:40:39Z) - Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z) - Controlled-rearing studies of newborn chicks and deep neural networks [0.0]
Convolutional neural networks (CNNs) can achieve human-level performance on challenging object recognition tasks.
CNNs are thought to be "data hungry," requiring massive amounts of training data to develop accurate models for object recognition.
This critique challenges the promise of using CNNs as models of visual development.
arXiv Detail & Related papers (2021-12-12T00:45:07Z) - Overcoming the Domain Gap in Neural Action Representations [60.47807856873544]
3D pose data can now be reliably extracted from multi-view video sequences without manual intervention.
We propose to use it to guide the encoding of neural action representations together with a set of neural and behavioral augmentations.
To reduce the domain gap, during training, we swap neural and behavioral data across animals that seem to be performing similar actions.
arXiv Detail & Related papers (2021-12-02T12:45:46Z) - Assessing learned features of Deep Learning applied to EEG [0.0]
We use 3 different methods to extract EEG-relevant features from a CNN trained on raw EEG data.
We show that visualization of a CNN model can reveal interesting EEG results.
arXiv Detail & Related papers (2021-11-08T07:43:40Z) - Neural Architecture Dilation for Adversarial Robustness [56.18555072877193]
A shortcoming of convolutional neural networks is that they are vulnerable to adversarial attacks.
This paper aims to improve the adversarial robustness of the backbone CNNs that have a satisfactory accuracy.
With minimal computational overhead, the dilated architecture is expected to preserve the standard performance of the backbone CNN.
arXiv Detail & Related papers (2021-08-16T03:58:00Z) - BreakingBED -- Breaking Binary and Efficient Deep Neural Networks by
Adversarial Attacks [65.2021953284622]
We study robustness of CNNs against white-box and black-box adversarial attacks.
Results are shown for distilled CNNs, agent-based state-of-the-art pruned models, and binarized neural networks.
arXiv Detail & Related papers (2021-03-14T20:43:19Z) - The shape and simplicity biases of adversarially robust ImageNet-trained
CNNs [9.707679445925516]
We study the shape bias and internal mechanisms that enable the generalizability of AlexNet, GoogLeNet, and ResNet-50 models trained via adversarial training.
Remarkably, adversarial training induces three simplicity biases into hidden neurons in the process of "robustifying" CNNs.
arXiv Detail & Related papers (2020-06-16T16:38:16Z) - An Information-theoretic Visual Analysis Framework for Convolutional
Neural Networks [11.15523311079383]
We introduce a data model to organize the data that can be extracted from CNN models.
We then propose two ways to calculate entropy under different circumstances.
We develop a visual analysis system, CNNSlicer, to interactively explore the amount of information changes inside the model.
arXiv Detail & Related papers (2020-05-02T21:36:50Z) - Neural Additive Models: Interpretable Machine Learning with Neural Nets [77.66871378302774]
Deep neural networks (DNNs) are powerful black-box predictors that have achieved impressive performance on a wide variety of tasks.
We propose Neural Additive Models (NAMs) which combine some of the expressivity of DNNs with the inherent intelligibility of generalized additive models.
NAMs learn a linear combination of neural networks that each attend to a single input feature.
arXiv Detail & Related papers (2020-04-29T01:28:32Z)
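The last entry above describes Neural Additive Models as a linear combination of per-feature subnetworks. The sketch below is a minimal, assumed PyTorch illustration of that structure (not the NAM authors' implementation): each input feature is passed through its own small MLP, and the scalar contributions are summed with a bias to form the prediction, which keeps each feature's effect individually inspectable.

```python
# Minimal sketch of a NAM-style architecture: one small MLP per input feature,
# with the prediction formed as the sum of per-feature contributions plus a bias.
# Assumed illustration, not the authors' code.
import torch
import torch.nn as nn


class FeatureNet(nn.Module):
    """Small subnetwork mapping a single scalar feature to a scalar contribution."""

    def __init__(self, hidden: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(1, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x):          # x: (batch, 1)
        return self.net(x)


class NAMSketch(nn.Module):
    def __init__(self, num_features: int):
        super().__init__()
        self.feature_nets = nn.ModuleList([FeatureNet() for _ in range(num_features)])
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, x):          # x: (batch, num_features)
        contributions = [net(x[:, i : i + 1]) for i, net in enumerate(self.feature_nets)]
        return torch.cat(contributions, dim=1).sum(dim=1, keepdim=True) + self.bias


model = NAMSketch(num_features=5)
pred = model(torch.randn(8, 5))    # each feature's contribution can be plotted separately
print(pred.shape)                  # torch.Size([8, 1])
```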