A Distillation Learning Model of Adaptive Structural Deep Belief Network for AffectNet: Facial Expression Image Database
- URL: http://arxiv.org/abs/2110.12717v1
- Date: Mon, 25 Oct 2021 08:01:36 GMT
- Title: A Distillation Learning Model of Adaptive Structural Deep Belief Network for AffectNet: Facial Expression Image Database
- Authors: Takumi Ichimura, Shin Kamada
- Abstract summary: We have developed an adaptive structure learning method for the Deep Belief Network (DBN).
In this paper, our model is applied to a facial expression image data set, AffectNet.
The classification accuracy was improved from 78.4% to 91.3% by the proposed method.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Learning uses a hierarchical network architecture to represent the
complicated features of input patterns. We have developed an adaptive structure
learning method for the Deep Belief Network (DBN) that can discover the optimal
number of hidden neurons for given input data in each Restricted Boltzmann Machine
(RBM) by a neuron generation-annihilation algorithm, and can obtain an
appropriate number of hidden layers in the DBN. In this paper, our model is applied
to a facial expression image data set, AffectNet. The system has higher
classification capability than a traditional CNN. However, our model was not
able to classify some test cases correctly, because human emotions contain many
ambiguous features or patterns that lead two or more annotators, each with their
own subjective judgment of a facial image, to give different answers. To
represent such cases, this paper investigates a distillation learning model of
the Adaptive DBN. The originally trained model is regarded as a parent model, and
child models are trained for some of the mis-classified cases. The KL divergence
between the outputs of the parent model and each child model is monitored, and
new neurons are generated in the parent model according to the KL divergence to
improve classification accuracy. With the proposed method, the classification
accuracy was improved from 78.4% to 91.3%.
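To make the monitoring step concrete, here is a minimal sketch in Python of the KL-divergence check described above. The parent/child softmax outputs, the threshold value, and the flag-only behavior are assumptions for illustration; the paper's actual neuron generation-annihilation step operates inside the Adaptive RBM layers rather than merely flagging samples.

```python
# Minimal sketch of the KL-divergence monitoring described above, assuming
# hypothetical softmax outputs from a trained parent Adaptive DBN and a child
# model. The threshold and the flag-only behavior are illustrative; the paper's
# actual algorithm generates/annihilates neurons inside the Adaptive RBM layers.
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete class distributions."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

def monitor_parent_child(parent_probs, child_probs, kl_threshold=0.5):
    """Flag samples whose parent/child KL divergence exceeds the threshold.

    In the paper, such divergent cases drive the generation of new neurons
    in the parent model; here we only return the indices of those samples.
    """
    return [i for i, (p, q) in enumerate(zip(parent_probs, child_probs))
            if kl_divergence(p, q) > kl_threshold]

# Toy usage: 3 samples over 8 facial-expression classes (as in AffectNet).
rng = np.random.default_rng(0)
parent = rng.dirichlet(np.ones(8), size=3)
child = rng.dirichlet(np.ones(8), size=3)
print(monitor_parent_child(parent, child))
```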
Related papers
- Neural Lineage [56.34149480207817]
We introduce a novel task known as neural lineage detection, aiming at discovering lineage relationships between parent and child models.
For practical convenience, we introduce a learning-free approach, which integrates an approximation of the finetuning process into the neural network representation similarity metrics.
For the pursuit of accuracy, we introduce a learning-based lineage detector comprising encoders and a transformer detector.
arXiv Detail & Related papers (2024-06-17T01:11:53Z)
- BEND: Bagging Deep Learning Training Based on Efficient Neural Network Diffusion [56.9358325168226]
We propose a Bagging deep learning training algorithm based on Efficient Neural network Diffusion (BEND).
Our approach is simple but effective: multiple trained model weights and biases are first used as inputs to train an autoencoder and a latent diffusion model.
Our proposed BEND algorithm can consistently outperform the mean and median accuracies of both the original trained model and the diffused model.
arXiv Detail & Related papers (2024-03-23T08:40:38Z) - On the Steganographic Capacity of Selected Learning Models [1.0640226829362012]
We consider the question of the steganographic capacity of learning models.
For a wide range of models, we determine the number of low-order bits that can be overwritten.
Of the models tested, the steganographic capacity ranges from 7.04 KB for our LR experiments to 44.74 MB for InceptionV3 (a minimal bit-overwrite sketch appears after this list).
arXiv Detail & Related papers (2023-08-29T10:41:34Z)
- Steganographic Capacity of Deep Learning Models [12.974139332068491]
We consider the steganographic capacity of several learning models.
We train a Multilayer Perceptron (MLP), Convolutional Neural Network (CNN), and Transformer model on a challenging malware classification problem.
We find that the steganographic capacity of the learning models tested is surprisingly high, and that in each case, there is a clear threshold after which model performance rapidly degrades.
arXiv Detail & Related papers (2023-06-25T13:43:35Z)
- Learning to Jump: Thinning and Thickening Latent Counts for Generative Modeling [69.60713300418467]
Learning to jump is a general recipe for generative modeling of various types of data.
We demonstrate when learning to jump is expected to perform comparably to learning to denoise, and when it is expected to perform better.
arXiv Detail & Related papers (2023-05-28T05:38:28Z)
- Multiclass Semantic Segmentation to Identify Anatomical Sub-Regions of Brain and Measure Neuronal Health in Parkinson's Disease [2.288652563296735]
Currently, no machine learning model is available to analyze sub-anatomical regions of the brain in 2D histological images.
In this study, we trained our best-fit model on approximately one thousand annotated 2D brain images stained with Nissl/Haematoxylin and Tyrosine Hydroxylase enzyme (TH, an indicator of dopaminergic neuron viability).
The model is able to detect two sub-regions, compacta (SNCD) and reticulata (SNr), in all the images.
arXiv Detail & Related papers (2023-01-07T19:35:28Z)
- Do We Really Need a Learnable Classifier at the End of Deep Neural Network? [118.18554882199676]
We study the potential of training a neural network for classification with the classifier randomly initialized as an ETF and fixed during training (a construction sketch appears after this list).
Our experimental results show that our method achieves similar performance on image classification for balanced datasets.
arXiv Detail & Related papers (2022-03-17T04:34:28Z)
- Multi network InfoMax: A pre-training method involving graph convolutional networks [0.0]
This paper presents a pre-training method involving graph convolutional/neural networks (GCNs/GNNs).
The learned high-level graph latent representations help increase performance for downstream graph classification tasks.
We apply our method to a neuroimaging dataset for classifying subjects into healthy control (HC) and schizophrenia (SZ) groups.
arXiv Detail & Related papers (2021-11-01T21:53:20Z)
- An Embedded System for Image-based Crack Detection by using Fine-Tuning model of Adaptive Structural Learning of Deep Belief Network [0.0]
Adaptive structural learning methods for the Restricted Boltzmann Machine (Adaptive RBM) and the Deep Belief Network (Adaptive DBN) have been developed.
The proposed method was applied to the concrete image benchmark data set SDNET 2018 for crack detection.
In this paper, our developed Adaptive DBN was embedded in a tiny PC with a GPU for real-time inference on a drone.
arXiv Detail & Related papers (2021-10-25T07:28:50Z)
- The Causal Neural Connection: Expressiveness, Learnability, and Inference [125.57815987218756]
An object called a structural causal model (SCM) represents a collection of mechanisms and sources of random variation of the system under investigation.
In this paper, we show that the causal hierarchy theorem (Thm. 1, Bareinboim et al., 2020) still holds for neural models.
We introduce a special type of SCM called a neural causal model (NCM), and formalize a new type of inductive bias to encode structural constraints necessary for performing causal inferences.
arXiv Detail & Related papers (2021-07-02T01:55:18Z)
- How do Decisions Emerge across Layers in Neural Models? Interpretation with Differentiable Masking [70.92463223410225]
DiffMask learns to mask out subsets of the input while maintaining differentiability.
The decision to include or disregard an input token is made with a simple model based on intermediate hidden layers.
This lets us not only plot attribution heatmaps but also analyze how decisions are formed across network layers.
arXiv Detail & Related papers (2020-04-30T17:36:14Z)
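As referenced in the steganographic-capacity entries above, the following is a minimal sketch, assuming float32 weights, of the low-order-bit overwrite those papers measure. The function name, packing order, and payload are illustrative assumptions, not the authors' protocol.

```python
# Minimal sketch, assuming float32 weights, of the low-order-bit overwrite that
# the steganographic-capacity papers measure. Function name, packing order, and
# payload are illustrative assumptions, not the authors' protocol.
import numpy as np

def overwrite_low_order_bits(weights, payload_bits, n_bits):
    """Hide payload_bits in the n_bits lowest mantissa bits of each float32."""
    ints = weights.astype(np.float32).view(np.uint32).copy()
    mask = np.uint32((1 << n_bits) - 1)
    packed = np.zeros(ints.shape, dtype=np.uint32)
    bits = payload_bits[: ints.size * n_bits].reshape(ints.size, n_bits)
    for b in range(n_bits):
        packed |= (bits[:, b].astype(np.uint32) << np.uint32(b)).reshape(ints.shape)
    ints = (ints & ~mask) | packed  # clear low bits, then write the payload
    return ints.view(np.float32)

rng = np.random.default_rng(1)
w = rng.standard_normal(4).astype(np.float32)
payload = rng.integers(0, 2, size=4 * 8, dtype=np.uint32)
stego = overwrite_low_order_bits(w, payload, n_bits=8)
print(np.max(np.abs(w - stego)))  # perturbation is tiny for low-order bits
```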
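For the fixed-ETF classifier entry above, a simplex equiangular tight frame can be built as W = sqrt(K/(K-1)) * P * (I_K - (1/K) 1 1^T), with P a random partial orthogonal matrix. The sketch below implements that textbook construction; it is not the authors' exact training setup.

```python
# Minimal sketch of a simplex equiangular tight frame (ETF) used as a fixed,
# non-learnable classifier, per the "Do We Really Need a Learnable Classifier"
# entry above; a textbook construction, not the authors' exact training setup.
import numpy as np

def simplex_etf(d, K, seed=0):
    """Return a d x K simplex ETF matrix whose columns serve as class vectors."""
    assert d >= K, "feature dimension must be at least the number of classes"
    rng = np.random.default_rng(seed)
    # Random partial orthogonal matrix P (d x K) with P.T @ P = I_K.
    P, _ = np.linalg.qr(rng.standard_normal((d, K)))
    return np.sqrt(K / (K - 1)) * P @ (np.eye(K) - np.ones((K, K)) / K)

W = simplex_etf(d=16, K=8)
print(np.round(W.T @ W, 3))  # diagonal ~1, off-diagonal ~ -1/(K-1)
```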
This list is automatically generated from the titles and abstracts of the papers on this site.