A Distillation Learning Model of Adaptive Structural Deep Belief Network for AffectNet: Facial Expression Image Database
- URL: http://arxiv.org/abs/2110.12717v1
- Date: Mon, 25 Oct 2021 08:01:36 GMT
- Title: A Distillation Learning Model of Adaptive Structural Deep Belief Network for AffectNet: Facial Expression Image Database
- Authors: Takumi Ichimura, Shin Kamada
- Abstract summary: We have developed an adaptive structure learning method for the Deep Belief Network (DBN).
In this paper, our model is applied to a facial expression image data set, AffectNet.
The classification accuracy was improved from 78.4% to 91.3% by the proposed method.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Learning uses a hierarchical network architecture to represent the
complicated features of input patterns. We have developed an adaptive structure
learning method for the Deep Belief Network (DBN) that can discover the optimal
number of hidden neurons for given input data in each Restricted Boltzmann Machine
(RBM) by a neuron generation-annihilation algorithm, and can obtain an
appropriate number of hidden layers in the DBN. In this paper, our model is applied
to a facial expression image data set, AffectNet. The system has higher
classification capability than a traditional CNN. However, our model was not
able to classify some test cases correctly, because human emotions contain many
ambiguous features or patterns that lead two or more annotators, each with their
own subjective judgment of a facial image, to give different answers. To
represent such cases, this paper investigates a distillation learning model of
the Adaptive DBN. The originally trained model is regarded as a parent model, and
child models are trained for some of the mis-classified cases. The KL divergence
between the outputs of the parent model and each child model is monitored, and
new neurons are generated in the parent model according to the KL divergence to
improve classification accuracy. With the proposed method, the classification
accuracy was improved from 78.4% to 91.3%.
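To make the monitoring step concrete, here is a minimal sketch in Python of the KL-divergence check described above. The parent/child softmax outputs, the threshold value, and the flag-only behavior are assumptions for illustration; the paper's actual neuron generation-annihilation step operates inside the Adaptive RBM layers rather than merely flagging samples.

```python
# Minimal sketch of the KL-divergence monitoring described above, assuming
# hypothetical softmax outputs from a trained parent Adaptive DBN and a child
# model. The threshold and the flag-only behavior are illustrative; the paper's
# actual algorithm generates/annihilates neurons inside the Adaptive RBM layers.
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete class distributions."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

def monitor_parent_child(parent_probs, child_probs, kl_threshold=0.5):
    """Flag samples whose parent/child KL divergence exceeds the threshold.

    In the paper, such divergent cases drive the generation of new neurons
    in the parent model; here we only return the indices of those samples.
    """
    return [i for i, (p, q) in enumerate(zip(parent_probs, child_probs))
            if kl_divergence(p, q) > kl_threshold]

# Toy usage: 3 samples over 8 facial-expression classes (as in AffectNet).
rng = np.random.default_rng(0)
parent = rng.dirichlet(np.ones(8), size=3)
child = rng.dirichlet(np.ones(8), size=3)
print(monitor_parent_child(parent, child))
```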
Related papers
- Neural Lineage [56.34149480207817]
We introduce a novel task known as neural lineage detection, aiming at discovering lineage relationships between parent and child models.
For practical convenience, we introduce a learning-free approach, which integrates an approximation of the finetuning process into the neural network representation similarity metrics.
For the pursuit of accuracy, we introduce a learning-based lineage detector comprising encoders and a transformer detector.
arXiv Detail & Related papers (2024-06-17T01:11:53Z)
- BEND: Bagging Deep Learning Training Based on Efficient Neural Network Diffusion [56.9358325168226]
We propose a Bagging deep learning training algorithm based on Efficient Neural network Diffusion (BEND).
Our approach is simple but effective: multiple trained model weights and biases are first used as inputs to train an autoencoder and a latent diffusion model.
Our proposed BEND algorithm can consistently outperform the mean and median accuracies of both the original trained model and the diffused model.
arXiv Detail & Related papers (2024-03-23T08:40:38Z) - On the Steganographic Capacity of Selected Learning Models [1.0640226829362012]
We consider the question of the steganographic capacity of learning models.
For a wide range of models, we determine the number of low-order bits that can be overwritten.
Of the models tested, the steganographic capacity ranges from 7.04 KB for our LR experiments to 44.74 MB for InceptionV3 (a minimal bit-overwrite sketch appears after this list).
arXiv Detail & Related papers (2023-08-29T10:41:34Z)
- Steganographic Capacity of Deep Learning Models [12.974139332068491]
We consider the steganographic capacity of several learning models.
We train a Multilayer Perceptron (MLP), Convolutional Neural Network (CNN), and Transformer model on a challenging malware classification problem.
We find that the steganographic capacity of the learning models tested is surprisingly high, and that in each case, there is a clear threshold after which model performance rapidly degrades.
arXiv Detail & Related papers (2023-06-25T13:43:35Z)
- Learning to Jump: Thinning and Thickening Latent Counts for Generative Modeling [69.60713300418467]
Learning to jump is a general recipe for generative modeling of various types of data.
We demonstrate when learning to jump is expected to perform comparably to learning to denoise, and when it is expected to perform better.
arXiv Detail & Related papers (2023-05-28T05:38:28Z)
- Multiclass Semantic Segmentation to Identify Anatomical Sub-Regions of Brain and Measure Neuronal Health in Parkinson's Disease [2.288652563296735]
Currently, no machine learning model is available to analyze sub-anatomical regions of the brain in 2D histological images.
In this study, we trained our best-fit model on approximately one thousand annotated 2D brain images stained with Nissl/Haematoxylin and Tyrosine Hydroxylase enzyme (TH, an indicator of dopaminergic neuron viability).
The model is able to detect two sub-regions, compacta (SNCD) and reticulata (SNr), in all the images.
arXiv Detail & Related papers (2023-01-07T19:35:28Z)
- Do We Really Need a Learnable Classifier at the End of Deep Neural Network? [118.18554882199676]
We study the potential of training a neural network for classification with the classifier randomly initialized as an ETF and fixed during training (a construction sketch appears after this list).
Our experimental results show that our method achieves similar performance on image classification for balanced datasets.
arXiv Detail & Related papers (2022-03-17T04:34:28Z)
- Multi network InfoMax: A pre-training method involving graph convolutional networks [0.0]
This paper presents a pre-training method involving graph convolutional/neural networks (GCNs/GNNs).
The learned high-level graph latent representations help increase performance for downstream graph classification tasks.
We apply our method to a neuroimaging dataset for classifying subjects into healthy control (HC) and schizophrenia (SZ) groups.
arXiv Detail & Related papers (2021-11-01T21:53:20Z)
- An Embedded System for Image-based Crack Detection by using Fine-Tuning model of Adaptive Structural Learning of Deep Belief Network [0.0]
Adaptive structural learning methods for the Restricted Boltzmann Machine (Adaptive RBM) and the Deep Belief Network (Adaptive DBN) have been developed.
The proposed method was applied to the concrete image benchmark data set SDNET 2018 for crack detection.
In this paper, our developed Adaptive DBN was embedded in a tiny PC with a GPU for real-time inference on a drone.
arXiv Detail & Related papers (2021-10-25T07:28:50Z)
- The Causal Neural Connection: Expressiveness, Learnability, and Inference [125.57815987218756]
An object called a structural causal model (SCM) represents a collection of mechanisms and sources of random variation of the system under investigation.
In this paper, we show that the causal hierarchy theorem (Thm. 1, Bareinboim et al., 2020) still holds for neural models.
We introduce a special type of SCM called a neural causal model (NCM), and formalize a new type of inductive bias to encode structural constraints necessary for performing causal inferences.
arXiv Detail & Related papers (2021-07-02T01:55:18Z)
- How do Decisions Emerge across Layers in Neural Models? Interpretation with Differentiable Masking [70.92463223410225]
DiffMask learns to mask out subsets of the input while maintaining differentiability.
The decision to include or disregard an input token is made with a simple model based on intermediate hidden layers.
This lets us not only plot attribution heatmaps but also analyze how decisions are formed across network layers.
arXiv Detail & Related papers (2020-04-30T17:36:14Z)
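As referenced in the steganographic-capacity entries above, the following is a minimal sketch, assuming float32 weights, of the low-order-bit overwrite those papers measure. The function name, packing order, and payload are illustrative assumptions, not the authors' protocol.

```python
# Minimal sketch, assuming float32 weights, of the low-order-bit overwrite that
# the steganographic-capacity papers measure. Function name, packing order, and
# payload are illustrative assumptions, not the authors' protocol.
import numpy as np

def overwrite_low_order_bits(weights, payload_bits, n_bits):
    """Hide payload_bits in the n_bits lowest mantissa bits of each float32."""
    ints = weights.astype(np.float32).view(np.uint32).copy()
    mask = np.uint32((1 << n_bits) - 1)
    packed = np.zeros(ints.shape, dtype=np.uint32)
    bits = payload_bits[: ints.size * n_bits].reshape(ints.size, n_bits)
    for b in range(n_bits):
        packed |= (bits[:, b].astype(np.uint32) << np.uint32(b)).reshape(ints.shape)
    ints = (ints & ~mask) | packed  # clear low bits, then write the payload
    return ints.view(np.float32)

rng = np.random.default_rng(1)
w = rng.standard_normal(4).astype(np.float32)
payload = rng.integers(0, 2, size=4 * 8, dtype=np.uint32)
stego = overwrite_low_order_bits(w, payload, n_bits=8)
print(np.max(np.abs(w - stego)))  # perturbation is tiny for low-order bits
```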
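For the fixed-ETF classifier entry above, a simplex equiangular tight frame can be built as W = sqrt(K/(K-1)) * P * (I_K - (1/K) 1 1^T), with P a random partial orthogonal matrix. The sketch below implements that textbook construction; it is not the authors' exact training setup.

```python
# Minimal sketch of a simplex equiangular tight frame (ETF) used as a fixed,
# non-learnable classifier, per the "Do We Really Need a Learnable Classifier"
# entry above; a textbook construction, not the authors' exact training setup.
import numpy as np

def simplex_etf(d, K, seed=0):
    """Return a d x K simplex ETF matrix whose columns serve as class vectors."""
    assert d >= K, "feature dimension must be at least the number of classes"
    rng = np.random.default_rng(seed)
    # Random partial orthogonal matrix P (d x K) with P.T @ P = I_K.
    P, _ = np.linalg.qr(rng.standard_normal((d, K)))
    return np.sqrt(K / (K - 1)) * P @ (np.eye(K) - np.ones((K, K)) / K)

W = simplex_etf(d=16, K=8)
print(np.round(W.T @ W, 3))  # diagonal ~1, off-diagonal ~ -1/(K-1)
```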
This list is automatically generated from the titles and abstracts of the papers on this site.