MENTOR: Human Perception-Guided Pretraining for Increased Generalization
- URL: http://arxiv.org/abs/2310.19545v2
- Date: Mon, 12 Feb 2024 17:04:46 GMT
- Title: MENTOR: Human Perception-Guided Pretraining for Increased Generalization
- Authors: Colton R. Crum, Adam Czajka
- Abstract summary: We introduce MENTOR (huMan pErceptioN-guided preTraining fOr increased geneRalization)
We train an autoencoder to learn human saliency maps given an input image, without class labels.
We remove the decoder part, add a classification layer on top of the encoder, and fine-tune this new model conventionally.
- Score: 5.596752018167751
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Incorporating human perception into training of convolutional neural networks
(CNN) has boosted generalization capabilities of such models in open-set
recognition tasks. One of the active research questions is where (in the model
architecture) and how to efficiently incorporate always-limited human
perceptual data into training strategies of models. In this paper, we introduce
MENTOR (huMan pErceptioN-guided preTraining fOr increased geneRalization),
which addresses this question through two unique rounds of training the CNNs
tasked with open-set anomaly detection. First, we train an autoencoder to learn
human saliency maps given an input image, without class labels. The autoencoder
is thus tasked with discovering domain-specific salient features which mimic
human perception. Second, we remove the decoder part, add a classification
layer on top of the encoder, and fine-tune this new model conventionally. We
show that MENTOR's benefits are twofold: (a) significant accuracy boost in
anomaly detection tasks (in this paper demonstrated for detection of unknown
iris presentation attacks, synthetically-generated faces, and anomalies in
chest X-ray images), compared to models utilizing conventional transfer
learning (e.g., sourcing the weights from ImageNet-pretrained models) as well
as to models trained with the state-of-the-art approach incorporating human
perception guidance into loss functions, and (b) an increase in the efficiency
of model training, requiring fewer epochs to converge compared to
state-of-the-art training methods.
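The two-round procedure described in the abstract can be sketched in PyTorch. This is a minimal illustration, not the paper's implementation: the encoder/decoder shapes, optimizer settings, and the two-class head are all assumptions.

```python
# Hypothetical sketch of MENTOR's two-stage training; architecture details are assumed.
import torch
import torch.nn as nn

class SaliencyAutoencoder(nn.Module):
    """Stage 1: predict human saliency maps from input images (no class labels)."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

# Stage 1: regress human saliency maps (stand-in random tensors used here).
ae = SaliencyAutoencoder()
opt = torch.optim.Adam(ae.parameters(), lr=1e-3)
images = torch.rand(4, 3, 64, 64)     # batch of input images
saliency = torch.rand(4, 1, 64, 64)   # human-annotated saliency maps
loss = nn.functional.mse_loss(ae(images), saliency)
opt.zero_grad(); loss.backward(); opt.step()

# Stage 2: drop the decoder, add a classification head, fine-tune conventionally.
classifier = nn.Sequential(
    ae.encoder,                        # saliency-pretrained weights
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 2),                  # e.g., bona fide vs. anomaly
)
labels = torch.randint(0, 2, (4,))
logits = classifier(images)
cls_loss = nn.functional.cross_entropy(logits, labels)
```

The key idea is that the encoder weights carried into stage 2 already encode where humans look, so the classifier starts from perception-aligned features rather than from generic ImageNet statistics.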
Related papers
- Training Better Deep Learning Models Using Human Saliency [11.295653130022156]
This work explores how human judgement about salient regions of an image can be introduced into deep convolutional neural network (DCNN) training.
We propose a new component of the loss function that ConveYs Brain Oversight to Raise Generalization (CYBORG) and penalizes the model for using non-salient regions.
arXiv Detail & Related papers (2024-10-21T16:52:44Z)
- Defect Classification in Additive Manufacturing Using CNN-Based Vision Processing [76.72662577101988]
This paper examines two scenarios: first, using convolutional neural networks (CNNs) to accurately classify defects in an image dataset from AM and second, applying active learning techniques to the developed classification model.
This enables a human-in-the-loop mechanism that reduces the amount of data required to train the model and to generate training data.
arXiv Detail & Related papers (2023-07-14T14:36:58Z)
- TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization [89.54947228958494]
This paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks.
We propose a novel statistics-based approach, the Two-WIng NormliSation (TWINS) fine-tuning framework.
TWINS is shown to be effective on a wide range of image classification datasets in terms of both generalization and robustness.
arXiv Detail & Related papers (2023-03-20T14:12:55Z)
- EfficientTrain: Exploring Generalized Curriculum Learning for Training Visual Backbones [80.662250618795]
This paper presents a new curriculum learning approach for the efficient training of visual backbones (e.g., vision Transformers).
As an off-the-shelf method, it reduces the wall-time training cost of a wide variety of popular models by >1.5x on ImageNet-1K/22K without sacrificing accuracy.
arXiv Detail & Related papers (2022-11-17T17:38:55Z)
- Reconciliation of Pre-trained Models and Prototypical Neural Networks in Few-shot Named Entity Recognition [35.34238362639678]
We propose a one-line-code normalization method to reconcile such a mismatch with empirical and theoretical grounds.
Our work also provides an analytical viewpoint for addressing the general problems in few-shot named entity recognition.
arXiv Detail & Related papers (2022-11-07T02:33:45Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- With Greater Distance Comes Worse Performance: On the Perspective of Layer Utilization and Model Generalization [3.6321778403619285]
Generalization of deep neural networks remains one of the main open problems in machine learning.
Early layers generally learn representations relevant to performance on both training data and testing data.
Deeper layers only minimize training risks and fail to generalize well with testing or mislabeled data.
arXiv Detail & Related papers (2022-01-28T05:26:32Z)
- CYBORG: Blending Human Saliency Into the Loss Improves Deep Learning [5.092711491848192]
This paper proposes a first-ever training strategy to ConveY Brain Oversight to Raise Generalization (CYBORG).
The new training approach incorporates human-annotated saliency maps into a CYBORG loss function that guides the model towards learning features from image regions that humans find salient when solving a given visual task.
Results on the task of synthetic face detection show that the CYBORG loss leads to significant improvement in performance on unseen samples consisting of face images generated from six Generative Adversarial Networks (GANs) across multiple classification network architectures.
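A loss of this general shape, blending a classification term with a penalty for diverging from human saliency, can be sketched as follows. The function name, the MSE penalty, and the `alpha` weighting are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch of a CYBORG-style loss; the exact penalty and weighting are assumed.
import torch
import torch.nn.functional as F

def cyborg_style_loss(logits, labels, model_saliency, human_saliency, alpha=0.5):
    """Blend cross-entropy with a penalty for ignoring human-salient regions.

    model_saliency: model-derived attention map (e.g., a class activation map), shape (B, H, W)
    human_saliency: human-annotated saliency map, same shape
    """
    ce = F.cross_entropy(logits, labels)
    # Normalize both maps to sum to 1 so the penalty compares spatial distributions.
    m = model_saliency.flatten(1)
    h = human_saliency.flatten(1)
    m = m / (m.sum(dim=1, keepdim=True) + 1e-8)
    h = h / (h.sum(dim=1, keepdim=True) + 1e-8)
    saliency_penalty = F.mse_loss(m, h)
    return alpha * ce + (1 - alpha) * saliency_penalty

# Stand-in tensors to exercise the loss.
logits = torch.randn(4, 2)
labels = torch.randint(0, 2, (4,))
model_map = torch.rand(4, 7, 7)
human_map = torch.rand(4, 7, 7)
loss = cyborg_style_loss(logits, labels, model_map, human_map)
```

Contrast this with MENTOR above: CYBORG injects human saliency into the loss during supervised training, while MENTOR uses it as a label-free pretraining target.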
arXiv Detail & Related papers (2021-12-01T18:04:15Z)
- Adversarially-Trained Deep Nets Transfer Better: Illustration on Image Classification [53.735029033681435]
Transfer learning is a powerful methodology for adapting pre-trained deep neural networks on image recognition tasks to new domains.
In this work, we demonstrate that adversarially-trained models transfer better than non-adversarially-trained models.
arXiv Detail & Related papers (2020-07-11T22:48:42Z)
- Feature Purification: How Adversarial Training Performs Robust Deep Learning [66.05472746340142]
We present a principle we call Feature Purification: one cause of the existence of adversarial examples is the accumulation of certain small dense mixtures in the hidden weights during the training of a neural network.
We support this principle with experiments on the CIFAR-10 dataset, and with a theoretical result proving that, for certain natural classification tasks, training a two-layer neural network with ReLU activation using randomly initialized gradient descent indeed satisfies this principle.
arXiv Detail & Related papers (2020-05-20T16:56:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.