Related papers: Training Better Deep Learning Models Using Human Saliency

URL: http://arxiv.org/abs/2410.16190v1
Date: Mon, 21 Oct 2024 16:52:44 GMT
Title: Training Better Deep Learning Models Using Human Saliency
Authors: Aidan Boyd, Patrick Tinsley, Kevin W. Bowyer, Adam Czajka
Abstract summary: This work explores how human judgement about salient regions of an image can be introduced into deep convolutional neural network (DCNN) training. We propose a new component of the loss function that ConveYs Brain Oversight to Raise Generalization (CYBORG) and penalizes the model for using non-salient regions.
Score: 11.295653130022156
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This work explores how human judgement about salient regions of an image can be introduced into deep convolutional neural network (DCNN) training. Traditionally, training of DCNNs is purely data-driven. This often results in learning features of the data that are only coincidentally correlated with class labels. Human saliency can guide network training using our proposed new component of the loss function that ConveYs Brain Oversight to Raise Generalization (CYBORG) and penalizes the model for using non-salient regions. This mechanism produces DCNNs achieving higher accuracy and generalization compared to using the same training data without human salience. Experimental results demonstrate that CYBORG applies across multiple network architectures and problem domains (detection of synthetic faces, iris presentation attacks and anomalies in chest X-rays), while requiring significantly less data than training without human saliency guidance. Visualizations show that CYBORG-trained models' saliency is more consistent across independent training runs than traditionally-trained models, and also in better agreement with humans. To lower the cost of collecting human annotations, we also explore using deep learning to provide automated annotations. CYBORG training of CNNs addresses important issues such as reducing the appetite for large training sets, increasing interpretability, and reducing fragility by generalizing better to new types of data.
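The abstract describes a loss component that penalizes the model for relying on non-salient regions. As a rough illustration only, the sketch below combines a classification term with a term measuring disagreement between a model saliency map and a human saliency map. The mean squared difference, the min-max normalization, the weighting parameter alpha and all function names are assumptions made for this example; they are not taken from the abstract and should not be read as the authors' formulation.

```python
import math

def cross_entropy(probs, label):
    """Negative log-likelihood of the true class given class probabilities."""
    return -math.log(max(probs[label], 1e-12))

def normalize(grid):
    """Min-max scale a 2D map to [0, 1]; a constant map becomes all zeros."""
    flat = [v for row in grid for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0 for _ in row] for row in grid]
    return [[(v - lo) / (hi - lo) for v in row] for row in grid]

def saliency_mismatch(model_map, human_map):
    """Mean squared difference between the two normalized maps (assumed metric)."""
    m, h = normalize(model_map), normalize(human_map)
    diffs = [(a - b) ** 2 for rm, rh in zip(m, h) for a, b in zip(rm, rh)]
    return sum(diffs) / len(diffs)

def guided_loss(probs, label, model_map, human_map, alpha=0.5):
    """Weighted sum of the classification term and the saliency term.

    alpha is a hypothetical trade-off weight chosen for this sketch.
    """
    return (alpha * cross_entropy(probs, label)
            + (1 - alpha) * saliency_mismatch(model_map, human_map))

# Toy 2x2 maps: the human annotator marked only the top-right cell as salient.
human = [[0.0, 1.0], [0.0, 0.0]]
aligned = [[0.1, 0.9], [0.1, 0.1]]     # model attends to the same cell
misaligned = [[0.9, 0.1], [0.1, 0.1]]  # model attends to a different cell
probs = [0.2, 0.8]                     # same class prediction in both cases

print(guided_loss(probs, 1, aligned, human))
print(guided_loss(probs, 1, misaligned, human))
```

With identical class predictions, the misaligned map yields the larger loss, so a model trained this way is pushed toward the regions humans marked as salient.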

Related papers

Bayesian Topological Convolutional Neural Nets [0.5985483103102681]
Convolutional neural networks (CNNs) have been established as the main workhorse in image data processing. We propose a new Bayesian topological CNN that promotes a novel interplay between topology-aware learning and Bayesian sampling. We evaluate the model on benchmark image classification datasets and demonstrate its superiority over conventional CNNs, Bayesian neural networks (BNNs), and topological CNNs.
arXiv Detail & Related papers (2025-10-13T17:57:43Z)
Stealing Training Graphs from Graph Neural Networks [54.52392250297907]
Graph Neural Networks (GNNs) have shown promising results in modeling graphs in various tasks. As neural networks can memorize the training samples, the model parameters of GNNs have a high risk of leaking private training data. We investigate a novel problem of stealing graphs from trained GNNs.
arXiv Detail & Related papers (2024-11-17T23:15:36Z)
MENTOR: Human Perception-Guided Pretraining for Increased Generalization [5.596752018167751]
We introduce MENTOR (huMan pErceptioN-guided preTraining fOr increased geneRalization). We train an autoencoder to learn human saliency maps given an input image, without class labels. We remove the decoder part, add a classification layer on top of the encoder, and fine-tune this new model conventionally.
arXiv Detail & Related papers (2023-10-30T13:50:44Z)
Relearning Forgotten Knowledge: on Forgetting, Overfit and Training-Free Ensembles of DNNs [9.010643838773477]
We introduce a novel score for quantifying overfit, which monitors the forgetting rate of deep models on validation data. We show that overfit can occur with and without a decrease in validation accuracy, and may be more common than previously appreciated. We use our observations to construct a new ensemble method, based solely on the training history of a single network, which provides significant improvement without any additional cost in training time.
arXiv Detail & Related papers (2023-10-17T09:22:22Z)
Label Deconvolution for Node Representation Learning on Large-scale Attributed Graphs against Learning Bias [75.44877675117749]
We propose an efficient label regularization technique, namely Label Deconvolution (LD), to alleviate the learning bias by a novel and highly scalable approximation to the inverse mapping of GNNs. Experiments demonstrate that LD significantly outperforms state-of-the-art methods on Open Graph Benchmark datasets.
arXiv Detail & Related papers (2023-09-26T13:09:43Z)
Learn, Unlearn and Relearn: An Online Learning Paradigm for Deep Neural Networks [12.525959293825318]
We introduce Learn, Unlearn, and Relearn (LURE), an online learning paradigm for deep neural networks (DNNs). LURE alternates between the unlearning phase, which selectively forgets the undesirable information in the model, and the relearning phase, which emphasizes learning on generalizable features. We show that our training paradigm provides consistent performance gains across datasets in both classification and few-shot settings.
arXiv Detail & Related papers (2023-03-18T16:45:54Z)
DCLP: Neural Architecture Predictor with Curriculum Contrastive Learning [5.2319020651074215]
We propose a Curriculum-guided Contrastive Learning framework for neural Predictor (DCLP). Our method simplifies the contrastive task by designing a novel curriculum to enhance the stability of unlabeled training data distribution. We experimentally demonstrate that DCLP has high accuracy and efficiency compared with existing predictors.
arXiv Detail & Related papers (2023-02-25T08:16:21Z)
Adversarial training with informed data selection [53.19381941131439]
Adversarial training is the most efficient solution to defend the network against these malicious attacks. This work proposes a data selection strategy to be applied in the mini-batch training. The simulation results show that a good compromise can be obtained regarding robustness and standard accuracy.
arXiv Detail & Related papers (2023-01-07T12:09:50Z)
Neural networks trained with SGD learn distributions of increasing complexity [78.30235086565388]
We show that neural networks trained using gradient descent initially classify their inputs using lower-order input statistics. They exploit higher-order statistics only later during training. We discuss the relation of this distributional simplicity bias (DSB) to other simplicity biases and consider its implications for the principle of universality in learning.
arXiv Detail & Related papers (2022-11-21T15:27:22Z)
On-Device Domain Generalization [93.79736882489982]
Domain generalization is critical to on-device machine learning applications. We find that knowledge distillation is a strong candidate for solving the problem. We propose a simple idea called out-of-distribution knowledge distillation (OKD), which aims to teach the student how the teacher handles (synthetic) out-of-distribution data.
arXiv Detail & Related papers (2022-09-15T17:59:31Z)
Efficient Augmentation for Imbalanced Deep Learning [8.38844520504124]
We study a convolutional neural network's internal representation of imbalanced image data. We measure the generalization gap between a model's feature embeddings in the training and test sets, showing that the gap is wider for minority classes. This insight enables us to design an efficient three-phase CNN training framework for imbalanced data.
arXiv Detail & Related papers (2022-07-13T09:43:17Z)
One Representative-Shot Learning Using a Population-Driven Template with Application to Brain Connectivity Classification and Evolution Prediction [0.0]
Graph neural networks (GNNs) have been introduced to the field of network neuroscience. We take a very different approach in training GNNs, where we aim to learn with one sample and achieve the best performance. We present the first one-shot paradigm where a GNN is trained on a single population-driven template.
arXiv Detail & Related papers (2021-10-06T08:36:00Z)
S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-bit Neural Networks via Guided Distribution Calibration [74.5509794733707]
We present a novel guided learning paradigm to distill binary networks from real-valued models on the final prediction distribution. Our proposed method can boost the simple contrastive learning baseline by an absolute gain of 5.515% on BNNs. Our method achieves substantial improvement over the simple contrastive learning baseline, and is even comparable to many mainstream supervised BNN methods.
arXiv Detail & Related papers (2021-02-17T18:59:28Z)

This list is automatically generated from the titles and abstracts of the papers on this site.