Domain Generalization Guided by Gradient Signal to Noise Ratio of Parameters
- URL: http://arxiv.org/abs/2310.07361v1
- Date: Wed, 11 Oct 2023 10:21:34 GMT
- Title: Domain Generalization Guided by Gradient Signal to Noise Ratio of Parameters
- Authors: Mateusz Michalkiewicz, Masoud Faraki, Xiang Yu, Manmohan Chandraker,
Mahsa Baktashmotlagh
- Abstract summary: Overfitting to the source domain is a common issue in gradient-based training of deep neural networks.
We propose to base dropout selection on the gradient signal-to-noise ratio (GSNR) of the network's parameters.
- Score: 69.24377241408851
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Overfitting to the source domain is a common issue in gradient-based training
of deep neural networks. To compensate for over-parameterized models,
numerous regularization techniques have been introduced, such as those based on
dropout. While these methods achieve significant improvements on classical
benchmarks such as ImageNet, their performance diminishes with the introduction
of domain shift in the test set, i.e., when the unseen data comes from a
significantly different distribution. In this paper, we move away from the
classical approach of Bernoulli-sampled dropout mask construction and propose
to base the selection on the gradient signal-to-noise ratio (GSNR) of the
network's parameters. Specifically, at each training step, parameters with high
GSNR will be discarded. Furthermore, we alleviate the burden of manually
searching for the optimal dropout ratio by leveraging a meta-learning approach.
We evaluate our method on standard domain generalization benchmarks and achieve
competitive results on classification and face anti-spoofing problems.
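For intuition, here is a minimal sketch of the GSNR-guided masking idea in PyTorch. This is not the authors' implementation: the meta-learned dropout ratio is replaced by a fixed `drop_ratio`, per-sample gradients are collected with a naive loop, and GSNR is computed per parameter as the squared mean of per-sample gradients divided by their variance. Following the abstract, the mask removes the highest-GSNR parameters at each step rather than a random Bernoulli selection.

```python
import torch

def gsnr_mask(per_sample_grads: torch.Tensor, drop_ratio: float) -> torch.Tensor:
    """per_sample_grads: (N, P) per-example gradients for P parameters.
    Returns a {0,1} mask zeroing the drop_ratio fraction with highest GSNR."""
    mean = per_sample_grads.mean(dim=0)
    var = per_sample_grads.var(dim=0) + 1e-12       # avoid division by zero
    gsnr = mean.pow(2) / var                        # GSNR_j = E[g_j]^2 / Var[g_j]
    k = int(drop_ratio * gsnr.numel())
    mask = torch.ones_like(gsnr)
    if k > 0:
        _, idx = gsnr.topk(k)                       # highest-GSNR parameters ...
        mask[idx] = 0.0                             # ... are dropped this step
    return mask

# Toy usage: linear model, squared loss, per-sample grads via a simple loop.
w = torch.randn(10, requires_grad=True)
X, y = torch.randn(32, 10), torch.randn(32)
grads = []
for i in range(X.shape[0]):
    loss = (X[i] @ w - y[i]) ** 2
    grads.append(torch.autograd.grad(loss, w)[0])
per_sample = torch.stack(grads)                     # (32, 10)
masked_grad = per_sample.mean(0) * gsnr_mask(per_sample, drop_ratio=0.25)
with torch.no_grad():
    w -= 0.1 * masked_grad                          # SGD step on surviving params
```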
Related papers
- Adaptive Anomaly Detection in Network Flows with Low-Rank Tensor Decompositions and Deep Unrolling [9.20186865054847]
Anomaly detection (AD) is increasingly recognized as a key component for ensuring the resilience of future communication systems.
This work considers AD in network flows using incomplete measurements.
We propose a novel block-successive convex approximation algorithm based on a regularized model-fitting objective.
Inspired by Bayesian approaches, we extend the model architecture to perform online adaptation to per-flow and per-time-step statistics.
arXiv Detail & Related papers (2024-09-17T19:59:57Z)
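As a rough, purely classical illustration of the low-rank modelling idea behind the entry above (not the paper's block-successive convex approximation algorithm), the sketch below flags anomalies as large residuals from a truncated-SVD reconstruction of a synthetic flows-by-time matrix:

```python
import numpy as np

# Minimal stand-in for the low-rank idea: normal traffic in a flows x time
# matrix is approximately low-rank; anomalies show up as large residuals.
rng = np.random.default_rng(0)
normal = rng.standard_normal((50, 2)) @ rng.standard_normal((2, 200))  # rank-2
Y = normal.copy()
Y[7, 120] += 15.0                                    # inject one anomaly

U, s, Vt = np.linalg.svd(Y, full_matrices=False)
r = 2                                                # assumed rank of normal traffic
low_rank = (U[:, :r] * s[:r]) @ Vt[:r]               # truncated-SVD reconstruction
residual = np.abs(Y - low_rank)
flow, t = np.unravel_index(residual.argmax(), residual.shape)
print(f"largest residual at flow {flow}, time {t}")  # expected: flow 7, time 120
```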
- An Adaptive Cost-Sensitive Learning and Recursive Denoising Framework for Imbalanced SVM Classification [12.986535715303331]
Category imbalance is one of the most common and important issues in classification.
Emotion classification models trained on imbalanced datasets easily produce unreliable predictions.
arXiv Detail & Related papers (2024-03-13T09:43:14Z)
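A minimal illustration of plain cost-sensitive SVM learning for the entry above, using scikit-learn's static class_weight option; the paper's adaptive cost assignment and recursive denoising are not reproduced here:

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# 95:5 imbalanced binary problem
X, y = make_classification(n_samples=2000, weights=[0.95], random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, stratify=y, random_state=0)

plain = SVC().fit(Xtr, ytr)
# class_weight="balanced" raises the misclassification cost of the rare class
# in inverse proportion to its frequency (static costs; the paper adapts them).
weighted = SVC(class_weight="balanced").fit(Xtr, ytr)

print("plain    F1:", f1_score(yte, plain.predict(Xte)))
print("weighted F1:", f1_score(yte, weighted.predict(Xte)))
```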
- Activate and Reject: Towards Safe Domain Generalization under Category Shift [71.95548187205736]
We study the practical problem of Domain Generalization under Category Shift (DGCS).
It aims to simultaneously detect unknown-class samples and classify known-class samples in the target domains.
Compared to prior DG works, we face two new challenges: 1) how to learn the concept of "unknown" during training with only source known-class samples, and 2) how to adapt the source-trained model to unseen environments.
arXiv Detail & Related papers (2023-10-07T07:53:12Z)
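The generic detect-then-classify decision rule behind DGCS can be sketched with simple confidence thresholding; this is a stand-in illustration, not the paper's Activate and Reject method, and the threshold `tau` is a hypothetical hyperparameter:

```python
import torch
import torch.nn.functional as F

def classify_or_reject(logits: torch.Tensor, tau: float = 0.7):
    """Predict a known class when the max softmax probability clears tau,
    otherwise flag the sample as 'unknown' (encoded as -1)."""
    probs = F.softmax(logits, dim=-1)
    conf, pred = probs.max(dim=-1)
    pred = pred.clone()
    pred[conf < tau] = -1
    return pred, conf

logits = torch.tensor([[4.0, 0.1, 0.2],    # confident known-class sample
                       [0.6, 0.5, 0.4]])   # ambiguous -> rejected as unknown
print(classify_or_reject(logits))
```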
- Scaling Forward Gradient With Local Losses [117.22685584919756]
Forward learning is a biologically plausible alternative to backprop for training deep neural networks.
We show that it is possible to substantially reduce the variance of the forward gradient by applying perturbations to activations rather than weights.
Our approach matches backprop on MNIST and CIFAR-10 and significantly outperforms previously proposed backprop-free algorithms on ImageNet.
arXiv Detail & Related papers (2022-10-07T03:52:27Z)
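For orientation, here is the basic weight-perturbed forward gradient estimator that the paper improves on; the paper's contribution is to perturb activations and add local losses to reduce the estimator's variance, which this sketch does not do. The toy quadratic loss is a placeholder for a network's training loss:

```python
import torch
from torch.func import jvp

def loss_fn(w):
    # toy quadratic loss; stands in for a network's training loss
    return ((w - 1.0) ** 2).sum()

w = torch.randn(5)
v = torch.randn_like(w)                 # random tangent direction, v ~ N(0, I)
# Forward-mode AD yields the directional derivative (grad . v) in one pass,
# with no backward pass required.
_, dir_deriv = jvp(loss_fn, (w,), (v,))
fwd_grad = dir_deriv * v                # unbiased: E[(g.v) v] = g for v ~ N(0, I)
print(fwd_grad)
print(torch.autograd.grad(loss_fn(w.requires_grad_()), w)[0])  # exact gradient
```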
- Jo-SRC: A Contrastive Approach for Combating Noisy Labels [58.867237220886885]
We propose a noise-robust approach named Jo-SRC (Joint Sample Selection and Model Regularization based on Consistency).
Specifically, we train the network in a contrastive learning manner. Predictions from two different views of each sample are used to estimate its "likelihood" of being clean or out-of-distribution.
arXiv Detail & Related papers (2021-03-24T07:26:07Z)
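A simplified sketch of the two-view consistency signal described above: samples whose predictions under two augmented views diverge (measured here by Jensen-Shannon divergence, with a toy median threshold) are treated as likely out-of-distribution. This conveys only the flavour of the criterion, not the full Jo-SRC selection rule:

```python
import torch
import torch.nn.functional as F

def js_divergence(p: torch.Tensor, q: torch.Tensor) -> torch.Tensor:
    """Jensen-Shannon divergence between two batches of distributions."""
    m = 0.5 * (p + q)
    kl = lambda a, b: (a * (a.clamp_min(1e-12).log()
                            - b.clamp_min(1e-12).log())).sum(-1)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# predictions for the same samples under two different augmented views
p_view1 = F.softmax(torch.randn(8, 10), dim=-1)
p_view2 = F.softmax(torch.randn(8, 10), dim=-1)
disagreement = js_divergence(p_view1, p_view2)
likely_ood = disagreement > disagreement.median()   # toy threshold
print(disagreement, likely_ood)
```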
- Sampling-free Variational Inference for Neural Networks with Multiplicative Activation Noise [51.080620762639434]
We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference.
Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
arXiv Detail & Related papers (2021-03-15T16:16:18Z)
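For orientation, this is what multiplicative activation noise looks like when sampled; the paper's point is precisely to avoid such sampling by parameterizing the posterior so the noise can be handled in closed form. The alpha value is a hypothetical choice:

```python
import torch
import torch.nn as nn

class GaussianActivationNoise(nn.Module):
    """Multiplies activations by N(1, alpha) noise at training time.
    Sampled version shown for orientation; the paper derives a
    sampling-free (moment-based) treatment of this noise model."""
    def __init__(self, alpha: float = 0.1):
        super().__init__()
        self.alpha = alpha

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if not self.training:
            return x               # noise has mean 1, so identity at test time
        eps = torch.randn_like(x) * self.alpha ** 0.5
        return x * (1.0 + eps)

net = nn.Sequential(nn.Linear(4, 16), GaussianActivationNoise(0.1),
                    nn.ReLU(), nn.Linear(16, 1))
print(net(torch.randn(2, 4)))
```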
- A Distributed Optimisation Framework Combining Natural Gradient with Hessian-Free for Discriminative Sequence Training [16.83036203524611]
This paper presents a novel natural gradient and Hessian-free (NGHF) optimisation framework for neural network training.
It relies on the linear conjugate gradient (CG) algorithm to combine the natural gradient (NG) method with local curvature information from Hessian-free (HF) or other second-order methods.
Experiments are reported on the multi-genre broadcast data set for a range of different acoustic model types.
arXiv Detail & Related papers (2021-03-12T22:18:34Z)
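The linear conjugate gradient routine at the core of such frameworks can be sketched generically; in NGHF the matrix-vector product would be a Fisher- or Hessian-vector product computed by automatic differentiation, whereas this toy check uses an explicit SPD matrix standing in for the damped Fisher:

```python
import torch

def conjugate_gradient(mvp, b, iters=50, tol=1e-8):
    """Solve A x = b using only matrix-vector products mvp(v) = A v;
    A itself is never materialized."""
    x = torch.zeros_like(b)
    r = b.clone()                  # residual b - A x (x = 0 initially)
    p = r.clone()
    rs = r @ r
    for _ in range(iters):
        Ap = mvp(p)
        alpha = rs / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        if rs_new.sqrt() < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

A = torch.tensor([[3.0, 1.0], [1.0, 2.0]])   # toy SPD stand-in
b = torch.tensor([1.0, 1.0])
x = conjugate_gradient(lambda v: A @ v, b)
print(x, A @ x)                    # A @ x should be close to b
```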
- On Signal-to-Noise Ratio Issues in Variational Inference for Deep Gaussian Processes [55.62520135103578]
We show that the gradient estimates used in training Deep Gaussian Processes (DGPs) with importance-weighted variational inference are susceptible to signal-to-noise ratio (SNR) issues.
We show that our fix can lead to consistent improvements in the predictive performance of DGP models.
arXiv Detail & Related papers (2020-11-01T14:38:02Z)
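The SNR diagnostic in question can be estimated empirically by drawing from a stochastic gradient estimator repeatedly; a minimal sketch with a hypothetical noisy estimator:

```python
import torch

def gradient_snr(estimator, n_draws=1000):
    """Empirical SNR of a stochastic gradient estimator:
    |E[g]| / std[g], estimated per coordinate from repeated draws."""
    draws = torch.stack([estimator() for _ in range(n_draws)])
    return draws.mean(0).abs() / draws.std(0)

# toy estimator: true gradient [1, 1] with heavy noise on coordinate 1
noisy_grad = lambda: (torch.tensor([1.0, 1.0])
                      + torch.randn(2) * torch.tensor([0.1, 10.0]))
print(gradient_snr(noisy_grad))   # coord 0: high SNR; coord 1: near zero
```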
- Anomaly detection with variational quantum generative adversarial networks [0.0]
Generative adversarial networks (GANs) are a machine learning framework comprising a generative model for sampling from a target distribution.
We introduce variational quantum-classical Wasserstein GANs and embed this model in a classical machine learning framework for anomaly detection.
Our model replaces the generator of Wasserstein GANs with a hybrid quantum-classical neural net and leaves the classical discriminative model unchanged.
arXiv Detail & Related papers (2020-10-20T17:48:04Z)
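A heavily simplified, purely classical sketch of using a Wasserstein critic's output as an anomaly score (the critic here is untrained and hypothetical; in the paper the generator is the quantum-classical hybrid while the critic stays classical and would be trained first):

```python
import torch
import torch.nn as nn

# Hypothetical stand-in: a trained (classical) Wasserstein critic D maps
# samples to realness scores.
D = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 1))

def anomaly_score(x: torch.Tensor) -> torch.Tensor:
    # Lower critic output = less "real"; negate so higher = more anomalous.
    with torch.no_grad():
        return -D(x).squeeze(-1)

x = torch.randn(16, 8)            # toy batch of feature vectors
print(anomaly_score(x).topk(3))   # the 3 most anomalous samples
```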
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.