Robustness to Augmentations as a Generalization metric
- URL: http://arxiv.org/abs/2101.06459v1
- Date: Sat, 16 Jan 2021 15:36:38 GMT
- Title: Robustness to Augmentations as a Generalization metric
- Authors: Sumukh Aithal K, Dhruva Kashyap, Natarajan Subramanyam
- Abstract summary: Generalization is the ability of a model to predict on unseen domains.
We propose a method to predict the generalization performance of a model by using the concept that models that are robust to augmentations are more generalizable than those which are not.
The proposed method was the first runner-up solution in the NeurIPS competition on Predicting Generalization in Deep Learning.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generalization is the ability of a model to predict on unseen domains and is
a fundamental task in machine learning. Several generalization bounds, both
theoretical and empirical, have been proposed, but they do not provide tight
bounds. In this work, we propose a simple yet effective method to predict the
generalization performance of a model, using the concept that models that are
robust to augmentations are more generalizable than those which are not. We
experiment with several augmentations and compositions of augmentations to check
the generalization capacity of a model. We also provide a detailed motivation
behind the proposed method. The proposed generalization metric is calculated
based on the change in the output of the model after augmenting the input. The
proposed method was the first runner-up solution in the NeurIPS competition on
Predicting Generalization in Deep Learning.
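To make the metric concrete, below is a minimal sketch (not the authors' code) of one way to score robustness to augmentations for a PyTorch image classifier: it measures the average KL divergence between the model's softmax outputs on clean and augmented copies of the same inputs. The specific augmentations, the choice of KL as the distance, and the helper name `robustness_score` are illustrative assumptions; the paper experiments with several augmentations and compositions, and its exact aggregation may differ.

```python
import torch
import torch.nn.functional as F
import torchvision.transforms as T

# Hypothetical, size-preserving augmentation set; the paper tries several
# augmentations and compositions, which may differ from these choices.
AUGMENTATIONS = [
    T.RandomHorizontalFlip(p=1.0),
    T.ColorJitter(brightness=0.4, contrast=0.4),
    T.GaussianBlur(kernel_size=3),
]

@torch.no_grad()
def robustness_score(model, loader, device="cpu"):
    """Average change in model output after augmenting the input.

    Lower scores mean the predictions move less under augmentation,
    which the paper uses as a proxy for better generalization.
    """
    model.eval().to(device)
    total, count = 0.0, 0
    for images, _ in loader:
        images = images.to(device)
        clean_probs = F.softmax(model(images), dim=1)
        for aug in AUGMENTATIONS:
            aug_probs = F.softmax(model(aug(images)), dim=1)
            # KL divergence between the clean and augmented predictive
            # distributions; other distances would also be plausible.
            total += F.kl_div(aug_probs.log(), clean_probs,
                              reduction="batchmean").item()
            count += 1
    return total / max(count, 1)
```

In a setting like the PGDL competition, such a score would typically be computed on each submitted model's training data and then compared against that model's actual generalization gap.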
Related papers
- Scaling and renormalization in high-dimensional regression [72.59731158970894]
This paper presents a succinct derivation of the training and generalization performance of a variety of high-dimensional ridge regression models.
We provide an introduction and review of recent results on these topics, aimed at readers with backgrounds in physics and deep learning.
arXiv Detail & Related papers (2024-05-01T15:59:00Z) - Understanding Generalization via Set Theory [1.6475699373210055]
Generalization is at the core of machine learning models.
We employ set theory to introduce the concepts of algorithms, hypotheses, and dataset generalization.
arXiv Detail & Related papers (2023-11-11T11:47:29Z) - Aggregation Weighting of Federated Learning via Generalization Bound Estimation [65.8630966842025]
Federated Learning (FL) typically aggregates client model parameters using a weighting approach determined by sample proportions.
We replace the aforementioned weighting method with a new strategy that considers the generalization bounds of each local model.
arXiv Detail & Related papers (2023-11-10T08:50:28Z) - Boosting Fair Classifier Generalization through Adaptive Priority Reweighing [59.801444556074394]
A performance-promising fair algorithm with better generalizability is needed.
This paper proposes a novel adaptive reweighing method to eliminate the impact of the distribution shifts between training and test data on model generalizability.
arXiv Detail & Related papers (2023-09-15T13:04:55Z) - Learning Expressive Priors for Generalization and Uncertainty Estimation in Neural Networks [77.89179552509887]
We propose a novel prior learning method for advancing generalization and uncertainty estimation in deep neural networks.
The key idea is to exploit scalable and structured posteriors of neural networks as informative priors with generalization guarantees.
We exhaustively show the effectiveness of this method for uncertainty estimation and generalization.
arXiv Detail & Related papers (2023-07-15T09:24:33Z) - Sparsity-aware generalization theory for deep neural networks [12.525959293825318]
We present a new approach to analyzing generalization for deep feed-forward ReLU networks.
We show fundamental trade-offs between sparsity and generalization.
arXiv Detail & Related papers (2023-07-01T20:59:05Z) - Improving Generalization of Pre-trained Language Models via Stochastic Weight Averaging [25.856435988848638]
Knowledge Distillation (KD) is a commonly used technique for improving the generalization of compact Pre-trained Language Models (PLMs).
We adapt Stochastic Weight Averaging (SWA), a method encouraging convergence to a flatter minimum, to fine-tune PLMs.
We demonstrate that our adaptation improves the generalization without extra cost.
arXiv Detail & Related papers (2022-12-12T15:09:56Z) - On the Generalization and Adaption Performance of Causal Models [99.64022680811281]
Differentiable causal discovery proposes to factorize the data-generating process into a set of modules.
We study the generalization and adaption performance of such modular neural causal models.
Our analysis shows that the modular neural causal models outperform other models on both zero and few-shot adaptation in low data regimes.
arXiv Detail & Related papers (2022-06-09T17:12:32Z) - Generalization Gap in Amortized Inference [17.951010274427187]
We study the generalization of a popular class of probabilistic models, the Variational Auto-Encoder (VAE).
We show that the over-fitting phenomenon is usually dominated by the amortized inference network.
We propose a new training objective, inspired by the classic wake-sleep algorithm, to improve the generalization properties of amortized inference.
arXiv Detail & Related papers (2022-05-23T21:28:47Z) - Towards Principled Disentanglement for Domain Generalization [90.9891372499545]
A fundamental challenge for machine learning models is generalizing to out-of-distribution (OOD) data.
We first formalize the OOD generalization problem as a constrained optimization problem, called Disentanglement-constrained Domain Generalization (DDG).
Based on the transformation, we propose a primal-dual algorithm for joint representation disentanglement and domain generalization.
arXiv Detail & Related papers (2021-11-27T07:36:32Z) - End-to-end Neural Coreference Resolution Revisited: A Simple yet Effective Baseline [20.431647446999996]
We propose a simple yet effective baseline for coreference resolution.
Our model is a simplified version of the original neural coreference resolution model.
Our work provides evidence for the necessity of carefully justifying the complexity of existing or newly proposed models.
arXiv Detail & Related papers (2021-07-04T18:12:24Z)