TANGOS: Regularizing Tabular Neural Networks through Gradient
Orthogonalization and Specialization
- URL: http://arxiv.org/abs/2303.05506v1
- Date: Thu, 9 Mar 2023 18:57:13 GMT
- Title: TANGOS: Regularizing Tabular Neural Networks through Gradient
Orthogonalization and Specialization
- Authors: Alan Jeffares, Tennison Liu, Jonathan Crabbé, Fergus Imrie, Mihaela
van der Schaar
- Abstract summary: We introduce Tabular Neural Gradient Orthogonalization and Specialization (TANGOS).
TANGOS is a novel framework for regularization in the tabular setting built on latent unit attributions.
We demonstrate that our approach can lead to improved out-of-sample generalization performance, outperforming other popular regularization methods.
- Score: 69.80141512683254
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite their success with unstructured data, deep neural networks are not
yet a panacea for structured tabular data. In the tabular domain, their
efficiency crucially relies on various forms of regularization to prevent
overfitting and provide strong generalization performance. Existing
regularization techniques include broad modelling decisions such as choice of
architecture, loss functions, and optimization methods. In this work, we
introduce Tabular Neural Gradient Orthogonalization and Specialization
(TANGOS), a novel framework for regularization in the tabular setting built on
latent unit attributions. The gradient attribution of an activation with
respect to a given input feature suggests how the neuron attends to that
feature, and is often employed to interpret the predictions of deep networks.
In TANGOS, we take a different approach and incorporate neuron attributions
directly into training to encourage orthogonalization and specialization of
latent attributions in a fully-connected network. Our regularizer encourages
neurons to focus on sparse, non-overlapping input features and results in a set
of diverse and specialized latent units. In the tabular domain, we demonstrate
that our approach can lead to improved out-of-sample generalization
performance, outperforming other popular regularization methods. We provide
insight into why our regularizer is effective and demonstrate that TANGOS can
be applied jointly with existing methods to achieve even greater generalization
performance.
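
As a rough illustration of the mechanism the abstract describes, the sketch below implements a TANGOS-style penalty in PyTorch. It is a minimal sketch, not the authors' implementation: it assumes each latent unit's attribution is its gradient with respect to the input, uses a mean L1 term for specialization and the mean absolute pairwise cosine similarity between attribution vectors for orthogonalization, and the weights `lambda_spec` and `lambda_orth`, the `encoder`, and the toy data are all hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def tangos_penalty(encoder: nn.Module, x: torch.Tensor,
                   lambda_spec: float = 1e-2, lambda_orth: float = 1e-3) -> torch.Tensor:
    """TANGOS-style penalty on latent-unit attributions (illustrative sketch only).

    Assumes the attribution of latent unit h_i is its gradient w.r.t. the input x.
    Specialization: mean L1 norm of attributions (each unit attends to few features).
    Orthogonalization: mean absolute pairwise cosine similarity between attribution
    vectors (different units attend to non-overlapping features).
    """
    x = x.clone().requires_grad_(True)                       # (batch, n_features)
    h = encoder(x)                                           # (batch, n_latent)
    n_latent = h.shape[1]

    # Attribution of every latent unit with respect to every input feature.
    grads = [torch.autograd.grad(h[:, i].sum(), x, create_graph=True)[0]
             for i in range(n_latent)]
    a = torch.stack(grads, dim=1)                            # (batch, n_latent, n_features)

    # Specialization: encourage sparse attributions per latent unit.
    spec = a.abs().mean()

    # Orthogonalization: penalize overlap between different units' attributions.
    a_unit = F.normalize(a, dim=-1)
    cos = a_unit @ a_unit.transpose(1, 2)                    # (batch, n_latent, n_latent)
    off_diag = cos - torch.diag_embed(torch.diagonal(cos, dim1=1, dim2=2))
    orth = off_diag.abs().sum() / (x.shape[0] * n_latent * (n_latent - 1))

    return lambda_spec * spec + lambda_orth * orth


if __name__ == "__main__":
    # Toy fully-connected tabular model: add the penalty to the usual task loss.
    encoder = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 16), nn.ReLU())
    head = nn.Linear(16, 1)
    x, y = torch.randn(8, 10), torch.randn(8, 1)
    loss = F.mse_loss(head(encoder(x)), y) + tangos_penalty(encoder, x)
    loss.backward()
```

In practice the penalty weights would be tuned per dataset, and the per-unit gradient loop could be vectorized (e.g. with torch.func.jacrev) for wider layers.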
Related papers
- Back to Bayesics: Uncovering Human Mobility Distributions and Anomalies with an Integrated Statistical and Neural Framework [14.899157568336731]
DeepBayesic is a novel framework that integrates Bayesian principles with deep neural networks to model the underlying distributions.
We evaluate our approach on several mobility datasets, demonstrating significant improvements over state-of-the-art anomaly detection methods.
arXiv Detail & Related papers (2024-10-01T19:02:06Z)
- Joint Diffusion Processes as an Inductive Bias in Sheaf Neural Networks [14.224234978509026]
Sheaf Neural Networks (SNNs) naturally extend Graph Neural Networks (GNNs).
We propose two novel sheaf learning approaches that provide a more intuitive understanding of the involved structure maps.
In our evaluation, we show the limitations of the real-world benchmarks used so far for SNNs.
arXiv Detail & Related papers (2024-07-30T07:17:46Z)
- Function-Space Regularization in Neural Networks: A Probabilistic Perspective [51.133793272222874]
We show that we can derive a well-motivated regularization technique that allows explicitly encoding information about desired predictive functions into neural network training.
We evaluate the utility of this regularization technique empirically and demonstrate that the proposed method leads to near-perfect semantic shift detection and highly-calibrated predictive uncertainty estimates.
arXiv Detail & Related papers (2023-12-28T17:50:56Z)
- Learning Expressive Priors for Generalization and Uncertainty Estimation in Neural Networks [77.89179552509887]
We propose a novel prior learning method for advancing generalization and uncertainty estimation in deep neural networks.
The key idea is to exploit scalable and structured posteriors of neural networks as informative priors with generalization guarantees.
We exhaustively show the effectiveness of this method for uncertainty estimation and generalization.
arXiv Detail & Related papers (2023-07-15T09:24:33Z)
- Evolving Neural Selection with Adaptive Regularization [7.298440208725654]
We present a method in which the selection of neurons in deep neural networks evolves, adapting to the difficulty of prediction.
We propose the Adaptive Neural Selection (ANS) framework, which evolves to weigh neurons in a layer to form network variants.
Experimental results show that the proposed method can significantly improve the performance of commonly-used neural network architectures.
arXiv Detail & Related papers (2022-04-04T17:19:52Z)
- Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Neural Networks [18.377136391055327]
This paper theoretically analyzes the implicit regularization in hierarchical tensor factorization.
It translates to an implicit regularization towards locality for the associated convolutional networks.
Our work highlights the potential of enhancing neural networks via theoretical analysis of their implicit regularization.
arXiv Detail & Related papers (2022-01-27T18:48:30Z)
- Embracing the Dark Knowledge: Domain Generalization Using Regularized Knowledge Distillation [65.79387438988554]
Lack of generalization capability in the absence of sufficient and representative data is one of the challenges that hinder the practical application of deep models.
We propose a simple, effective, and plug-and-play training strategy named Knowledge Distillation for Domain Generalization (KDDG)
We find that both the richer "dark knowledge" from the teacher network and the gradient filter we propose can reduce the difficulty of learning the mapping.
arXiv Detail & Related papers (2021-07-06T14:08:54Z)
- Learning for Integer-Constrained Optimization through Neural Networks with Limited Training [28.588195947764188]
We introduce a symmetric and decomposed neural network structure, which is fully interpretable in terms of the functionality of its constituent components.
By taking advantage of the underlying pattern of the integer constraint, the introduced neural network offers superior generalization performance with limited training.
We show that the introduced decomposed approach can be further extended to semi-decomposed frameworks.
arXiv Detail & Related papers (2020-11-10T21:17:07Z)
- On Connections between Regularizations for Improving DNN Robustness [67.28077776415724]
This paper analyzes regularization terms proposed recently for improving the adversarial robustness of deep neural networks (DNNs).
We study possible connections between several effective methods, including input-gradient regularization, Jacobian regularization, curvature regularization, and a cross-Lipschitz functional.
arXiv Detail & Related papers (2020-07-04T23:43:32Z)
- Understanding Generalization in Deep Learning via Tensor Methods [53.808840694241]
We advance the understanding of the relations between the network's architecture and its generalizability from the compression perspective.
We propose a series of intuitive, data-dependent and easily-measurable properties that tightly characterize the compressibility and generalizability of neural networks.
arXiv Detail & Related papers (2020-01-14T22:26:57Z)