A Free Lunch with Influence Functions? Improving Neural Network
Estimates with Concepts from Semiparametric Statistics
- URL: http://arxiv.org/abs/2202.09096v1
- Date: Fri, 18 Feb 2022 09:35:51 GMT
- Title: A Free Lunch with Influence Functions? Improving Neural Network
Estimates with Concepts from Semiparametric Statistics
- Authors: Matthew J. Vowels and Sina Akbari and Jalal Etesami and Necati Cihan
Camgoz and Richard Bowden
- Abstract summary: We explore the potential for semiparametric theory to be used to improve neural networks and machine learning algorithms.
We propose a new neural network method MultiNet, which seeks the flexibility and diversity of an ensemble using a single architecture.
- Score: 41.99023989695363
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Parameter estimation in the empirical fields is usually undertaken using
parametric models, and such models are convenient because they readily
facilitate statistical inference. Unfortunately, they are unlikely to have a
sufficiently flexible functional form to be able to adequately model real-world
phenomena, and their usage may therefore result in biased estimates and invalid
inference. Conversely, whilst non-parametric machine learning models may
provide the needed flexibility to adapt to the complexity of real-world
phenomena, they do not readily facilitate statistical inference, and may still
exhibit residual bias. We explore the potential for semiparametric theory (in
particular, the Influence Function) to be used to improve neural networks and
machine learning algorithms in terms of (a) improving initial estimates without
needing more data, (b) increasing the robustness of our models, and (c) yielding
confidence intervals for statistical inference. We propose a new neural network
method, MultiNet, which seeks the flexibility and diversity of an ensemble using
a single architecture. Results on causal inference tasks indicate that MultiNet
yields better performance than other approaches, and that all considered
methods are amenable to improvement from semiparametric techniques under
certain conditions. In other words, with these techniques we show that we can
improve existing neural networks for 'free', without needing more data, and
without needing to retrain them. Finally, we provide the expression for
deriving influence functions for estimands from a general graph, and the code
to do so automatically.
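To make the 'free lunch' concrete, here is a minimal sketch (not the authors' MultiNet code) of the one-step correction behind points (a) and (c): plug-in estimates of the outcome and propensity nuisances are combined through the efficient influence function for the average treatment effect (the AIPW form), debiasing the initial estimate and yielding a Wald confidence interval from the empirical variance of the influence function. The synthetic data, gradient-boosting learners, and variable names are illustrative assumptions.

```python
# Minimal sketch (not the authors' implementation): one-step influence-function
# correction of a plug-in ATE estimate, plus a Wald confidence interval from the
# empirical variance of the efficient influence function (the AIPW form).
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor

rng = np.random.default_rng(0)

# Synthetic data (illustrative only): confounders X, binary treatment T, outcome Y.
n = 2000
X = rng.normal(size=(n, 5))
e_true = 1.0 / (1.0 + np.exp(-X[:, 0]))                 # true propensity P(T=1 | X)
T = rng.binomial(1, e_true)
Y = X[:, 0] + 2.0 * T + rng.normal(size=n)              # true ATE = 2.0

# Flexible (possibly biased) nuisance estimates, standing in for pre-trained networks.
outcome_model = GradientBoostingRegressor().fit(np.c_[X, T], Y)
prop_model = GradientBoostingClassifier().fit(X, T)
mu1 = outcome_model.predict(np.c_[X, np.ones(n)])       # E[Y | X, T=1]
mu0 = outcome_model.predict(np.c_[X, np.zeros(n)])      # E[Y | X, T=0]
e_hat = np.clip(prop_model.predict_proba(X)[:, 1], 0.01, 0.99)

# Plug-in estimate and its one-step (influence-function) correction.
plug_in = np.mean(mu1 - mu0)
psi = (mu1 - mu0
       + T * (Y - mu1) / e_hat
       - (1 - T) * (Y - mu0) / (1 - e_hat))             # uncentred efficient IF
one_step = psi.mean()

# Confidence interval from the empirical variance of the influence function.
se = psi.std(ddof=1) / np.sqrt(n)
ci = (one_step - 1.96 * se, one_step + 1.96 * se)
print(f"plug-in ATE: {plug_in:.3f}  one-step ATE: {one_step:.3f}  95% CI: {ci}")
```

The correction and the interval reuse the already-fitted nuisance models without retraining them or collecting more data, which is the sense in which the abstract describes the improvement as coming 'for free'; in practice, cross-fitting of the nuisance estimates is usually added.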
Related papers
- CF-GO-Net: A Universal Distribution Learner via Characteristic Function Networks with Graph Optimizers [8.816637789605174]
We introduce an approach which employs the characteristic function (CF), a probabilistic descriptor that directly corresponds to the distribution.
Unlike the probability density function (pdf), the characteristic function not only always exists, but also provides an additional degree of freedom.
Our method allows the use of a pre-trained model, such as a well-trained autoencoder, and is capable of learning directly in its feature space.
arXiv Detail & Related papers (2024-09-19T09:33:12Z)
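Following the CF-GO-Net summary above, the sketch below is a generic illustration of why the characteristic function is a convenient distribution descriptor: the empirical CF always exists (it is a bounded expectation of exp(i⟨t, x⟩)), and two samples can be compared through it at a set of frequencies. The random-frequency comparison is an assumption for illustration, not the paper's training objective.

```python
# Minimal sketch: empirical characteristic functions of two samples compared at
# random frequencies; phi(t) = E[exp(i <t, x>)] always exists because the
# integrand is bounded. This is not the CF-GO-Net objective itself.
import numpy as np

def empirical_cf(x, freqs):
    """Estimate phi(t) = E[exp(i <t, x>)] at each row t of `freqs`."""
    return np.exp(1j * freqs @ x.T).mean(axis=1)

rng = np.random.default_rng(0)
d, n = 2, 5000
sample_p = rng.normal(size=(n, d))                  # N(0, I)
sample_q = 1.5 * rng.normal(size=(n, d)) + 0.5      # shifted and scaled Gaussian

freqs = rng.normal(size=(128, d))                   # random evaluation frequencies
cf_distance = np.mean(np.abs(empirical_cf(sample_p, freqs)
                             - empirical_cf(sample_q, freqs)) ** 2)
print(f"mean squared CF distance: {cf_distance:.4f}")
```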
- Towards a Better Theoretical Understanding of Independent Subnetwork Training [56.24689348875711]
We take a closer theoretical look at Independent Subnetwork Training (IST), a recently proposed and highly effective technique for distributed training of large models.
We identify fundamental differences between IST and alternative approaches, such as distributed methods with compressed communication.
arXiv Detail & Related papers (2023-06-28T18:14:22Z)
- Single-model uncertainty quantification in neural network potentials does not consistently outperform model ensembles [0.7499722271664145]
Neural networks (NNs) often assign high confidence to their predictions, even for points far out-of-distribution.
Uncertainty quantification (UQ) is a challenge when they are employed to model interatomic potentials in materials systems.
Differentiable UQ techniques can find new informative data and drive active learning loops for robust potentials.
arXiv Detail & Related papers (2023-05-02T19:41:17Z)
- Modeling Uncertain Feature Representation for Domain Generalization [49.129544670700525]
We show that our method consistently improves the network generalization ability on multiple vision tasks.
Our methods are simple yet effective and can be readily integrated into networks without additional trainable parameters or loss constraints.
arXiv Detail & Related papers (2023-01-16T14:25:02Z)
- Variational Hierarchical Mixtures for Probabilistic Learning of Inverse Dynamics [20.953728061894044]
Well-calibrated probabilistic regression models are a crucial learning component in robotics applications as datasets grow rapidly and tasks become more complex.
We consider a probabilistic hierarchical modeling paradigm that combines the benefits of both worlds to deliver computationally efficient representations with inherent complexity regularization.
We derive two efficient variational inference techniques to learn these representations and highlight the advantages of hierarchical infinite local regression models.
arXiv Detail & Related papers (2022-11-02T13:54:07Z)
- HyperImpute: Generalized Iterative Imputation with Automatic Model Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z)
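To make the HyperImpute entry above concrete, here is a minimal sketch of generic iterative, column-wise imputation (in the MICE spirit): every column with missing values is regressed on the others and re-imputed over several rounds. The linear learner and fixed round count are illustrative assumptions; HyperImpute's contribution, the automatic per-column model selection, is omitted here.

```python
# Minimal sketch of iterative, column-wise imputation (MICE-style); HyperImpute's
# automatic per-column model selection is deliberately omitted.
import numpy as np
from sklearn.linear_model import LinearRegression

def iterative_impute(X, n_rounds=5):
    """X: 2D float array with np.nan for missing entries. Returns an imputed copy."""
    X = X.copy()
    missing = np.isnan(X)
    col_means = np.nanmean(X, axis=0)
    X[missing] = np.take(col_means, np.where(missing)[1])    # mean-initialise

    for _ in range(n_rounds):
        for j in range(X.shape[1]):
            rows = missing[:, j]
            if not rows.any():
                continue
            others = np.delete(X, j, axis=1)
            model = LinearRegression().fit(others[~rows], X[~rows, j])
            X[rows, j] = model.predict(others[rows])          # re-impute column j
    return X

rng = np.random.default_rng(0)
data = rng.normal(size=(200, 4))
data[:, 3] += data[:, 0]                    # correlated column so imputation can help
mask = rng.random(data.shape) < 0.1
data_missing = data.copy()
data_missing[mask] = np.nan
imputed = iterative_impute(data_missing)
print("mean abs. error on imputed entries:", np.abs(imputed[mask] - data[mask]).mean())
```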
- Uncertainty Modeling for Out-of-Distribution Generalization [56.957731893992495]
We argue that the feature statistics can be properly manipulated to improve the generalization ability of deep learning models.
Common methods often consider the feature statistics as deterministic values measured from the learned features.
We improve the network generalization ability by modeling the uncertainty of domain shifts with synthesized feature statistics during training.
arXiv Detail & Related papers (2022-02-08T16:09:12Z)
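For the out-of-distribution generalization entry above (and the closely related "Modeling Uncertain Feature Representation" entry), the sketch below shows one generic way to treat channel-wise feature statistics as random rather than deterministic during training: the per-sample mean and standard deviation of each feature map are perturbed with Gaussian noise scaled by the batch-level variability of those statistics, and the features are re-normalised with the perturbed statistics. The tensor shapes and noise model are assumptions for illustration, not the paper's exact formulation.

```python
# Minimal sketch: perturb per-channel feature statistics (mean/std) with noise scaled
# by their batch-level variability, then re-normalise the features -- one generic way
# to model "uncertain" feature statistics during training (illustrative assumptions).
import numpy as np

def perturb_feature_statistics(feats, rng, eps=1e-6):
    """feats: (batch, channels, height, width) activations from some layer."""
    mu = feats.mean(axis=(2, 3), keepdims=True)              # (B, C, 1, 1)
    sigma = feats.std(axis=(2, 3), keepdims=True) + eps

    # Batch-level uncertainty of the statistics themselves.
    mu_scale = mu.std(axis=0, keepdims=True)                 # (1, C, 1, 1)
    sigma_scale = sigma.std(axis=0, keepdims=True)

    # Sample perturbed statistics and re-normalise the features with them.
    new_mu = mu + rng.standard_normal(mu.shape) * mu_scale
    new_sigma = sigma + rng.standard_normal(sigma.shape) * sigma_scale
    return new_sigma * (feats - mu) / sigma + new_mu

rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 16, 14, 14))                     # fake feature maps
augmented = perturb_feature_statistics(feats, rng)
print(augmented.shape)
```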
- Modeling Item Response Theory with Stochastic Variational Inference [8.369065078321215]
We introduce a variational Bayesian inference algorithm for Item Response Theory (IRT).
Applying this method to five large-scale item response datasets yields higher log likelihoods and higher accuracy in imputing missing data.
The algorithm implementation is open-source, and easily usable.
arXiv Detail & Related papers (2021-08-26T05:00:27Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- Optimizing Variational Representations of Divergences and Accelerating their Statistical Estimation [6.34892104858556]
Variational representations of divergences and distances between high-dimensional probability distributions offer significant theoretical insights.
They have gained popularity in machine learning as a tractable and scalable approach for training probabilistic models.
We develop a methodology for building new, tighter variational representations of divergences.
arXiv Detail & Related papers (2020-06-15T21:32:21Z)
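To illustrate what a variational representation of a divergence looks like in practice, the sketch below trains a small critic network to maximise the Donsker-Varadhan objective E_P[T] - log E_Q[e^T], which lower-bounds KL(P||Q). This is a generic sketch with assumed Gaussian samples and critic architecture, not the tightened representations developed in the paper above.

```python
# Minimal sketch: estimate KL(P || Q) via the Donsker-Varadhan variational bound
#   KL(P || Q) >= E_P[T(x)] - log E_Q[exp(T(x))]
# with a small neural critic T. Generic illustration, not the paper's tighter bounds.
import math
import torch
import torch.nn as nn

torch.manual_seed(0)
critic = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

# P = N(1, 1) and Q = N(0, 1), so the analytic KL(P || Q) is 0.5.
def sample_p(n): return torch.randn(n, 1) + 1.0
def sample_q(n): return torch.randn(n, 1)

def dv_bound(xp, xq):
    """E_P[T] - log E_Q[exp(T)], a lower bound on KL(P || Q)."""
    log_mean_exp_q = torch.logsumexp(critic(xq).squeeze(), dim=0) - math.log(len(xq))
    return critic(xp).mean() - log_mean_exp_q

for step in range(2000):                      # maximise the bound over the critic
    loss = -dv_bound(sample_p(512), sample_q(512))
    opt.zero_grad()
    loss.backward()
    opt.step()

with torch.no_grad():
    estimate = dv_bound(sample_p(100_000), sample_q(100_000)).item()
print(f"DV estimate of KL(P || Q): {estimate:.3f}  (analytic value: 0.5)")
```

The tightness of such estimators depends on the critic class and the sample size, which is precisely the kind of trade-off the paper above studies.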