Gradient-based Wang-Landau Algorithm: A Novel Sampler for Output
Distribution of Neural Networks over the Input Space
- URL: http://arxiv.org/abs/2302.09484v2
- Date: Tue, 21 Feb 2023 05:50:26 GMT
- Title: Gradient-based Wang-Landau Algorithm: A Novel Sampler for Output
Distribution of Neural Networks over the Input Space
- Authors: Weitang Liu, Ying-Wai Li, Yi-Zhuang You, Jingbo Shang
- Abstract summary: In this paper, we propose a novel Gradient-based Wang-Landau (GWL) sampler.
We first draw the connection between the output distribution of a NN and the density of states (DOS) of a physical system.
Then, we renovate the classic sampler for the DOS problem, the Wang-Landau algorithm, by replacing its random proposals with gradient-based Monte Carlo proposals.
- Score: 20.60516313062773
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The output distribution of a neural network (NN) over the entire input space
captures the complete input-output mapping relationship, offering insights
toward a more comprehensive NN understanding. Exhaustive enumeration or
traditional Monte Carlo methods for the entire input space can exhibit
impractical sampling time, especially for high-dimensional inputs. To make such
difficult sampling computationally feasible, in this paper, we propose a novel
Gradient-based Wang-Landau (GWL) sampler. We first draw the connection between
the output distribution of a NN and the density of states (DOS) of a physical
system. Then, we renovate the classic sampler for the DOS problem, the
Wang-Landau algorithm, by replacing its random proposals with gradient-based
Monte Carlo proposals. This way, our GWL sampler investigates the
under-explored subsets of the input space much more efficiently. Extensive
experiments have verified the accuracy of the output distribution generated by
GWL and also showcased several interesting findings - for example, in a binary
image classification task, both CNN and ResNet mapped the majority of
human-unrecognizable images to very negative logit values.
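The abstract describes renovating the Wang-Landau algorithm with gradient-based Monte Carlo proposals so that the network's output distribution (its "density of states" over logit values) can be estimated efficiently. The sketch below is a minimal illustration of that idea, not the authors' implementation: it treats the scalar logit of a toy PyTorch network as the energy, discretizes it into bins, runs the standard Wang-Landau bookkeeping on the estimated log density of states, and substitutes a Langevin-style gradient proposal for the paper's gradient-based Monte Carlo proposal. The model, bin range, flatness criterion, and step sizes are illustrative assumptions.

```python
# Minimal sketch (not the authors' implementation) of a Wang-Landau-style
# sampler over a network's input space. Assumptions: the "energy" of an input
# is the network's scalar logit, and a Langevin-style gradient step stands in
# for the paper's gradient-based Monte Carlo proposal.
import torch

torch.manual_seed(0)

# Toy network standing in for a trained classifier producing one logit.
net = torch.nn.Sequential(
    torch.nn.Linear(16, 32), torch.nn.ReLU(), torch.nn.Linear(32, 1)
)

def energy(x):
    # Scalar "energy" of an input: here, the network's logit.
    return net(x).squeeze()

bins = torch.linspace(-5.0, 5.0, steps=51)   # bin edges over the logit axis
log_g = torch.zeros(len(bins) - 1)           # log density of states per bin
hist = torch.zeros(len(bins) - 1)            # visit histogram per bin
ln_f = 1.0                                   # Wang-Landau modification factor
step = 0.05                                  # proposal step size

def bin_of(e):
    return int(torch.clamp(torch.bucketize(e, bins) - 1, 0, len(log_g) - 1))

x = torch.randn(16, requires_grad=True)
e = energy(x)
b = bin_of(e.detach())

for stage in range(10):                      # outer Wang-Landau stages
    for _ in range(1000):                    # inner Monte Carlo steps
        # Gradient-informed proposal: move against the energy gradient plus noise.
        # (The Hastings correction for this asymmetric proposal is omitted for brevity.)
        grad, = torch.autograd.grad(e, x)
        x_new = (x - step * grad
                 + (2 * step) ** 0.5 * torch.randn_like(x)).detach().requires_grad_(True)
        e_new = energy(x_new)
        b_new = bin_of(e_new.detach())
        # Wang-Landau acceptance: prefer bins with a smaller current g(E) estimate.
        if torch.rand(()) < torch.exp(log_g[b] - log_g[b_new]):
            x, e, b = x_new, e_new, b_new
        else:
            e = energy(x)                    # rebuild the graph for the next gradient
        log_g[b] += ln_f                     # update the DOS estimate and histogram
        hist[b] += 1.0
    # Crude flatness check: once visited bins are roughly even, shrink ln_f.
    visited = hist[hist > 0]
    if len(visited) > 0 and visited.min() > 0.8 * visited.mean():
        ln_f *= 0.5
        hist.zero_()

print("estimated log-density of states over logit bins:\n", log_g)
```

In the paper's setting the inputs are discrete images, so the proposal would act on pixel values (a gradient-informed discrete proposal) rather than a continuous Langevin step; the Wang-Landau bookkeeping (log g updates, histogram flatness, shrinking the modification factor) is the part this sketch aims to convey.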
Related papers
- Sampling-based Distributed Training with Message Passing Neural Network [1.1088875073103417]
We introduce a domain-decomposition-based distributed training and inference approach for message-passing neural networks (MPNN).
We present a scalable graph neural network, referred to as DS-MPNN (D and S standing for distributed and sampled), capable of scaling up to $O(10^5)$ nodes.
arXiv Detail & Related papers (2024-02-23T05:33:43Z) - Sampling weights of deep neural networks [1.2370077627846041]
We introduce a probability distribution, combined with an efficient sampling algorithm, for weights and biases of fully-connected neural networks.
In a supervised learning context, no iterative optimization or gradient computations of internal network parameters are needed.
We prove that sampled networks are universal approximators.
arXiv Detail & Related papers (2023-06-29T10:13:36Z) - Approximate Thompson Sampling via Epistemic Neural Networks [26.872304174606278]
Epistemic neural networks (ENNs) are designed to produce accurate joint predictive distributions.
We show that ENNs serve this purpose well and illustrate how the quality of joint predictive distributions drives performance.
arXiv Detail & Related papers (2023-02-18T01:58:15Z) - Joint Edge-Model Sparse Learning is Provably Efficient for Graph Neural
Networks [89.28881869440433]
This paper provides the first theoretical characterization of joint edge-model sparse learning for graph neural networks (GNNs).
It proves analytically that both sampling important nodes and pruning the lowest-magnitude neurons reduce the sample complexity and improve convergence without compromising test accuracy.
arXiv Detail & Related papers (2023-02-06T16:54:20Z) - Analyzing the Effect of Sampling in GNNs on Individual Fairness [79.28449844690566]
Graph neural network (GNN) based methods have saturated the field of recommender systems.
We extend an existing method for promoting individual fairness on graphs to support mini-batch, or sub-sample based, training of a GNN.
We show that mini-batch training facilitates individual fairness promotion by allowing local nuance to guide the fairness-promotion process in representation learning.
arXiv Detail & Related papers (2022-09-08T16:20:25Z) - On the Effective Number of Linear Regions in Shallow Univariate ReLU
Networks: Convergence Guarantees and Implicit Bias [50.84569563188485]
We show that gradient flow converges in direction when labels are determined by the sign of a target network with $r$ neurons.
Our result may already hold for mild over-parameterization, where the width is $\tilde{\mathcal{O}}(r)$ and independent of the sample size.
arXiv Detail & Related papers (2022-05-18T16:57:10Z) - Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity
on Pruned Neural Networks [79.74580058178594]
We analyze the performance of training a pruned neural network by analyzing the geometric structure of the objective function.
We show that the convex region near a desirable model with guaranteed generalization enlarges as the neural network model is pruned.
arXiv Detail & Related papers (2021-10-12T01:11:07Z) - A Biased Graph Neural Network Sampler with Near-Optimal Regret [57.70126763759996]
Graph neural networks (GNN) have emerged as a vehicle for applying deep network architectures to graph and relational data.
In this paper, we build upon existing work and treat GNN neighbor sampling as a multi-armed bandit problem.
We introduce a newly-designed reward function that introduces some degree of bias designed to reduce variance and avoid unstable, possibly-unbounded payouts.
arXiv Detail & Related papers (2021-03-01T15:55:58Z) - pseudo-Bayesian Neural Networks for detecting Out of Distribution Inputs [12.429095025814345]
We propose pseudo-BNNs where instead of learning distributions over weights, we use point estimates and perturb weights at the time of inference.
Overall, this combination results in a principled technique to detect OOD samples at the time of inference.
arXiv Detail & Related papers (2021-02-02T06:23:04Z) - Bandit Samplers for Training Graph Neural Networks [63.17765191700203]
Several sampling algorithms with variance reduction have been proposed for accelerating the training of Graph Convolution Networks (GCNs).
These sampling algorithms are not applicable to more general graph neural networks (GNNs) where the message aggregator contains learned weights rather than fixed weights, such as Graph Attention Networks (GAT).
arXiv Detail & Related papers (2020-06-10T12:48:37Z) - Learning to Importance Sample in Primary Sample Space [22.98252856114423]
We propose a novel importance sampling technique that uses a neural network to learn how to sample from a desired density represented by a set of samples.
We show that our approach leads to effective variance reduction in several practical scenarios.
arXiv Detail & Related papers (2018-08-23T16:55:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.