Gradient-based Wang-Landau Algorithm: A Novel Sampler for Output
Distribution of Neural Networks over the Input Space
- URL: http://arxiv.org/abs/2302.09484v2
- Date: Tue, 21 Feb 2023 05:50:26 GMT
- Title: Gradient-based Wang-Landau Algorithm: A Novel Sampler for Output
Distribution of Neural Networks over the Input Space
- Authors: Weitang Liu, Ying-Wai Li, Yi-Zhuang You, Jingbo Shang
- Abstract summary: In this paper, we propose a novel Gradient-based Wang-Landau (GWL) sampler.
We first draw the connection between the output distribution of a NN and the density of states (DOS) of a physical system.
Then, we renovate the classic sampler for the DOS problem, the Wang-Landau algorithm, by replacing its random proposals with gradient-based Monte Carlo proposals.
- Score: 20.60516313062773
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The output distribution of a neural network (NN) over the entire input space
captures the complete input-output mapping relationship, offering insights
toward a more comprehensive NN understanding. Exhaustive enumeration or
traditional Monte Carlo methods for the entire input space can exhibit
impractical sampling time, especially for high-dimensional inputs. To make such
difficult sampling computationally feasible, in this paper, we propose a novel
Gradient-based Wang-Landau (GWL) sampler. We first draw the connection between
the output distribution of a NN and the density of states (DOS) of a physical
system. Then, we renovate the classic sampler for the DOS problem, the
Wang-Landau algorithm, by replacing its random proposals with gradient-based
Monte Carlo proposals. This way, our GWL sampler investigates the
under-explored subsets of the input space much more efficiently. Extensive
experiments have verified the accuracy of the output distribution generated by
GWL and also showcased several interesting findings - for example, in a binary
image classification task, both CNN and ResNet mapped the majority of
human-unrecognizable images to very negative logit values.
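The abstract describes renovating the Wang-Landau algorithm with gradient-based Monte Carlo proposals so that the network's output distribution (its "density of states" over logit values) can be estimated efficiently. The sketch below is a minimal illustration of that idea, not the authors' implementation: it treats the scalar logit of a toy PyTorch network as the energy, discretizes it into bins, runs the standard Wang-Landau bookkeeping on the estimated log density of states, and substitutes a Langevin-style gradient proposal for the paper's gradient-based Monte Carlo proposal. The model, bin range, flatness criterion, and step sizes are illustrative assumptions.

```python
# Minimal sketch (not the authors' implementation) of a Wang-Landau-style
# sampler over a network's input space. Assumptions: the "energy" of an input
# is the network's scalar logit, and a Langevin-style gradient step stands in
# for the paper's gradient-based Monte Carlo proposal.
import torch

torch.manual_seed(0)

# Toy network standing in for a trained classifier producing one logit.
net = torch.nn.Sequential(
    torch.nn.Linear(16, 32), torch.nn.ReLU(), torch.nn.Linear(32, 1)
)

def energy(x):
    # Scalar "energy" of an input: here, the network's logit.
    return net(x).squeeze()

bins = torch.linspace(-5.0, 5.0, steps=51)   # bin edges over the logit axis
log_g = torch.zeros(len(bins) - 1)           # log density of states per bin
hist = torch.zeros(len(bins) - 1)            # visit histogram per bin
ln_f = 1.0                                   # Wang-Landau modification factor
step = 0.05                                  # proposal step size

def bin_of(e):
    return int(torch.clamp(torch.bucketize(e, bins) - 1, 0, len(log_g) - 1))

x = torch.randn(16, requires_grad=True)
e = energy(x)
b = bin_of(e.detach())

for stage in range(10):                      # outer Wang-Landau stages
    for _ in range(1000):                    # inner Monte Carlo steps
        # Gradient-informed proposal: move against the energy gradient plus noise.
        # (The Hastings correction for this asymmetric proposal is omitted for brevity.)
        grad, = torch.autograd.grad(e, x)
        x_new = (x - step * grad
                 + (2 * step) ** 0.5 * torch.randn_like(x)).detach().requires_grad_(True)
        e_new = energy(x_new)
        b_new = bin_of(e_new.detach())
        # Wang-Landau acceptance: prefer bins with a smaller current g(E) estimate.
        if torch.rand(()) < torch.exp(log_g[b] - log_g[b_new]):
            x, e, b = x_new, e_new, b_new
        else:
            e = energy(x)                    # rebuild the graph for the next gradient
        log_g[b] += ln_f                     # update the DOS estimate and histogram
        hist[b] += 1.0
    # Crude flatness check: once visited bins are roughly even, shrink ln_f.
    visited = hist[hist > 0]
    if len(visited) > 0 and visited.min() > 0.8 * visited.mean():
        ln_f *= 0.5
        hist.zero_()

print("estimated log-density of states over logit bins:\n", log_g)
```

In the paper's setting the inputs are discrete images, so the proposal would act on pixel values (a gradient-informed discrete proposal) rather than a continuous Langevin step; the Wang-Landau bookkeeping (log g updates, histogram flatness, shrinking the modification factor) is the part this sketch aims to convey.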
Related papers
- Sampling-based Distributed Training with Message Passing Neural Network [1.1088875073103417]
We introduce a domain-decomposition-based distributed training and inference approach for message-passing neural networks (MPNN).
We present a scalable graph neural network, referred to as DS-MPNN (D and S standing for distributed and sampled), capable of scaling up to $O(10^5)$ nodes.
arXiv Detail & Related papers (2024-02-23T05:33:43Z) - Sampling weights of deep neural networks [1.2370077627846041]
We introduce a probability distribution, combined with an efficient sampling algorithm, for weights and biases of fully-connected neural networks.
In a supervised learning context, no iterative optimization or gradient computations of internal network parameters are needed.
We prove that sampled networks are universal approximators.
arXiv Detail & Related papers (2023-06-29T10:13:36Z) - Approximate Thompson Sampling via Epistemic Neural Networks [26.872304174606278]
Epistemic neural networks (ENNs) are designed to produce accurate joint predictive distributions.
We show that ENNs serve this purpose well and illustrate how the quality of joint predictive distributions drives performance.
arXiv Detail & Related papers (2023-02-18T01:58:15Z) - Joint Edge-Model Sparse Learning is Provably Efficient for Graph Neural
Networks [89.28881869440433]
This paper provides the first theoretical characterization of joint edge-model sparse learning for graph neural networks (GNNs).
It proves analytically that both sampling important nodes and pruning the lowest-magnitude neurons reduce the sample complexity and improve convergence without compromising test accuracy.
arXiv Detail & Related papers (2023-02-06T16:54:20Z) - Analyzing the Effect of Sampling in GNNs on Individual Fairness [79.28449844690566]
Graph neural network (GNN) based methods have saturated the field of recommender systems.
We extend an existing method for promoting individual fairness on graphs to support mini-batch, or sub-sample based, training of a GNN.
We show that mini-batch training facilitates individual fairness promotion by allowing local nuance to guide the fairness-promotion process in representation learning.
arXiv Detail & Related papers (2022-09-08T16:20:25Z) - On the Effective Number of Linear Regions in Shallow Univariate ReLU
Networks: Convergence Guarantees and Implicit Bias [50.84569563188485]
We show that gradient flow converges in direction when labels are determined by the sign of a target network with $r$ neurons.
Our result may already hold for mild over-parameterization, where the width is $\tilde{\mathcal{O}}(r)$ and independent of the sample size.
arXiv Detail & Related papers (2022-05-18T16:57:10Z) - Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity
on Pruned Neural Networks [79.74580058178594]
We analyze the performance of training a pruned neural network by analyzing the geometric structure of the objective function.
We show that the convex region near a desirable model with guaranteed generalization enlarges as the neural network model is pruned.
arXiv Detail & Related papers (2021-10-12T01:11:07Z) - A Biased Graph Neural Network Sampler with Near-Optimal Regret [57.70126763759996]
Graph neural networks (GNN) have emerged as a vehicle for applying deep network architectures to graph and relational data.
In this paper, we build upon existing work and treat GNN neighbor sampling as a multi-armed bandit problem.
We introduce a newly-designed reward function that introduces some degree of bias designed to reduce variance and avoid unstable, possibly-unbounded payouts.
arXiv Detail & Related papers (2021-03-01T15:55:58Z) - pseudo-Bayesian Neural Networks for detecting Out of Distribution Inputs [12.429095025814345]
We propose pseudo-BNNs where instead of learning distributions over weights, we use point estimates and perturb weights at the time of inference.
Overall, this combination results in a principled technique to detect OOD samples at the time of inference.
arXiv Detail & Related papers (2021-02-02T06:23:04Z) - Bandit Samplers for Training Graph Neural Networks [63.17765191700203]
Several sampling algorithms with variance reduction have been proposed for accelerating the training of Graph Convolution Networks (GCNs).
These sampling algorithms are not applicable to more general graph neural networks (GNNs) where the message aggregator contains learned weights rather than fixed weights, such as Graph Attention Networks (GAT).
arXiv Detail & Related papers (2020-06-10T12:48:37Z) - Learning to Importance Sample in Primary Sample Space [22.98252856114423]
We propose a novel importance sampling technique that uses a neural network to learn how to sample from a desired density represented by a set of samples.
We show that our approach leads to effective variance reduction in several practical scenarios.
arXiv Detail & Related papers (2018-08-23T16:55:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.