Optimal Connectivity through Network Gradients for the Restricted
Boltzmann Machine
- URL: http://arxiv.org/abs/2209.06932v1
- Date: Wed, 14 Sep 2022 21:09:58 GMT
- Title: Optimal Connectivity through Network Gradients for the Restricted
Boltzmann Machine
- Authors: A. C. N. de Oliveira and D. R. Figueiredo
- Abstract summary: A fundamental problem is efficiently finding connectivity patterns that improve the learning curve.
Recent approaches explicitly include network connections as parameters that must be optimized in the model.
This work presents a method to find optimal connectivity patterns for RBMs based on the idea of network gradients.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Leveraging sparse networks to connect successive layers in deep neural
networks has recently been shown to provide benefits to large scale
state-of-the-art models. However, network connectivity also plays a significant
role in the learning curves of shallow networks, such as the classic Restricted
Boltzmann Machine (RBM). A fundamental problem is efficiently finding
connectivity patterns that improve the learning curve. Recent principled
approaches explicitly include network connections as parameters that must be
optimized in the model, but often rely on continuous functions to represent
connections and on explicit penalization. This work presents a method to find
optimal connectivity patterns for RBMs based on the idea of network gradients:
computing the gradient of every possible connection, given a specific
connection pattern, and using the gradient to drive a continuous connection
strength parameter that in turn is used to determine the connection pattern.
Thus, learning RBM parameters and learning network connections are truly
performed jointly, albeit with different learning rates, and without changes to
the objective function. The method is applied to the MNIST dataset, showing that
better RBM models are found for the benchmark tasks of sample generation and
input classification.
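The abstract describes the mechanism only in prose. As a rough illustration, here is a minimal NumPy sketch of the network-gradient idea under stated assumptions: a Bernoulli-Bernoulli RBM trained with CD-1, a top-k rule for turning connection strengths into a binary pattern, and hyperparameter names (`lr_w`, `lr_s`, `density`) that are illustrative rather than taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

n_vis, n_hid = 784, 64     # e.g. MNIST pixels as visible units
lr_w, lr_s = 0.05, 0.005   # separate learning rates for weights and strengths
density = 0.2              # fraction of connections kept active (assumed rule)

W = rng.normal(0, 0.01, (n_vis, n_hid))  # RBM weights
S = rng.normal(0, 0.01, (n_vis, n_hid))  # continuous connection strengths
b_v, b_h = np.zeros(n_vis), np.zeros(n_hid)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mask_from_strengths(S, density):
    # Keep the connections with the largest strengths (illustrative rule).
    k = int(density * S.size)
    thresh = np.partition(S.ravel(), -k)[-k]
    return (S >= thresh).astype(float)

def cd1_network_gradient(v0, M):
    # One CD-1 pass using the masked weights; returns the gradient of
    # EVERY possible connection, evaluated at the current pattern M.
    Wm = M * W
    ph0 = sigmoid(v0 @ Wm + b_h)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    pv1 = sigmoid(h0 @ Wm.T + b_v)
    ph1 = sigmoid(pv1 @ Wm + b_h)
    return (v0.T @ ph0 - pv1.T @ ph1) / len(v0)

# One joint update: weights and connectivity learn from the same gradient.
v0 = (rng.random((32, n_vis)) < 0.5).astype(float)  # stand-in minibatch
M = mask_from_strengths(S, density)
grad = cd1_network_gradient(v0, M)
W += lr_w * grad * M   # active connections follow the usual CD update
S += lr_s * grad       # all strengths follow the full network gradient
```

The point mirrored from the abstract is that the gradient is computed for every possible connection given the current pattern, so inactive connections can accumulate strength and re-enter the pattern, while active weights follow the usual CD update at their own learning rate.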
Related papers
- Leveraging Low-Rank and Sparse Recurrent Connectivity for Robust
Closed-Loop Control [63.310780486820796]
We show how a parameterization of recurrent connectivity influences robustness in closed-loop settings.
We find that closed-form continuous-time neural networks (CfCs) with fewer parameters can outperform their full-rank, fully-connected counterparts.
arXiv Detail & Related papers (2023-10-05T21:44:18Z)
- Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis [94.64007376939735]
We theoretically characterize the impact of connectivity patterns on the convergence of deep neural networks (DNNs) under gradient descent training.
We show that by a simple filtration on "unpromising" connectivity patterns, we can trim down the number of models to evaluate.
arXiv Detail & Related papers (2022-05-11T17:43:54Z)
- Packet Routing with Graph Attention Multi-agent Reinforcement Learning [4.78921052969006]
We develop a model-free and data-driven routing strategy by leveraging reinforcement learning (RL).
Considering the graph nature of the network topology, we design a multi-agent RL framework in combination with a Graph Neural Network (GNN).
arXiv Detail & Related papers (2021-07-28T06:20:34Z)
- Mutually exciting point process graphs for modelling dynamic networks [0.0]
A new class of models for dynamic networks is proposed, called mutually exciting point process graphs (MEG).
MEG is a scalable network-wide statistical model for point processes with dyadic marks, which can be used for anomaly detection.
The model is tested on simulated graphs and real world computer network datasets, demonstrating excellent performance.
arXiv Detail & Related papers (2021-02-11T10:14:55Z)
- Attentional Local Contrast Networks for Infrared Small Target Detection [15.882749652217653]
We propose a novel model-driven deep network for infrared small target detection.
We modularize a conventional local contrast measure method as a depth-wise parameterless nonlinear feature refinement layer in an end-to-end network.
We conduct detailed ablation studies with varying network depths to empirically verify the effectiveness and efficiency of each component in our network architecture.
arXiv Detail & Related papers (2020-12-15T19:33:09Z)
- DAIS: Automatic Channel Pruning via Differentiable Annealing Indicator Search [55.164053971213576]
Convolutional neural networks have achieved great success in computer vision tasks, despite large computation overhead.
Structured (channel) pruning is usually applied to reduce the model redundancy while preserving the network structure.
Existing structured pruning methods require hand-crafted rules, which may lead to a tremendous pruning space.
arXiv Detail & Related papers (2020-11-04T07:43:01Z)
- Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective that represents a network as a complete graph for analysis.
By assigning learnable parameters to the edges to reflect the magnitude of connections, the learning process can be performed in a differentiable manner (a toy sketch of this idea appears after this list).
This learning process is compatible with existing networks and adapts to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
- From Boltzmann Machines to Neural Networks and Back Again [31.613544605376624]
We give new results for learning Restricted Boltzmann Machines, probably the most well-studied class of latent variable models.
Our results are based on new connections to learning two-layer neural networks under $\ell_\infty$-bounded input.
We then give an algorithm for learning a natural class of supervised RBMs with better runtime than what is possible for its related class of networks without distributional assumptions.
arXiv Detail & Related papers (2020-07-25T00:42:50Z)
- Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z)
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
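As referenced in the entry on Learning Connectivity of Neural Networks from a Topological Perspective above, here is a toy PyTorch sketch of the learnable-edge idea. The class name `GatedDAG`, the sigmoid gating, and the linear node operations are illustrative assumptions, not that paper's exact formulation.

```python
import torch
import torch.nn as nn

class GatedDAG(nn.Module):
    """Toy complete-graph network: every edge (i -> j, i < j) carries a
    learnable gate, so connectivity is trained by ordinary backprop."""

    def __init__(self, n_nodes=4, dim=16):
        super().__init__()
        self.n = n_nodes
        self.ops = nn.ModuleList([nn.Linear(dim, dim) for _ in range(n_nodes)])
        # One learnable logit per directed edge of the complete DAG.
        self.edge_logits = nn.Parameter(torch.zeros(n_nodes, n_nodes))

    def forward(self, x):
        states = [self.ops[0](x)]
        for j in range(1, self.n):
            # Each node aggregates all predecessors, scaled by its edge gates.
            agg = sum(torch.sigmoid(self.edge_logits[i, j]) * states[i]
                      for i in range(j))
            states.append(self.ops[j](torch.relu(agg)))
        return states[-1]

model = GatedDAG()
loss = model(torch.randn(8, 16)).sum()
loss.backward()  # gradients reach edge_logits: connectivity is learned jointly
```

Because the gates are ordinary parameters, a single backward pass trains the node operations and the connectivity jointly, which is the same joint-learning theme the main paper pursues for RBMs.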