Optimal Connectivity through Network Gradients for the Restricted
Boltzmann Machine
- URL: http://arxiv.org/abs/2209.06932v1
- Date: Wed, 14 Sep 2022 21:09:58 GMT
- Title: Optimal Connectivity through Network Gradients for the Restricted
Boltzmann Machine
- Authors: A. C. N. de Oliveira and D. R. Figueiredo
- Abstract summary: A fundamental problem is efficiently finding connectivity patterns that improve the learning curve.
Recent approaches explicitly include network connections as parameters that must be optimized in the model.
This work presents a method to find optimal connectivity patterns for RBMs based on the idea of network gradients.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Leveraging sparse networks to connect successive layers in deep neural
networks has recently been shown to provide benefits to large scale
state-of-the-art models. However, network connectivity also plays a significant
role in the learning curves of shallow networks, such as the classic Restricted
Boltzmann Machine (RBM). A fundamental problem is efficiently finding
connectivity patterns that improve the learning curve. Recent principled
approaches explicitly include network connections as parameters that must be
optimized in the model, but often rely on continuous functions to represent
connections and on explicit penalization. This work presents a method to find
optimal connectivity patterns for RBMs based on the idea of network gradients:
computing the gradient of every possible connection, given a specific
connection pattern, and using the gradient to drive a continuous connection
strength parameter that in turn is used to determine the connection pattern.
Thus, learning RBM parameters and learning network connections are truly
performed jointly, albeit with different learning rates, and without changes to
the objective function. The method is applied to the MNIST dataset, showing that
better RBM models are found for the benchmark tasks of sample generation and
input classification.
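The abstract describes the mechanism only in prose. As a rough illustration, here is a minimal NumPy sketch of the network-gradient idea under stated assumptions: a Bernoulli-Bernoulli RBM trained with CD-1, a top-k rule for turning connection strengths into a binary pattern, and hyperparameter names (`lr_w`, `lr_s`, `density`) that are illustrative rather than taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

n_vis, n_hid = 784, 64     # e.g. MNIST pixels as visible units
lr_w, lr_s = 0.05, 0.005   # separate learning rates for weights and strengths
density = 0.2              # fraction of connections kept active (assumed rule)

W = rng.normal(0, 0.01, (n_vis, n_hid))  # RBM weights
S = rng.normal(0, 0.01, (n_vis, n_hid))  # continuous connection strengths
b_v, b_h = np.zeros(n_vis), np.zeros(n_hid)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mask_from_strengths(S, density):
    # Keep the connections with the largest strengths (illustrative rule).
    k = int(density * S.size)
    thresh = np.partition(S.ravel(), -k)[-k]
    return (S >= thresh).astype(float)

def cd1_network_gradient(v0, M):
    # One CD-1 pass using the masked weights; returns the gradient of
    # EVERY possible connection, evaluated at the current pattern M.
    Wm = M * W
    ph0 = sigmoid(v0 @ Wm + b_h)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    pv1 = sigmoid(h0 @ Wm.T + b_v)
    ph1 = sigmoid(pv1 @ Wm + b_h)
    return (v0.T @ ph0 - pv1.T @ ph1) / len(v0)

# One joint update: weights and connectivity learn from the same gradient.
v0 = (rng.random((32, n_vis)) < 0.5).astype(float)  # stand-in minibatch
M = mask_from_strengths(S, density)
grad = cd1_network_gradient(v0, M)
W += lr_w * grad * M   # active connections follow the usual CD update
S += lr_s * grad       # all strengths follow the full network gradient
```

The point mirrored from the abstract is that the gradient is computed for every possible connection given the current pattern, so inactive connections can accumulate strength and re-enter the pattern, while active weights follow the usual CD update at their own learning rate.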
Related papers
- Leveraging Low-Rank and Sparse Recurrent Connectivity for Robust
Closed-Loop Control [63.310780486820796]
We show how a parameterization of recurrent connectivity influences robustness in closed-loop settings.
We find that closed-form continuous-time neural networks (CfCs) with fewer parameters can outperform their full-rank, fully-connected counterparts.
arXiv Detail & Related papers (2023-10-05T21:44:18Z)
- Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis [94.64007376939735]
We theoretically characterize the impact of connectivity patterns on the convergence of deep neural networks (DNNs) under gradient descent training.
We show that by a simple filtration on "unpromising" connectivity patterns, we can trim down the number of models to evaluate.
arXiv Detail & Related papers (2022-05-11T17:43:54Z)
- Packet Routing with Graph Attention Multi-agent Reinforcement Learning [4.78921052969006]
We develop a model-free and data-driven routing strategy by leveraging reinforcement learning (RL).
Considering the graph nature of the network topology, we design a multi-agent RL framework in combination with a Graph Neural Network (GNN).
arXiv Detail & Related papers (2021-07-28T06:20:34Z)
- Mutually exciting point process graphs for modelling dynamic networks [0.0]
A new class of models for dynamic networks is proposed, called mutually exciting point process graphs (MEG).
MEG is a scalable network-wide statistical model for point processes with dyadic marks, which can be used for anomaly detection.
The model is tested on simulated graphs and real world computer network datasets, demonstrating excellent performance.
arXiv Detail & Related papers (2021-02-11T10:14:55Z)
- Attentional Local Contrast Networks for Infrared Small Target Detection [15.882749652217653]
We propose a novel model-driven deep network for infrared small target detection.
We modularize a conventional local contrast measure method as a depth-wise parameterless nonlinear feature refinement layer in an end-to-end network.
We conduct detailed ablation studies with varying network depths to empirically verify the effectiveness and efficiency of each component in our network architecture.
arXiv Detail & Related papers (2020-12-15T19:33:09Z)
- DAIS: Automatic Channel Pruning via Differentiable Annealing Indicator Search [55.164053971213576]
Convolutional neural networks have achieved great success in computer vision tasks, despite large computation overhead.
Structured (channel) pruning is usually applied to reduce the model redundancy while preserving the network structure.
Existing structured pruning methods require hand-crafted rules, which may lead to a tremendous pruning space.
arXiv Detail & Related papers (2020-11-04T07:43:01Z)
- Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective that represents a network as a complete graph for analysis.
By assigning learnable parameters to the edges to reflect the magnitude of connections, the learning process can be performed in a differentiable manner (a toy sketch of this idea appears after this list).
This learning process is compatible with existing networks and adapts to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
- From Boltzmann Machines to Neural Networks and Back Again [31.613544605376624]
We give new results for learning Restricted Boltzmann Machines, probably the most well-studied class of latent variable models.
Our results are based on new connections to learning two-layer neural networks under $\ell_\infty$-bounded input.
We then give an algorithm for learning a natural class of supervised RBMs with better runtime than what is possible for its related class of networks without distributional assumptions.
arXiv Detail & Related papers (2020-07-25T00:42:50Z)
- Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z)
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
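As referenced in the entry on Learning Connectivity of Neural Networks from a Topological Perspective above, here is a toy PyTorch sketch of the learnable-edge idea. The class name `GatedDAG`, the sigmoid gating, and the linear node operations are illustrative assumptions, not that paper's exact formulation.

```python
import torch
import torch.nn as nn

class GatedDAG(nn.Module):
    """Toy complete-graph network: every edge (i -> j, i < j) carries a
    learnable gate, so connectivity is trained by ordinary backprop."""

    def __init__(self, n_nodes=4, dim=16):
        super().__init__()
        self.n = n_nodes
        self.ops = nn.ModuleList([nn.Linear(dim, dim) for _ in range(n_nodes)])
        # One learnable logit per directed edge of the complete DAG.
        self.edge_logits = nn.Parameter(torch.zeros(n_nodes, n_nodes))

    def forward(self, x):
        states = [self.ops[0](x)]
        for j in range(1, self.n):
            # Each node aggregates all predecessors, scaled by its edge gates.
            agg = sum(torch.sigmoid(self.edge_logits[i, j]) * states[i]
                      for i in range(j))
            states.append(self.ops[j](torch.relu(agg)))
        return states[-1]

model = GatedDAG()
loss = model(torch.randn(8, 16)).sum()
loss.backward()  # gradients reach edge_logits: connectivity is learned jointly
```

Because the gates are ordinary parameters, a single backward pass trains the node operations and the connectivity jointly, which is the same joint-learning theme the main paper pursues for RBMs.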