Related papers: Optimizing Connectivity through Network Gradients for Restricted Boltzmann Machines

URL: http://arxiv.org/abs/2209.06932v4
Date: Fri, 30 May 2025 00:41:16 GMT
Title: Optimizing Connectivity through Network Gradients for Restricted Boltzmann Machines
Authors: A. C. N. de Oliveira, D. R. Figueiredo
Abstract summary: Network connectivity plays a significant role in the learning performance of shallow networks. This work presents an optimization method to find optimal connectivity patterns for RBMs.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Leveraging sparse networks to connect successive layers in deep neural networks has recently been shown to provide benefits to large-scale state-of-the-art models. However, network connectivity also plays a significant role in the learning performance of shallow networks, such as the classic Restricted Boltzmann Machine (RBM). Efficiently finding sparse connectivity patterns that improve the learning performance of shallow networks is a fundamental problem. While recent principled approaches explicitly include network connections as model parameters that must be optimized, they often rely on explicit penalization or network sparsity as a hyperparameter. This work presents the Network Connectivity Gradients (NCG), an optimization method to find optimal connectivity patterns for RBMs. NCG leverages the idea of network gradients: given a specific connection pattern, it determines the gradient of every possible connection and uses the gradient to drive a continuous connection strength parameter that in turn is used to determine the connection pattern. Thus, learning RBM parameters and learning network connections are truly performed jointly, albeit with different learning rates, and without changes to the model's classic energy-based objective function. The proposed method is applied to MNIST and other data sets, showing that better RBM models are found for the benchmark tasks of sample generation and classification. Results also show that NCG is robust to network initialization and is capable of both adding and removing network connections while learning.
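
The abstract describes the mechanism only in words, so a small sketch may help make it concrete. The code below is not the authors' implementation. It is a minimal illustration under several assumptions that the abstract does not confirm: the connection pattern is obtained by thresholding the strength parameter at zero, the gradient is estimated with one step of contrastive divergence (CD-1), and the gradient of a connection is taken as the CD statistic times the current weight (chain rule through the effective weight `W * A`). All names (`connection_pattern`, `ncg_style_step`, `lr_w`, `lr_s`) are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def connection_pattern(S, threshold=0.0):
    # Assumption: a connection is active when its continuous strength exceeds the threshold.
    return (S > threshold).astype(float)

def ncg_style_step(v0, W, S, b, c, rng, lr_w=0.1, lr_s=0.01):
    """One illustrative joint update of weights W and connection strengths S."""
    A = connection_pattern(S)          # current binary connection pattern
    Wm = W * A                         # effective weights: only active connections count

    # CD-1: positive phase, one Gibbs step, negative phase.
    ph0 = sigmoid(v0 @ Wm + c)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    pv1 = sigmoid(h0 @ Wm.T + b)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ Wm + c)

    # CD statistic for every possible visible-hidden pair, active or not.
    G = (v0.T @ ph0 - v1.T @ ph1) / v0.shape[0]

    W_new = W + lr_w * G * A           # weights move only on active connections
    S_new = S + lr_s * G * W           # assumed connection gradient drives the strength
    b_new = b + lr_w * (v0 - v1).mean(axis=0)
    c_new = c + lr_w * (ph0 - ph1).mean(axis=0)
    return W_new, S_new, b_new, c_new

# Toy usage on random binary data (hypothetical sizes, not from the paper).
rng = np.random.default_rng(0)
n_visible, n_hidden = 6, 4
data = (rng.random((32, n_visible)) < 0.5).astype(float)
W = 0.1 * rng.standard_normal((n_visible, n_hidden))
S = rng.standard_normal((n_visible, n_hidden))
b = np.zeros(n_visible)
c = np.zeros(n_hidden)

A_before = connection_pattern(S)
for _ in range(200):
    W, S, b, c = ncg_style_step(data, W, S, b, c, rng)
A_after = connection_pattern(S)
print("active connections before:", int(A_before.sum()), "after:", int(A_after.sum()))
```

Because the strength update uses the statistic for every visible-hidden pair, an inactive connection can cross the threshold and become active, and an active one can drop out. The two learning rates are kept separate, echoing the abstract's remark that parameters and connections are learned jointly but at different rates.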

Related papers

ANCRe: Adaptive Neural Connection Reassignment for Efficient Depth Scaling [57.91760520589592]
Scaling network depth has been a central driver behind the success of modern foundation models. This paper revisits the default mechanism for deepening neural networks, namely residual connections. We introduce adaptive neural connection reassignment (ANCRe), a principled and lightweight framework that parameterizes and learns residual connectivities from the data.
arXiv Detail & Related papers (2026-02-09T18:54:18Z)
Network Sparsity Unlocks the Scaling Potential of Deep Reinforcement Learning [57.3885832382455]
We show that introducing static network sparsity alone can unlock further scaling potential beyond dense counterparts with state-of-the-art architectures. Our analysis reveals that, in contrast to naively scaling up dense DRL networks, such sparse networks achieve higher parameter efficiency for network expressivity.
arXiv Detail & Related papers (2025-06-20T17:54:24Z)
Lattice-Based Pruning in Recurrent Neural Networks via Poset Modeling [0.0]
Recurrent neural networks (RNNs) are central to sequence modeling tasks, yet their high computational complexity poses challenges for scalability and real-time deployment. We introduce a novel framework that models RNNs as partially ordered sets (posets) and constructs corresponding dependency lattices. By identifying meet irreducible neurons, our lattice-based pruning algorithm selectively retains critical connections while eliminating redundant ones.
arXiv Detail & Related papers (2025-02-23T10:11:38Z)
Ghost-Connect Net: A Generalization-Enhanced Guidance For Sparse Deep Networks Under Distribution Shifts [5.524804393257921]
We introduce Ghost Connect-Net (GC-Net) to monitor the connections in the original network with distribution generalization advantage. After pruning GC-Net, the pruned locations are mapped back to the original network as pruned connections. We provide theoretical foundations for GC-Net's approach to improving generalization under distribution shifts.
arXiv Detail & Related papers (2024-11-14T05:43:42Z)
Learning Load Balancing with GNN in MPTCP-Enabled Heterogeneous Networks [13.178956651532213]
We propose a graph neural network (GNN)-based model to tackle the LB problem for MPTCP-enabled HetNets. Compared to the conventional deep neural network (DNN), the proposed GNN-based model exhibits two key strengths.
arXiv Detail & Related papers (2024-10-22T15:49:53Z)
Leveraging Low-Rank and Sparse Recurrent Connectivity for Robust Closed-Loop Control [63.310780486820796]
We show how a parameterization of recurrent connectivity influences robustness in closed-loop settings. We find that closed-form continuous-time neural networks (CfCs) with fewer parameters can outperform their full-rank, fully-connected counterparts.
arXiv Detail & Related papers (2023-10-05T21:44:18Z)
Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis [94.64007376939735]
We theoretically characterize the impact of connectivity patterns on the convergence of deep neural networks (DNNs) under gradient descent training. We show that by a simple filtration on "unpromising" connectivity patterns, we can trim down the number of models to evaluate.
arXiv Detail & Related papers (2022-05-11T17:43:54Z)
Packet Routing with Graph Attention Multi-agent Reinforcement Learning [4.78921052969006]
We develop a model-free and data-driven routing strategy by leveraging reinforcement learning (RL). Considering the graph nature of the network topology, we design a multi-agent RL framework in combination with a Graph Neural Network (GNN).
arXiv Detail & Related papers (2021-07-28T06:20:34Z)
Mutually exciting point process graphs for modelling dynamic networks [0.0]
A new class of models for dynamic networks is proposed, called mutually exciting point process graphs (MEG). MEG is a scalable network-wide statistical model for point processes with dyadic marks, which can be used for anomaly detection. The model is tested on simulated graphs and real world computer network datasets, demonstrating excellent performance.
arXiv Detail & Related papers (2021-02-11T10:14:55Z)
Attentional Local Contrast Networks for Infrared Small Target Detection [15.882749652217653]
We propose a novel model-driven deep network for infrared small target detection. We modularize a conventional local contrast measure method as a depth-wise parameterless nonlinear feature refinement layer in an end-to-end network. We conduct detailed ablation studies with varying network depths to empirically verify the effectiveness and efficiency of each component in our network architecture.
arXiv Detail & Related papers (2020-12-15T19:33:09Z)
DAIS: Automatic Channel Pruning via Differentiable Annealing Indicator Search [55.164053971213576]
The convolutional neural network has achieved great success in fulfilling computer vision tasks despite large computation overhead. Structured (channel) pruning is usually applied to reduce the model redundancy while preserving the network structure. Existing structured pruning methods require hand-crafted rules which may lead to tremendous pruning space.
arXiv Detail & Related papers (2020-11-04T07:43:01Z)
Dynamic Graph: Learning Instance-aware Connectivity for Neural Networks [78.65792427542672]
Dynamic Graph Network (DG-Net) is a complete directed acyclic graph, where the nodes represent convolutional blocks and the edges represent connection paths. Instead of using the same path of the network, DG-Net aggregates features dynamically in each node, which allows the network to have more representation ability.
arXiv Detail & Related papers (2020-10-02T16:50:26Z)
HALO: Learning to Prune Neural Networks with Shrinkage [5.283963846188862]
Deep neural networks achieve state-of-the-art performance in a variety of tasks by extracting a rich set of features from unstructured data. Modern techniques for inducing sparsity and reducing model size are (1) network pruning, (2) training with a sparsity inducing penalty, and (3) training a binary mask jointly with the weights of the network. We present a novel penalty called Hierarchical Adaptive Lasso which learns to adaptively sparsify weights of a given network via trainable parameters.
arXiv Detail & Related papers (2020-08-24T04:08:48Z)
Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective to represent a network into a complete graph for analysis. By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner. This learning process is compatible with existing networks and owns adaptability to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
From Boltzmann Machines to Neural Networks and Back Again [31.613544605376624]
We give new results for learning Restricted Boltzmann Machines, probably the most well-studied class of latent variable models. Our results are based on new connections to learning two-layer neural networks under $\ell_\infty$ bounded input. We then give an algorithm for learning a natural class of supervised RBMs with better runtime than what is possible for its related class of networks without distributional assumptions.
arXiv Detail & Related papers (2020-07-25T00:42:50Z)
Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs. Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z)
Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks. We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.