Related papers: Local Control Networks (LCNs): Optimizing Flexibility in Neural Network Data Pattern Capture

URL: http://arxiv.org/abs/2501.14000v1
Date: Thu, 23 Jan 2025 11:34:25 GMT
Title: Local Control Networks (LCNs): Optimizing Flexibility in Neural Network Data Pattern Capture
Authors: Hy Nguyen, Duy Khoa Pham, Srikanth Thudumu, Hung Du, Rajesh Vasa, Kon Mouzakis
Abstract summary: We argue that employing the same activation function at every node is suboptimal and propose leveraging different activation functions at each node to increase flexibility and adaptability. To achieve this, we introduce Local Control Networks (LCNs), which leverage B-spline functions to enable distinct activation curves at each node. Our findings suggest that diverse activations at the node level can lead to improved performance and efficiency.
Score: 0.922664966526494
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The widespread use of multi-layer perceptrons (MLPs) often relies on a fixed activation function (e.g., ReLU, Sigmoid, Tanh) for all nodes within the hidden layers. While effective in many scenarios, this uniformity may limit the network's ability to capture complex data patterns. We argue that employing the same activation function at every node is suboptimal and propose leveraging different activation functions at each node to increase flexibility and adaptability. To achieve this, we introduce Local Control Networks (LCNs), which leverage B-spline functions to enable distinct activation curves at each node. Our mathematical analysis demonstrates the properties and benefits of LCNs over conventional MLPs. In addition, we demonstrate that more complex architectures, such as Kolmogorov-Arnold Networks (KANs), are unnecessary in certain scenarios, and LCNs can be a more efficient alternative. Empirical experiments on various benchmarks and datasets validate our theoretical findings. In computer vision tasks, LCNs achieve marginal improvements over MLPs and outperform KANs by approximately 5%, while also being more computationally efficient than KANs. In basic machine learning tasks, LCNs show a 1% improvement over MLPs and a 0.6% improvement over KANs. For symbolic formula representation tasks, LCNs perform on par with KANs, with both architectures outperforming MLPs. Our findings suggest that diverse activations at the node level can lead to improved performance and efficiency.

Related papers

KAN or MLP? Point Cloud Shows the Way Forward [13.669234791655075]
We propose PointKAN, which applies Kolmogorov-Arnold Networks (KANs) to point cloud analysis tasks. We show that PointKAN outperforms PointMLP on benchmark datasets such as ModelNet40, ScanObjectNN, and ShapeNetPart. This work highlights the potential of KAN-based architectures in 3D vision and opens new avenues for research in point cloud understanding.
arXiv Detail & Related papers (2025-04-18T09:52:22Z)
AF-KAN: Activation Function-Based Kolmogorov-Arnold Networks for Efficient Representation Learning [4.843466576537832]
Kolmogorov-Arnold Networks (KANs) have inspired numerous works exploring their applications across a wide range of scientific problems. We introduce Activation Function-Based Kolmogorov-Arnold Networks (AF-KAN), expanding ReLU-KAN with various activations and their function combinations. This novel KAN also incorporates parameter reduction methods, primarily attention mechanisms and data normalization, to enhance performance on image classification datasets.
arXiv Detail & Related papers (2025-03-08T07:38:51Z)
Scalable spectral representations for multi-agent reinforcement learning in network MDPs [13.782868855372774]
A popular model for multi-agent control, Network Markov Decision Processes (MDPs) pose a significant challenge to efficient learning. We first derive scalable spectral local representations for network MDPs, which induce a network linear subspace for the local $Q$-function of each agent. We design a scalable algorithmic framework for continuous state-action network MDPs, and provide end-to-end guarantees for the convergence of our algorithm.
arXiv Detail & Related papers (2024-10-22T17:45:45Z)
Activation Space Selectable Kolmogorov-Arnold Networks [29.450377034478933]
Kolmogorov-Arnold Network (KAN), based on nonlinear additive connections, has been proven to achieve performance comparable to MLP-based methods. Despite this potential, the use of a single activation function space results in reduced performance of KAN and related works across different tasks. This work contributes to the understanding of the data-centric design of new AI and provides a foundational reference for innovations in KAN-based network architectures.
arXiv Detail & Related papers (2024-08-15T11:34:05Z)
Fully Spiking Actor Network with Intra-layer Connections for Reinforcement Learning [51.386945803485084]
We focus on the task where the agent needs to learn multi-dimensional deterministic policies to control. Most existing spike-based RL methods take the firing rate as the output of SNNs, and convert it to represent continuous action space (i.e., the deterministic policy) through a fully-connected layer. To develop a fully spiking actor network without any floating-point matrix operations, we draw inspiration from the non-spiking interneurons found in insects.
arXiv Detail & Related papers (2024-01-09T07:31:34Z)
LinGCN: Structural Linearized Graph Convolutional Network for Homomorphically Encrypted Inference [19.5669231249754]
We present LinGCN, a framework designed to reduce multiplication depth and optimize the performance of HE-based GCN inference. Remarkably, LinGCN achieves a 14.2x latency speedup relative to CryptoGCN, while preserving an inference accuracy of 75% and notably reducing multiplication depth.
arXiv Detail & Related papers (2023-09-25T17:56:54Z)
Efficient Deep Spiking Multi-Layer Perceptrons with Multiplication-Free Inference [13.924924047051782]
Deep convolution architectures for Spiking Neural Networks (SNNs) have significantly enhanced image classification performance and reduced computational burdens. This research explores a new pathway, drawing inspiration from the progress made in Multi-Layer Perceptrons (MLPs). We propose an innovative spiking architecture that uses batch normalization to retain multiplication-free inference (MFI) compatibility. We establish an efficient multi-stage spiking network that effectively blends global receptive fields with local feature extraction.
arXiv Detail & Related papers (2023-06-21T16:52:20Z)
Towards Energy-Efficient, Low-Latency and Accurate Spiking LSTMs [1.7969777786551424]
Spiking Neural Networks (SNNs) have emerged as an attractive spatio-temporal computing paradigm for complex vision tasks. We propose an optimized spiking long short-term memory (LSTM) network training framework that involves a novel ANN-to-SNN conversion framework, followed by SNN training. We evaluate our framework on sequential learning tasks including the temporal MNIST, Google Speech Commands (GSC), and UCI Smartphone datasets on different LSTM architectures.
arXiv Detail & Related papers (2022-10-23T04:10:27Z)
Metric Residual Networks for Sample Efficient Goal-conditioned Reinforcement Learning [52.59242013527014]
Goal-conditioned reinforcement learning (GCRL) has a wide range of potential real-world applications. Sample efficiency is of utmost importance for GCRL since, by default, the agent is only rewarded when it reaches its goal. We introduce a novel neural architecture for GCRL that achieves significantly better sample efficiency than the commonly-used monolithic network architecture.
arXiv Detail & Related papers (2022-08-17T08:04:41Z)
On Feature Learning in Neural Networks with Global Convergence Guarantees [49.870593940818715]
We study the optimization of wide neural networks (NNs) via gradient flow (GF). We show that when the input dimension is no less than the size of the training set, the training loss converges to zero at a linear rate under GF. We also show empirically that, unlike in the Neural Tangent Kernel (NTK) regime, our multi-layer model exhibits feature learning and can achieve better generalization performance than its NTK counterpart.
arXiv Detail & Related papers (2022-04-22T15:56:43Z)
Edge Rewiring Goes Neural: Boosting Network Resilience via Policy Gradient [62.660451283548724]
ResiNet is a reinforcement learning framework to discover resilient network topologies against various disasters and attacks. We show that ResiNet achieves a near-optimal resilience gain on multiple graphs while balancing the utility, with a large margin compared to existing approaches.
arXiv Detail & Related papers (2021-10-18T06:14:28Z)
Towards Efficient Graph Convolutional Networks for Point Cloud Handling [181.59146413326056]
We aim at improving the computational efficiency of graph convolutional networks (GCNs) for learning on point clouds. A series of experiments show that optimized networks have reduced computational complexity, decreased memory consumption, and accelerated inference speed.
arXiv Detail & Related papers (2021-04-12T17:59:16Z)
Convolutional Networks with Dense Connectivity [59.30634544498946]
We introduce the Dense Convolutional Network (DenseNet), which connects each layer to every other layer in a feed-forward fashion. For each layer, the feature-maps of all preceding layers are used as inputs, and its own feature-maps are used as inputs into all subsequent layers. We evaluate our proposed architecture on four highly competitive object recognition benchmark tasks.
arXiv Detail & Related papers (2020-01-08T06:54:53Z)

This list is automatically generated from the titles and abstracts of the papers on this site.