R2DN: Scalable Parameterization of Contracting and Lipschitz Recurrent Deep Networks
- URL: http://arxiv.org/abs/2504.01250v1
- Date: Tue, 01 Apr 2025 23:37:43 GMT
- Title: R2DN: Scalable Parameterization of Contracting and Lipschitz Recurrent Deep Networks
- Authors: Nicholas H. Barbara, Ruigang Wang, Ian R. Manchester
- Abstract summary: The Robust Recurrent Deep Network (R2DN) is a scalable parameterization of robust recurrent neural networks for machine learning and data-driven control. We construct R2DNs as a feedback interconnection of a linear time-invariant system and a 1-Lipschitz deep feedforward network. We find that training and inference are both up to an order of magnitude faster with similar test set performance, and that training/inference times scale more favorably with respect to model expressivity.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents the Robust Recurrent Deep Network (R2DN), a scalable parameterization of robust recurrent neural networks for machine learning and data-driven control. We construct R2DNs as a feedback interconnection of a linear time-invariant system and a 1-Lipschitz deep feedforward network, and directly parameterize the weights so that our models are stable (contracting) and robust to small input perturbations (Lipschitz) by design. Our parameterization uses a structure similar to the previously-proposed recurrent equilibrium networks (RENs), but without the requirement to iteratively solve an equilibrium layer at each time-step. This speeds up model evaluation and backpropagation on GPUs, and makes it computationally feasible to scale up the network size, batch size, and input sequence length in comparison to RENs. We compare R2DNs to RENs on three representative problems in nonlinear system identification, observer design, and learning-based feedback control and find that training and inference are both up to an order of magnitude faster with similar test set performance, and that training/inference times scale more favorably with respect to model expressivity.
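To make the architecture concrete, below is a minimal sketch of the feedback interconnection the abstract describes: an LTI state-space system in the loop with a 1-Lipschitz feedforward network. The matrix shapes, the stable-by-construction choice of A, and the spectral-normalization step are illustrative stand-ins, not the paper's direct parameterization.

```python
import numpy as np

def spectral_normalize(W):
    """Scale a weight matrix to spectral norm at most 1 -- one simple way
    to make each layer (and hence the ReLU network) 1-Lipschitz."""
    s = np.linalg.norm(W, 2)            # largest singular value
    return W / max(s, 1.0)              # only shrink, never inflate

class R2DNSketch:
    """Feedback interconnection of an LTI system and a 1-Lipschitz
    feedforward network (illustrative; not the paper's parameterization).

        x[t+1] = A x[t] + B1 w[t] + B2 u[t]
        v[t]   = C1 x[t] + D12 u[t]
        w[t]   = f(v[t])        # f is the 1-Lipschitz deep network
        y[t]   = C2 x[t] + D21 w[t] + D22 u[t]
    """
    def __init__(self, nx, nv, nu, ny, depth=2):
        rng = np.random.default_rng(0)
        self.A = 0.9 * np.eye(nx)       # stand-in stable A; the paper
                                        # enforces contraction by design
        self.B1 = rng.normal(size=(nx, nv)) * 0.1
        self.B2 = rng.normal(size=(nx, nu)) * 0.1
        self.C1 = rng.normal(size=(nv, nx)) * 0.1
        self.D12 = rng.normal(size=(nv, nu)) * 0.1
        self.C2 = rng.normal(size=(ny, nx)) * 0.1
        self.D21 = rng.normal(size=(ny, nv)) * 0.1
        self.D22 = rng.normal(size=(ny, nu)) * 0.1
        # 1-Lipschitz feedforward path: composition of 1-Lipschitz layers.
        self.layers = [spectral_normalize(rng.normal(size=(nv, nv)))
                       for _ in range(depth)]

    def _f(self, v):
        for W in self.layers:
            v = np.maximum(W @ v, 0.0)  # ReLU is 1-Lipschitz
        return v

    def step(self, x, u):
        v = self.C1 @ x + self.D12 @ u
        w = self._f(v)                  # one explicit forward pass
        y = self.C2 @ x + self.D21 @ w + self.D22 @ u
        x_next = self.A @ x + self.B1 @ w + self.B2 @ u
        return x_next, y

model = R2DNSketch(nx=4, nv=8, nu=2, ny=1)
x, y = model.step(np.zeros(4), np.ones(2))
```

The contrast with a REN is visible in `step`: the feedback signal w is produced by a single explicit pass through the network, with no equilibrium layer to solve iteratively at each time-step.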
Related papers
- Parallel Neural Computing for Scene Understanding from LiDAR Perception in Autonomous Racing [0.0]
Traditional sequential network approaches may struggle to meet the real-time knowledge and decision-making demands of an autonomous agent. This paper proposes a novel baseline architecture for developing sophisticated models capable of true hardware-enabled parallelism. The proposed model takes raw 3D point cloud data from the LiDAR sensor as input and converts it into a 2D Bird's Eye View Map on both devices.
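The point-cloud-to-BEV step mentioned above is, at its core, a binning operation; here is a rough sketch with arbitrary grid bounds and resolution (not taken from the paper):

```python
import numpy as np

def pointcloud_to_bev(points, x_range=(0.0, 50.0), y_range=(-25.0, 25.0),
                      cell=0.25):
    """Project raw (N, 3) LiDAR points into a 2D bird's-eye-view height map.
    Bounds and cell size are illustrative, not the paper's settings."""
    nx = int((x_range[1] - x_range[0]) / cell)
    ny = int((y_range[1] - y_range[0]) / cell)
    bev = np.zeros((nx, ny), dtype=np.float32)
    # Keep only points inside the grid.
    m = ((points[:, 0] >= x_range[0]) & (points[:, 0] < x_range[1]) &
         (points[:, 1] >= y_range[0]) & (points[:, 1] < y_range[1]))
    p = points[m]
    ix = ((p[:, 0] - x_range[0]) / cell).astype(int)
    iy = ((p[:, 1] - y_range[0]) / cell).astype(int)
    # Record the maximum point height per cell (empty cells stay 0).
    np.maximum.at(bev, (ix, iy), p[:, 2])
    return bev

bev = pointcloud_to_bev(np.random.default_rng(0).uniform(-30, 60, (1000, 3)))
```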
arXiv Detail & Related papers (2024-12-24T04:56:32Z)
- Learning Load Balancing with GNN in MPTCP-Enabled Heterogeneous Networks [13.178956651532213]
We propose a graph neural network (GNN)-based model to tackle the load-balancing (LB) problem for MPTCP-enabled HetNets.
Compared to the conventional deep neural network (DNN), the proposed GNN-based model exhibits two key strengths.
arXiv Detail & Related papers (2024-10-22T15:49:53Z)
- Enhancing lattice kinetic schemes for fluid dynamics with Lattice-Equivariant Neural Networks [79.16635054977068]
We present a new class of equivariant neural networks, dubbed Lattice-Equivariant Neural Networks (LENNs).
Our approach develops within a recently introduced framework aimed at learning neural network-based surrogate models of Lattice Boltzmann collision operators.
Our work opens the way towards practical use of machine-learning-augmented Lattice Boltzmann CFD in real-world simulations.
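The summary doesn't specify how lattice equivariance is enforced, but the generic recipe for making a layer equivariant to a discrete symmetry group — averaging the layer over the group's actions — can be sketched; the 90-degree rotation group below is a placeholder, not the lattice symmetry used in LENNs.

```python
import numpy as np

def equivariant_apply(f, field):
    """Symmetrize an arbitrary map f over the group of 0/90/180/270-degree
    rotations, so that f_eq(g.x) == g.f_eq(x) for every rotation g."""
    out = np.zeros_like(field)
    for k in range(4):
        # Apply g, run the layer, undo g, and average over the group.
        out += np.rot90(f(np.rot90(field, k)), -k)
    return out / 4.0

# Quick equivariance check with a map that is NOT equivariant on its own.
f = lambda z: np.roll(z * z, 1, axis=0)
x = np.random.default_rng(0).normal(size=(8, 8))
assert np.allclose(equivariant_apply(f, np.rot90(x)),
                   np.rot90(equivariant_apply(f, x)))
```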
arXiv Detail & Related papers (2024-05-22T17:23:15Z)
- Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
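The pipeline the summary describes — flattening many checkpoints into parameter vectors and fitting a generative model to them — looks roughly like the sketch below; the full-covariance Gaussian is a deliberately simple stand-in for the paper's actual generative model.

```python
import numpy as np

def flatten_checkpoint(ckpt):
    """Concatenate all parameter tensors of one checkpoint into one vector."""
    return np.concatenate([p.ravel() for p in ckpt.values()])

# Hypothetical dataset: many checkpoints of one small two-layer network.
rng = np.random.default_rng(0)
checkpoints = [{"w1": rng.normal(size=(4, 8)), "w2": rng.normal(size=(8, 2))}
               for _ in range(100)]
X = np.stack([flatten_checkpoint(c) for c in checkpoints])   # (100, 48)

# Stand-in generative model: a full-covariance Gaussian over parameters.
mu = X.mean(axis=0)
cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])
sampled_params = rng.multivariate_normal(mu, cov)  # one new "network"
```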
arXiv Detail & Related papers (2022-09-26T17:59:58Z)
- SVNet: Where SO(3) Equivariance Meets Binarization on Point Cloud Representation [65.4396959244269]
The paper tackles the challenge of combining SO(3) equivariance with binarization by designing a general framework to construct 3D learning architectures.
The proposed approach can be applied to general backbones like PointNet and DGCNN.
Experiments on ModelNet40, ShapeNet, and the real-world dataset ScanObjectNN demonstrate that the method achieves a strong trade-off between efficiency, rotation robustness, and accuracy.
arXiv Detail & Related papers (2022-09-13T12:12:19Z)
- CS-Rep: Making Speaker Verification Networks Embracing Re-parameterization [27.38202134344989]
This study proposes cross-sequential re-parameterization (CS-Rep) to increase the inference speed and verification accuracy of models.
Rep-TDNN increases the actual inference speed by about 50% and reduces the EER by 10%.
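CS-Rep's exact merging rules aren't given in the summary, but the core idea behind structural re-parameterization — folding training-time structure into a single inference-time layer with mathematically identical outputs — is standard. A sketch with batch-norm folding as the example (an illustration, not CS-Rep's specific transform):

```python
import numpy as np

def fold_bn_into_linear(W, b, gamma, beta, mean, var, eps=1e-5):
    """Merge y = gamma * ((W x + b) - mean) / sqrt(var + eps) + beta
    into a single affine layer y = W' x + b' for faster inference."""
    scale = gamma / np.sqrt(var + eps)      # per-output-channel scale
    return W * scale[:, None], (b - mean) * scale + beta

rng = np.random.default_rng(0)
W, b = rng.normal(size=(16, 32)), rng.normal(size=16)
gamma, beta = rng.normal(size=16), rng.normal(size=16)
mean, var = rng.normal(size=16), rng.uniform(0.5, 2.0, size=16)

Wf, bf = fold_bn_into_linear(W, b, gamma, beta, mean, var)
x = rng.normal(size=32)
slow = gamma * ((W @ x + b) - mean) / np.sqrt(var + 1e-5) + beta
assert np.allclose(Wf @ x + bf, slow)       # same outputs, one layer
```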
arXiv Detail & Related papers (2021-10-26T08:00:03Z)
- The Heterogeneity Hypothesis: Finding Layer-Wise Differentiated Network Architectures [179.66117325866585]
We investigate a design space that is usually overlooked, i.e. adjusting the channel configurations of predefined networks.
We find that this adjustment can be achieved by shrinking widened baseline networks and leads to superior performance.
Experiments are conducted on various networks and datasets for image classification, visual tracking and image restoration.
arXiv Detail & Related papers (2020-06-29T17:59:26Z)
- Compressing deep neural networks on FPGAs to binary and ternary precision with HLS4ML [13.325670094073383]
We present the implementation of binary and ternary neural networks in the hls4ml library.
We discuss the trade-off between model accuracy and resource consumption.
The binary and ternary implementation has similar performance to the higher precision implementation while using drastically fewer FPGA resources.
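The quantization itself is not spelled out in the summary; a common threshold-based ternary scheme (an assumption here, not necessarily the exact rule used with hls4ml) maps weights to {-1, 0, +1} so that FPGA multipliers reduce to sign flips and skips:

```python
import numpy as np

def ternarize(W, t=0.7):
    """Map weights to {-1, 0, +1} using a data-dependent threshold."""
    delta = t * np.mean(np.abs(W))          # threshold from mean magnitude
    return np.where(W > delta, 1, np.where(W < -delta, -1, 0)).astype(np.int8)

def binarize(W):
    """Map weights to {-1, +1}; multiplies become XNOR-style operations."""
    return np.where(W >= 0, 1, -1).astype(np.int8)

W = np.random.default_rng(0).normal(size=(8, 8))
print(ternarize(W))
print(binarize(W))
```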
arXiv Detail & Related papers (2020-03-11T10:46:51Z)
- Toward fast and accurate human pose estimation via soft-gated skip connections [97.06882200076096]
This paper is on highly accurate and highly efficient human pose estimation.
We re-analyze the design of skip connections in the context of improving both the accuracy and the efficiency over the state of the art.
Our model achieves state-of-the-art results on the MPII and LSP datasets.
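The summary leaves the gating mechanism implicit; one common reading of a soft-gated skip connection is a residual branch scaled by a learnable gate alpha, as in the sketch below (the exact gating form is an assumption, not lifted from the paper):

```python
import numpy as np

def soft_gated_skip(x, branch, alpha):
    """Residual connection whose branch is scaled by a learnable gate:
    y = x + alpha * F(x). alpha near 0 silences the branch; a plain
    residual block is the special case alpha = 1."""
    return x + alpha * branch(x)

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8)) * 0.1
branch = lambda z: np.maximum(W @ z, 0.0)   # tiny stand-in block F
y = soft_gated_skip(rng.normal(size=8), branch, alpha=0.3)
```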
arXiv Detail & Related papers (2020-02-25T18:51:51Z)
- Large-Scale Gradient-Free Deep Learning with Recursive Local Representation Alignment [84.57874289554839]
Training deep neural networks on large-scale datasets requires significant hardware resources.
Backpropagation, the workhorse for training these networks, is an inherently sequential process that is difficult to parallelize.
We propose a neuro-biologically-plausible alternative to backprop that can be used to train deep networks.
arXiv Detail & Related papers (2020-02-10T16:20:02Z)
- Volterra Neural Networks (VNNs) [24.12314339259243]
We propose a Volterra filter-inspired Network architecture to reduce the complexity of Convolutional Neural Networks.
We show an efficient parallel implementation of this Volterra Neural Network (VNN) along with its remarkable performance.
The proposed approach is evaluated on the UCF-101 and HMDB-51 datasets for action recognition, and is shown to outperform state-of-the-art CNN approaches.
arXiv Detail & Related papers (2019-10-21T19:22:38Z)
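A Volterra-style layer models the response as a truncated polynomial series in the input, so nonlinearity comes from explicit input interactions rather than activation functions; a second-order sketch (dimensions and truncation order arbitrary):

```python
import numpy as np

def volterra2(x, b, W1, W2):
    """Second-order truncated Volterra expansion:
    y_i = b_i + sum_j W1[i,j] x_j + sum_{j,k} W2[i,j,k] x_j x_k.
    The quadratic term captures pairwise input interactions."""
    return b + W1 @ x + np.einsum("ijk,j,k->i", W2, x, x)

rng = np.random.default_rng(0)
n_in, n_out = 16, 4
x = rng.normal(size=n_in)
y = volterra2(x, rng.normal(size=n_out),
              rng.normal(size=(n_out, n_in)) * 0.1,
              rng.normal(size=(n_out, n_in, n_in)) * 0.01)
```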