Component-Aware Pruning Framework for Neural Network Controllers via Gradient-Based Importance Estimation
- URL: http://arxiv.org/abs/2601.19794v1
- Date: Tue, 27 Jan 2026 16:53:19 GMT
- Title: Component-Aware Pruning Framework for Neural Network Controllers via Gradient-Based Importance Estimation
- Authors: Ganesh Sundaram, Jonas Ulmen, Daniel Görges
- Abstract summary: This paper introduces a component-aware pruning framework that utilizes gradient information to compute three distinct importance metrics during training. Experimental results with an autoencoder and a TD-MPC agent demonstrate that the proposed framework reveals critical structural dependencies and dynamic shifts in importance.
- Score: 0.34410212782758043
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The transition from monolithic to multi-component neural architectures in advanced neural network controllers poses substantial challenges due to the high computational complexity of the latter. Conventional model compression techniques for complexity reduction, such as structured pruning based on norm-based metrics to estimate the relative importance of distinct parameter groups, often fail to capture functional significance. This paper introduces a component-aware pruning framework that utilizes gradient information to compute three distinct importance metrics during training: Gradient Accumulation, Fisher Information, and Bayesian Uncertainty. Experimental results with an autoencoder and a TD-MPC agent demonstrate that the proposed framework reveals critical structural dependencies and dynamic shifts in importance that static heuristics often miss, supporting more informed compression decisions.
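All three metrics named in the abstract are functions of per-parameter gradients collected during training. As a rough illustration only (not the paper's implementation), the sketch below trains a toy linear model with SGD and accumulates |g| for a Gradient Accumulation score and g² for a diagonal Fisher-information proxy; the model, loss, and hyperparameters are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear model y = x @ w with squared-error loss,
# standing in for a controller network's parameter group.
n_features = 4
w = rng.normal(size=n_features)
true_w = np.array([2.0, 0.0, -1.5, 0.01])  # some coefficients barely matter

grad_accum = np.zeros(n_features)  # sum of |grad|  ("Gradient Accumulation")
fisher = np.zeros(n_features)      # sum of grad^2  (diagonal Fisher proxy)

for _ in range(200):
    x = rng.normal(size=(32, n_features))
    y = x @ true_w
    err = x @ w - y                   # residual on this mini-batch
    g = 2.0 * x.T @ err / len(x)      # dL/dw for the MSE loss
    grad_accum += np.abs(g)           # importance metric 1
    fisher += g ** 2                  # importance metric 2
    w -= 0.05 * g                     # SGD update

# Parameters with the lowest accumulated importance are pruning candidates.
order = np.argsort(fisher)
print("pruning order (least to most important):", order)
```

Because both scores are accumulated across the whole training run, they capture how much a parameter participates in reducing the loss over time, which is the information a static norm-based ranking discards.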
Related papers
- Data-Driven Deep MIMO Detection: Network Architectures and Generalization Analysis [50.20709408241935]
This paper proposes inspecting the fully data-driven DeepSIC detection within a Network-of-MLPs architecture. Within such an architecture, DeepSIC can be upgraded to a graph-based message-passing process using Graph Neural Networks (GNNs). GNNSIC achieves expressivity comparable to DeepSIC with substantially fewer trainable parameters.
arXiv Detail & Related papers (2026-02-13T04:38:51Z) - Systematic Characterization of Minimal Deep Learning Architectures: A Unified Analysis of Convergence, Pruning, and Quantization [6.49583548940407]
Deep learning networks excel at classification, yet identifying minimal architectures that reliably solve a task remains challenging. We present a computational methodology for exploring and analyzing the relationships among convergence, pruning, and quantization. Our initial results show that, despite architectural diversity, performance is largely invariant and learning dynamics consistently exhibit three regimes: unstable, learning, and overfitting.
arXiv Detail & Related papers (2026-01-25T20:31:10Z) - Benchmarking neural surrogates on realistic spatiotemporal multiphysics flows [18.240532888032394]
We present REALM (REalistic AI Learning for Multiphysics), a rigorous benchmarking framework designed to test neural surrogates on challenging, application-driven reactive flows. We benchmark over a dozen representative surrogate model families, including spectral operators, convolutional models, Transformers, pointwise operators, and graph/mesh networks. We identify three robust trends: (i) a scaling barrier governed jointly by dimensionality, stiffness, and mesh irregularity, leading to rapidly growing rollout errors; (ii) performance primarily controlled by architectural inductive biases rather than parameter count; and (iii) a persistent gap between nominal accuracy metrics and physically consistent predictions.
arXiv Detail & Related papers (2025-12-21T05:04:13Z) - Meta-cognitive Multi-scale Hierarchical Reasoning for Motor Imagery Decoding [43.32839547082765]
This work investigates a hierarchical and meta-cognitive decoding framework for four-class motor imagery (MI) electroencephalogram (EEG) signals. We introduce a multi-scale hierarchical signal processing module that reorganizes backbone features into temporal multi-scale representations. We instantiate this framework on three standard EEG backbones and evaluate four-class MI decoding using the BCI Competition IV-2a dataset.
arXiv Detail & Related papers (2025-11-11T06:32:23Z) - Knowledge-Informed Neural Network for Complex-Valued SAR Image Recognition [51.03674130115878]
We introduce the Knowledge-Informed Neural Network (KINN), a lightweight framework built upon a novel "compression-aggregation-compression" architecture. KINN establishes a state-of-the-art in parameter-efficient recognition, offering exceptional generalization in data-scarce and out-of-distribution scenarios.
arXiv Detail & Related papers (2025-10-23T07:12:26Z) - Explicit modelling of subject dependency in BCI decoding [12.17288254938554]
Brain-Computer Interfaces (BCIs) suffer from high inter-subject variability and limited labeled data. We present an end-to-end approach that explicitly models the subject dependency using lightweight convolutional neural networks (CNNs) conditioned on the subject's identity.
arXiv Detail & Related papers (2025-09-27T10:51:42Z) - Model Hemorrhage and the Robustness Limits of Large Language Models [119.46442117681147]
Large language models (LLMs) demonstrate strong performance across natural language processing tasks, yet undergo significant performance degradation when modified for deployment. We define this phenomenon as model hemorrhage - performance decline caused by parameter alterations and architectural changes.
arXiv Detail & Related papers (2025-03-31T10:16:03Z) - Interpretable Feature Interaction via Statistical Self-supervised Learning on Tabular Data [22.20955211690874]
Spofe is a novel self-supervised machine learning pipeline that captures principled representations to achieve clear interpretability with statistical rigor. Underpinning our approach is a robust theoretical framework that delivers precise error bounds and rigorous false discovery rate (FDR) control. Experiments on diverse real-world datasets demonstrate the effectiveness of Spofe.
arXiv Detail & Related papers (2025-03-23T12:27:42Z) - Parameter-Efficient Fine-Tuning for Continual Learning: A Neural Tangent Kernel Perspective [125.00228936051657]
We introduce NTK-CL, a novel framework that eliminates task-specific parameter storage while adaptively generating task-relevant features. By fine-tuning optimizable parameters with appropriate regularization, NTK-CL achieves state-of-the-art performance on established PEFT-CL benchmarks.
arXiv Detail & Related papers (2024-07-24T09:30:04Z) - Leveraging Frequency Domain Learning in 3D Vessel Segmentation [50.54833091336862]
In this study, we leverage Fourier domain learning as a substitute for multi-scale convolutional kernels in 3D hierarchical segmentation models.
We show that our novel network achieves remarkable Dice performance (84.37% on ASACA500 and 80.32% on ImageCAS) in tubular vessel segmentation tasks.
arXiv Detail & Related papers (2024-01-11T19:07:58Z) - Efficient Micro-Structured Weight Unification and Pruning for Neural Network Compression [56.83861738731913]
Deep Neural Network (DNN) models are essential for practical applications, especially for resource limited devices.
Previous unstructured or structured weight pruning methods rarely translate into real inference acceleration.
We propose a generalized weight unification framework at a hardware compatible micro-structured level to achieve high amount of compression and acceleration.
arXiv Detail & Related papers (2021-06-15T17:22:59Z)
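The hardware-compatible micro-structured pruning described in the last entry zeroes out contiguous blocks of weights rather than individual entries, so the resulting sparsity pattern maps onto vectorized hardware. The following is a minimal sketch of that idea (block-wise magnitude pruning by L2 norm); the block size, keep ratio, and matrix are invented for the example and this is not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(8, 16))  # a stand-in weight matrix

block = 4          # prune weights in contiguous blocks of 4 along each row
keep_ratio = 0.5   # keep the top 50% of blocks by L2 norm

# Reshape each row into blocks and score every block by its L2 norm.
blocks = W.reshape(W.shape[0], -1, block)   # (rows, n_blocks, block)
scores = np.linalg.norm(blocks, axis=-1)    # (rows, n_blocks)

# Keep the globally strongest blocks; zero out the rest.
k = int(scores.size * keep_ratio)
thresh = np.sort(scores, axis=None)[-k]     # k-th largest block norm
mask = (scores >= thresh)[..., None]        # broadcast over the block dim
W_pruned = (blocks * mask).reshape(W.shape)

sparsity = 1 - np.count_nonzero(W_pruned) / W_pruned.size
print("sparsity:", sparsity)
```

Because entire blocks become zero, a kernel can skip whole vector lanes at once, which is why block-structured sparsity accelerates inference where element-wise (unstructured) sparsity often does not.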