Related papers: EUGens: Efficient, Unified, and General Dense Layers

EUGens: Efficient, Unified, and General Dense Layers

URL: http://arxiv.org/abs/2601.22563v2
Date: Mon, 02 Feb 2026 17:47:29 GMT
Title: EUGens: Efficient, Unified, and General Dense Layers
Authors: Sang Min Kim, Byeongchan Kim, Arijit Sehanobish, Somnath Basu Roy Chowdhury, Rahul Kidambi, Dongseok Shim, Avinava Dubey, Snigdha Chaturvedi, Min-hwan Oh, Krzysztof Choromanski,
Abstract summary: We propose a new class of dense layers that generalize standard fully-connected feedforward layers, textbfEfficient, textbfUnimat and textbfGeneral dense layers (EUGens)<n>EUGens leverage random features to approximate standard FFLs and go beyond them by incorporating a direct dependence on the input norms in their computations.
Score: 56.498769704575544
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Efficient neural networks are essential for scaling machine learning models to real-time applications and resource-constrained environments. Fully-connected feedforward layers (FFLs) introduce computation and parameter count bottlenecks within neural network architectures. To address this challenge, in this work, we propose a new class of dense layers that generalize standard fully-connected feedforward layers, \textbf{E}fficient, \textbf{U}nified and \textbf{Gen}eral dense layers (EUGens). EUGens leverage random features to approximate standard FFLs and go beyond them by incorporating a direct dependence on the input norms in their computations. The proposed layers unify existing efficient FFL extensions and improve efficiency by reducing inference complexity from quadratic to linear time. They also lead to \textbf{the first} unbiased algorithms approximating FFLs with arbitrary polynomial activation functions. Furthermore, EuGens reduce the parameter count and computational overhead while preserving the expressive power and adaptability of FFLs. We also present a layer-wise knowledge transfer technique that bypasses backpropagation, enabling efficient adaptation of EUGens to pre-trained models. Empirically, we observe that integrating EUGens into Transformers and MLPs yields substantial improvements in inference speed (up to \textbf{27}\%) and memory efficiency (up to \textbf{30}\%) across a range of tasks, including image classification, language model pre-training, and 3D scene reconstruction. Overall, our results highlight the potential of EUGens for the scalable deployment of large-scale neural networks in real-world scenarios.

Related papers

CollaPipe: Adaptive Segment-Optimized Pipeline Parallelism for Collaborative LLM Training in Heterogeneous Edge Networks [57.95170323315603]
We introduce CollaPipe, a distributed learning framework that integrates collaborative pipeline parallelism with federated aggregation to support self-evolving networks.<n>In CollaPipe, the encoder part is adaptively partitioned into variable-sized segments and deployed across mobile devices for pipeline-parallel training, while the decoder is deployed on edge servers to handle generative tasks.<n>To enhance training efficiency, we formulate a joint optimization problem that adaptively allocates model segments, micro-batches, bandwidth, and transmission power.
arXiv Detail & Related papers (2025-09-24T07:54:01Z)
GENE-FL: Gene-Driven Parameter-Efficient Dynamic Federated Learning [43.967121817631046]
We propose a Gene-driven parameter-efficient dynamic Federated Learning (GENE-FL) framework.<n>First, local models perform quadratic constraints based on parameters with high Fisher values in the global model.<n>Second, we apply the strategy of parameter sensitivity analysis in local model parameters to condense the textitlearnGene for interaction.<n>Third, the server aggregates these small-scale trained textitlearnGenes into a robust textitlearnGene with cross-task generalization capability.
arXiv Detail & Related papers (2025-04-20T14:10:02Z)
PRISM: Self-Pruning Intrinsic Selection Method for Training-Free Multimodal Data Selection [68.8373788348678]
Visual instruction tuning adapts pre-trained Multimodal Large Language Models to follow human instructions.<n>PRISM is the first training-free framework for efficient visual instruction selection.<n>It reduces the end-to-end time for data selection and model tuning to just 30% of conventional pipelines.
arXiv Detail & Related papers (2025-02-17T18:43:41Z)
Constraints and Variables Reduction for Optimal Power Flow Using Hierarchical Graph Neural Networks with Virtual Node-Splitting [0.24554686192257422]
Power system networks are often modeled as homogeneous graphs, which limits the ability of graph neural network (GNN) to capture individual generator features at the same nodes.<n>By introducing the proposed virtual node-splitting strategy, generator-level attributes like costs, limits, and ramp rates can be fully captured by GNN models.<n>Two-stage adaptive hierarchical GNN is developed to (i) predict critical lines that would be congested, and then (ii) predict base generators that would operate at the maximum capacity.
arXiv Detail & Related papers (2024-11-09T19:46:28Z)
LipKernel: Lipschitz-Bounded Convolutional Neural Networks via Dissipative Layers [0.0468732641979009]
We propose a layer-wise parameterization for convolutional neural networks (CNNs) that includes built-in robustness guarantees. Our method Lip Kernel directly parameterizes dissipative convolution kernels using a 2-D Roesser-type state space model. We show that the run-time using our method is orders of magnitude faster than state-of-the-art Lipschitz-bounded networks.
arXiv Detail & Related papers (2024-10-29T17:20:14Z)
FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression [55.992528247880685]
Decentralized training faces significant challenges regarding system design and efficiency. We present FusionLLM, a decentralized training system designed and implemented for training large deep neural networks (DNNs) We show that our system and method can achieve 1.45 - 9.39x speedup compared to baseline methods while ensuring convergence.
arXiv Detail & Related papers (2024-10-16T16:13:19Z)
Want to train KANS at scale? Now UKAN! [2.9666099400348607]
We present Unbounded Kolmogorov-Arnold Networks (UKANs), a method that removes the need for bounded grids in traditional Kolmogorov-Arnold Networks (KANs)<n>UKANs couple multilayer perceptrons with KANs by feeding the positional encoding of grid groups into the CG model, enabling function approximation on unbounded domains without requiring data normalization.
arXiv Detail & Related papers (2024-08-20T21:20:38Z)
LeRF: Learning Resampling Function for Adaptive and Efficient Image Interpolation [64.34935748707673]
Recent deep neural networks (DNNs) have made impressive progress in performance by introducing learned data priors. We propose a novel method of Learning Resampling (termed LeRF) which takes advantage of both the structural priors learned by DNNs and the locally continuous assumption. LeRF assigns spatially varying resampling functions to input image pixels and learns to predict the shapes of these resampling functions with a neural network.
arXiv Detail & Related papers (2024-07-13T16:09:45Z)
SpaFL: Communication-Efficient Federated Learning with Sparse Models and Low computational Overhead [75.87007729801304]
SpaFL: a communication-efficient FL framework is proposed to optimize sparse model structures with low computational overhead.<n>To optimize the pruning process itself, only thresholds are communicated between a server and clients instead of parameters.<n>Global thresholds are used to update model parameters by extracting aggregated parameter importance.
arXiv Detail & Related papers (2024-06-01T13:10:35Z)
A Masked Pruning Approach for Dimensionality Reduction in Communication-Efficient Federated Learning Systems [11.639503711252663]
Federated Learning (FL) represents a growing machine learning (ML) paradigm designed for training models across numerous nodes. We develop a novel algorithm that overcomes limitations by combining a pruning-based method with the FL process. We present an extensive experimental study demonstrating the superior performance of MPFL compared to existing methods.
arXiv Detail & Related papers (2023-12-06T20:29:23Z)
Efficient and Flexible Neural Network Training through Layer-wise Feedback Propagation [49.44309457870649]
Layer-wise Feedback feedback (LFP) is a novel training principle for neural network-like predictors.<n>LFP decomposes a reward to individual neurons based on their respective contributions.<n>Our method then implements a greedy reinforcing approach helpful parts of the network and weakening harmful ones.
arXiv Detail & Related papers (2023-08-23T10:48:28Z)
GIFD: A Generative Gradient Inversion Method with Feature Domain Optimization [52.55628139825667]
Federated Learning (FL) has emerged as a promising distributed machine learning framework to preserve clients' privacy. Recent studies find that an attacker can invert the shared gradients and recover sensitive data against an FL system by leveraging pre-trained generative adversarial networks (GAN) as prior knowledge. We propose textbfGradient textbfInversion over textbfFeature textbfDomains (GIFD), which disassembles the GAN model and searches the feature domains of the intermediate layers.
arXiv Detail & Related papers (2023-08-09T04:34:21Z)
Learning k-Level Structured Sparse Neural Networks Using Group Envelope Regularization [4.0554893636822]
We introduce a novel approach to deploy large-scale Deep Neural Networks on constrained resources. The method speeds up inference time and aims to reduce memory demand and power consumption.
arXiv Detail & Related papers (2022-12-25T15:40:05Z)

This list is automatically generated from the titles and abstracts of the papers in this site.