Related papers: PerNodeDrop: A Method Balancing Specialized Subnets and Regularization in Deep Neural Networks

PerNodeDrop: A Method Balancing Specialized Subnets and Regularization in Deep Neural Networks

URL: http://arxiv.org/abs/2512.12663v1
Date: Sun, 14 Dec 2025 12:26:56 GMT
Title: PerNodeDrop: A Method Balancing Specialized Subnets and Regularization in Deep Neural Networks
Authors: Gelesh G Omathil, Sreeja CS,
Abstract summary: PerNodeDrop is a lightweight regularization method for deep neural networks.<n>It applies per-sample, per-node perturbations to break the uniformity of the noise injected by existing techniques.<n>It preserves useful co-adaptation while applying regularization.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Deep neural networks possess strong representational capacity yet remain vulnerable to overfitting, primarily because neurons tend to co-adapt in ways that, while capturing complex and fine-grained feature interactions, also reinforce spurious and non-generalizable patterns that inflate training performance but reduce reliability on unseen data. Noise-based regularizers such as Dropout and DropConnect address this issue by injecting stochastic perturbations during training, but the noise they apply is typically uniform across a layer or across a batch of samples, which can suppress both harmful and beneficial co-adaptation. This work introduces PerNodeDrop, a lightweight stochastic regularization method. It applies per-sample, per-node perturbations to break the uniformity of the noise injected by existing techniques, thereby allowing each node to experience input-specific variability. Hence, PerNodeDrop preserves useful co-adaptation while applying regularization. This narrows the gap between training and validation performance and improves reliability on unseen data, as evident from the experiments. Although superficially similar to DropConnect, PerNodeDrop operates at the sample level. It drops weights at the sample level, not the batch level. An expected-loss analysis formalizes how its perturbations attenuate excessive co-adaptation while retaining predictive interactions. Empirical evaluations on vision, text, and audio benchmarks indicate improved generalization relative to the standard noise-based regularizer.

Related papers

DropoutTS: Sample-Adaptive Dropout for Robust Time Series Forecasting [59.868414584142336]
DropoutTS is a model-agnostic plugin that shifts the paradigm from "what" to "how much" to learn.<n>It maps noise to adaptive dropout rates - selectively suppressing spurious fluctuations while preserving fine-grained fidelity.
arXiv Detail & Related papers (2026-01-29T13:49:20Z)
Combating Noisy Labels through Fostering Self- and Neighbor-Consistency [120.4394402099635]
Label noise is pervasive in various real-world scenarios, posing challenges in supervised deep learning.<n>We propose a noise-robust method named Jo-SNC (textbfJoint sample selection and model regularization based on textbfSelf- and textbfNeighbor-textbfConsistency)<n>We design a self-adaptive, data-driven thresholding scheme to adjust per-class selection thresholds.
arXiv Detail & Related papers (2026-01-19T07:55:29Z)
Drainage: A Unifying Framework for Addressing Class Uncertainty [5.204620543649613]
We propose a unified framework based on the concept of a "drainage node" which we add at the output of the network.<n>This mechanism provides a natural escape route for highly ambiguous, anomalous, or noisy samples.<n>Our formulation achieves an accuracy increase of up to 9% over existing approaches in the high-noise regime.
arXiv Detail & Related papers (2025-12-02T19:31:01Z)
Analytic theory of dropout regularization [4.939584372342451]
Dropout is a regularization technique widely used in training artificial neural networks.<n>We analytically study dropout in two-layer neural networks trained with online gradient descent.
arXiv Detail & Related papers (2025-05-12T17:45:02Z)
Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious in practice. We introduce a novel noisy correspondence learning framework, namely textbfSelf-textbfReinforcing textbfErrors textbfMitigation (SREM)
arXiv Detail & Related papers (2023-12-27T09:03:43Z)
How adversarial attacks can disrupt seemingly stable accurate classifiers [76.95145661711514]
Adversarial attacks dramatically change the output of an otherwise accurate learning system using a seemingly inconsequential modification to a piece of input data. Here, we show that this may be seen as a fundamental feature of classifiers working with high dimensional input data. We introduce a simple generic and generalisable framework for which key behaviours observed in practical systems arise with high probability.
arXiv Detail & Related papers (2023-09-07T12:02:00Z)
Dynamic Adaptive Threshold based Learning for Noisy Annotations Robust Facial Expression Recognition [3.823356975862006]
We propose a dynamic FER learning framework (DNFER) to handle noisy annotations. Specifically, DNFER is based on supervised training using selected clean samples and unsupervised consistent training using all the samples. We demonstrate the robustness of DNFER on both synthetic as well as on real noisy annotated FER datasets like RAFDB, FERPlus, SFEW and AffectNet.
arXiv Detail & Related papers (2022-08-22T12:02:41Z)
Robust Training under Label Noise by Over-parameterization [41.03008228953627]
We propose a principled approach for robust training of over-parameterized deep networks in classification tasks where a proportion of training labels are corrupted. The main idea is yet very simple: label noise is sparse and incoherent with the network learned from clean data, so we model the noise and learn to separate it from the data. Remarkably, when trained using such a simple method in practice, we demonstrate state-of-the-art test accuracy against label noise on a variety of real datasets.
arXiv Detail & Related papers (2022-02-28T18:50:10Z)
Distribution Mismatch Correction for Improved Robustness in Deep Neural Networks [86.42889611784855]
normalization methods increase the vulnerability with respect to noise and input corruptions. We propose an unsupervised non-parametric distribution correction method that adapts the activation distribution of each layer. In our experiments, we empirically show that the proposed method effectively reduces the impact of intense image corruptions.
arXiv Detail & Related papers (2021-10-05T11:36:25Z)
Jo-SRC: A Contrastive Approach for Combating Noisy Labels [58.867237220886885]
We propose a noise-robust approach named Jo-SRC (Joint Sample Selection and Model Regularization based on Consistency) Specifically, we train the network in a contrastive learning manner. Predictions from two different views of each sample are used to estimate its "likelihood" of being clean or out-of-distribution.
arXiv Detail & Related papers (2021-03-24T07:26:07Z)
Sampling-free Variational Inference for Neural Networks with Multiplicative Activation Noise [51.080620762639434]
We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference. Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
arXiv Detail & Related papers (2021-03-15T16:16:18Z)

This list is automatically generated from the titles and abstracts of the papers in this site.