The Cost of Robustness: Tighter Bounds on Parameter Complexity for Robust Memorization in ReLU Nets
- URL: http://arxiv.org/abs/2510.24643v1
- Date: Tue, 28 Oct 2025 17:09:43 GMT
- Title: The Cost of Robustness: Tighter Bounds on Parameter Complexity for Robust Memorization in ReLU Nets
- Authors: Yujun Kim, Chaewon Moon, Chulhee Yun
- Abstract summary: We study the parameter complexity of robust memorization for $\mathrm{ReLU}$ networks. We establish upper and lower bounds on the parameter count as a function of the robustness ratio $\rho = \mu / \epsilon$.
- Score: 22.963810255498796
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the parameter complexity of robust memorization for $\mathrm{ReLU}$ networks: the number of parameters required to interpolate any given dataset with $\epsilon$-separation between differently labeled points, while ensuring predictions remain consistent within a $\mu$-ball around each training sample. We establish upper and lower bounds on the parameter count as a function of the robustness ratio $\rho = \mu / \epsilon$. Unlike prior work, we provide a fine-grained analysis across the entire range $\rho \in (0,1)$ and obtain tighter upper and lower bounds that improve upon existing results. Our findings reveal that the parameter complexity of robust memorization matches that of non-robust memorization when $\rho$ is small, but grows with increasing $\rho$.
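To make the setup concrete, here is a restatement in notation inferred from the abstract (the paper's formal definitions may differ in detail): a network $f$ robustly memorizes an $\epsilon$-separated dataset $\{(x_i, y_i)\}_{i=1}^N$ with robustness radius $\mu$ if

$$
\min_{i \ne j:\; y_i \ne y_j} \|x_i - x_j\| \;\ge\; \epsilon
\qquad \text{and} \qquad
f(x') = y_i \;\; \text{whenever } \|x' - x_i\| \le \mu .
$$

For instance, at separation $\epsilon = 1$, a radius $\mu = 0.1$ gives $\rho = 0.1$, where per the abstract the parameter cost matches plain (non-robust) memorization, while $\mu = 0.9$ gives $\rho = 0.9$, the regime where the parameter count grows.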
Related papers
- Rate optimal learning of equilibria from data [63.14746189846806]
We close theoretical gaps in Multi-Agent Imitation Learning (MAIL) by characterizing the limits of non-interactive MAIL and presenting the first interactive algorithm with near-optimal sample complexity. For the interactive setting, we introduce a framework that combines reward-free reinforcement learning with interactive MAIL and instantiate it with an algorithm, MAIL-WARM. We provide numerical results that support our theory and illustrate, in environments such as grid worlds, where Behavior Cloning fails to learn.
arXiv Detail & Related papers (2025-10-10T12:28:35Z)
- Beyond Minimax Rates in Group Distributionally Robust Optimization via a Novel Notion of Sparsity [14.396304498754688]
We introduce a novel notion of sparsity that we dub $(\lambda, \beta)$-sparsity. In short, there is a set of at most $\beta$ groups whose risks at $\theta$ are all at least $\lambda$ larger than the risks of the other groups. We show how to obtain a dimension-free semi-adaptive sample complexity bound with a computationally efficient method.
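Writing that sentence out in symbols (notation assumed here for illustration): with a collection of groups $\mathcal{G}$ and group risks $R_g(\theta)$, a point $\theta$ is $(\lambda, \beta)$-sparse if there is a set $S \subseteq \mathcal{G}$ with $|S| \le \beta$ such that

$$
R_g(\theta) \;\ge\; R_{g'}(\theta) + \lambda
\qquad \text{for all } g \in S,\; g' \in \mathcal{G} \setminus S .
$$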
arXiv Detail & Related papers (2024-10-01T13:45:55Z)
- Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic Shortest Path [80.60592344361073]
We study the Stochastic Shortest Path (SSP) problem with a linear mixture transition kernel.
An agent repeatedly interacts with an environment and seeks to reach a certain goal state while minimizing the cumulative cost.
Existing works often assume a strictly positive lower bound on the cost function or an upper bound on the expected length of the optimal policy.
arXiv Detail & Related papers (2024-02-14T07:52:00Z)
- Learning Thresholds with Latent Values and Censored Feedback [18.129896050051432]
We study a problem where the unknown reward $g(\gamma, v)$ depends on the proposed threshold $\gamma$ and a latent value $v$, and the reward can only be achieved if the threshold is lower than or equal to the unknown latent value.
This problem has broad applications in practical scenarios, e.g., reserve price optimization in online auctions, online task assignment in crowdsourcing, and setting recruiting bars in hiring.
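In symbols (notation assumed here for illustration): proposing a threshold $\gamma$ against a latent value $v$ yields the censored reward

$$
r(\gamma) \;=\; g(\gamma, v)\,\mathbf{1}\{\gamma \le v\},
$$

so the reward is collected only when the threshold does not exceed the latent value, e.g., a reserve price generates revenue only if it is at most the bidder's valuation.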
arXiv Detail & Related papers (2023-12-07T19:30:08Z)
- On the Query Complexity of Training Data Reconstruction in Private Learning [0.0]
We analyze the number of queries that a whitebox adversary needs to make to a private learner in order to reconstruct its training data.
For $(\epsilon, \delta)$-DP learners with training data drawn from any arbitrary compact metric space, we provide the first known lower bounds on the adversary's query complexity.
arXiv Detail & Related papers (2023-03-29T00:49:38Z)
- Multi-block-Single-probe Variance Reduced Estimator for Coupled Compositional Optimization [49.58290066287418]
We propose a novel method named Multi-block-Single-probe Variance Reduced (MSVR) estimator to alleviate the complexity of compositional problems.
Our results improve upon prior ones in several aspects, including the order of sample complexities and the dependence on the strong convexity parameter.
arXiv Detail & Related papers (2022-07-18T12:03:26Z)
- On the Optimal Memorization Power of ReLU Neural Networks [53.15475693468925]
We show that feedforward ReLU neural networks can memorize any $N$ points that satisfy a mild separability assumption using a sub-linear number of parameters, provided the weights have sufficiently large bit complexity.
We prove that having such a large bit complexity is both necessary and sufficient for memorization with a sub-linear number of parameters.
arXiv Detail & Related papers (2021-10-07T05:25:23Z)
- Under-bagging Nearest Neighbors for Imbalanced Classification [63.026765294759876]
We propose an ensemble learning algorithm called under-bagging $k$-NN for imbalanced classification problems.
arXiv Detail & Related papers (2021-09-01T14:10:38Z)
- Small Covers for Near-Zero Sets of Polynomials and Learning Latent Variable Models [56.98280399449707]
We show that there exists an $\epsilon$-cover for $S$ of cardinality $M = (k/\epsilon)^{O_d(k^{1/d})}$.
Building on our structural result, we obtain significantly improved learning algorithms for several fundamental high-dimensional probabilistic models with hidden variables.
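For reference, the covering notion used here is the standard one (with $S$ and its metric as in that paper): a finite set $C$ is an $\epsilon$-cover of $S$ if every element of $S$ lies within distance $\epsilon$ of some element of $C$,

$$
\forall s \in S \;\, \exists c \in C :\; d(s, c) \le \epsilon,
$$

and the stated bound controls the smallest achievable cardinality $M = |C|$.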
arXiv Detail & Related papers (2020-12-14T18:14:08Z)
- Explicit Best Arm Identification in Linear Bandits Using No-Regret Learners [17.224805430291177]
We study the problem of best arm identification in linearly parameterised multi-armed bandits.
We propose an explicitly implementable and provably order-optimal sample-complexity algorithm to solve this problem.
arXiv Detail & Related papers (2020-06-13T05:00:01Z)
- Sample Efficient Reinforcement Learning via Low-Rank Matrix Estimation [30.137884459159107]
We consider the question of learning the $Q$-function in a sample-efficient manner for reinforcement learning with continuous state and action spaces.
We develop a simple, iterative learning algorithm that finds an $\epsilon$-optimal $Q$-function with sample complexity of $\widetilde{O}\big(\frac{1}{\epsilon^{\max(d_1, d_2)+2}}\big)$ when the optimal $Q$-function has low rank $r$ and the discounting factor $\gamma$ is below a certain threshold.
arXiv Detail & Related papers (2020-06-11T00:55:35Z)
- A Randomized Algorithm to Reduce the Support of Discrete Measures [79.55586575988292]
Given a discrete probability measure supported on $N$ atoms and a set of $n$ real-valued functions, there exists a probability measure that is supported on a subset of $n+1$ of the original $N$ atoms and has the same integrals against each of the $n$ functions.
We give a simple geometric characterization of barycenters via negative cones and derive a randomized algorithm that computes this new measure by "greedy geometric sampling".
We then study its properties, and benchmark it on synthetic and real-world data to show that it can be very beneficial in the $N \gg n$ regime.
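The existence claim is Carathéodory-type, and such a reduction can be carried out constructively. Below is a minimal numpy sketch of the generic kernel-walking reduction, assuming the atoms' function values are given as a matrix; the paper's randomized "greedy geometric sampling" algorithm is a different (and faster) route to such a measure and is not reproduced here.

```python
import numpy as np

def reduce_support(w, F):
    """Reduce a discrete measure with weights w over N atoms to at most
    n + 1 atoms while (approximately, up to float error) preserving the
    n integrals sum_j w[j] * F[i, j], where F is an (n, N) matrix of
    function values F[i, j] = f_i(x_j)."""
    w = np.asarray(w, dtype=float).copy()
    n = F.shape[0]
    while np.count_nonzero(w) > n + 1:
        support = np.flatnonzero(w)
        # Augment with a row of ones so the total mass is preserved too.
        A = np.vstack([F[:, support], np.ones(support.size)])
        # With more than n + 1 supported atoms, A has a nontrivial kernel;
        # the last right-singular vector lies in (or near) it.
        c = np.linalg.svd(A)[2][-1]
        # The ones row forces sum(c) = 0, so c has positive entries; step
        # just far enough along -c to drive one weight to zero.
        pos = c > 1e-12
        t = np.min(w[support][pos] / c[pos])
        w[support] -= t * c
        w[support] = np.where(w[support] < 1e-12, 0.0, w[support])
    return w
```

For example, reducing a measure on $N = 1000$ atoms against $n = 5$ functions removes at least one atom per iteration until at most six remain, each integral preserved up to numerical error, which illustrates why the $N \gg n$ regime is where such reductions pay off.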
arXiv Detail & Related papers (2020-06-02T16:38:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.