Binary perceptron computational gap -- a parametric fl RDT view
- URL: http://arxiv.org/abs/2511.01037v1
- Date: Sun, 02 Nov 2025 18:23:49 GMT
- Title: Binary perceptron computational gap -- a parametric fl RDT view
- Authors: Mihailo Stojnic
- Abstract summary: The asymmetric binary perceptron (ABP) likely exhibits two phase-transitioning constraint density thresholds. We consider a particular parametric utilization of \emph{fully lifted random duality theory} (fl RDT) [85] and study its potential algorithmic implications for the ABP.
- Score: 2.538209532048867
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent studies suggest that the asymmetric binary perceptron (ABP) likely exhibits the so-called statistical-computational gap, characterized by the appearance of two phase-transitioning constraint density thresholds: \textbf{\emph{(i)}} the \emph{satisfiability threshold} $\alpha_c$, below/above which the ABP succeeds/fails to operate as a storage memory; and \textbf{\emph{(ii)}} the \emph{algorithmic threshold} $\alpha_a$, below/above which one can/cannot efficiently determine the ABP's weights so that it operates as a storage memory. We consider a particular parametric utilization of \emph{fully lifted random duality theory} (fl RDT) [85] and study its potential algorithmic implications for the ABP. A remarkable structural parametric change is uncovered as one progresses through the fl RDT lifting levels. On the first two levels, the so-called $c$ sequence -- a key parametric fl RDT component -- is of the (natural) decreasing type. A change of this phenomenology on higher levels is then connected to the $\alpha_c$ -- $\alpha_a$ threshold change. Namely, on the second level, concrete numerical evaluations give the critical constraint density $\alpha=\alpha_c\approx 0.8331$. While progressing through higher levels decreases this estimate, already on the fifth level we observe a satisfactory level of convergence and obtain $\alpha\approx 0.7764$. This allows us to draw two striking parallels: \textbf{\emph{(i)}} the obtained constraint density estimate is in remarkable agreement with the range $\alpha\in (0.77,0.78)$ of clustering defragmentation (believed to be responsible for the failure of locally improving algorithms) [17,88]; and \textbf{\emph{(ii)}} the observed change of the $c$ sequence phenomenology closely matches that of the negative Hopfield model, for which the existence of efficient algorithms that closely approach a similar type of threshold has been demonstrated recently [87].
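The ordering of the $c$ sequence acts as the paper's diagnostic for the threshold change. As a purely illustrative aid, here is a minimal Python sketch of such a monotonicity check; the `c_seq` values are made up and the function is not part of the paper or of [85]:

```python
import numpy as np

def c_sequence_type(c_seq, tol=1e-12):
    """Classify a c sequence as (naturally) 'decreasing' or 'non-monotone'.

    The abstract associates the loss of the decreasing ordering on higher
    lifting levels with the alpha_c -> alpha_a threshold change.
    """
    diffs = np.diff(np.asarray(c_seq, dtype=float))
    return "decreasing" if np.all(diffs <= tol) else "non-monotone"

# Made-up sequences standing in for two hypothetical lifting levels.
print(c_sequence_type([1.0, 0.62, 0.31]))        # -> decreasing
print(c_sequence_type([1.0, 0.62, 0.71, 0.30]))  # -> non-monotone
```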
Related papers
- Parametric RDT approach to computational gap of symmetric binary perceptron [2.538209532048867]
We study the potential presence of statistical-computational gaps (SCG) in symmetric binary perceptrons (SBP) via a parametric utilization of \emph{fully lifted random duality theory} (fl RDT). A structural change from a decreasingly to an arbitrarily ordered $c$-sequence is observed on the second lifting level and associated with the \emph{satisfiability} threshold.
arXiv Detail & Related papers (2026-01-15T17:48:58Z)
- Theoretical Framework for Tempered Fractional Gradient Descent: Application to Breast Cancer Classification [0.0]
This paper introduces Tempered Fractional Gradient Descent (TFGD), a novel optimization framework that synergizes fractional calculus with exponential tempering to enhance gradient-based learning. TFGD addresses the limitations of standard gradient methods by incorporating a tempered memory mechanism, where historical gradients are weighted by fractional coefficients $|w_j| = \binom{\alpha}{j}$ and exponentially decayed via a tempering parameter $\lambda$. Empirical validation on the Breast Cancer Wisconsin dataset demonstrates TFGD's superiority, achieving 98.25% test accuracy (vs. 92.11% for SGD) and $2\times$ faster convergence.
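The summary pins down the memory weights, which makes a toy sketch possible. Below is a minimal, illustrative TFGD-style update on a quadratic loss; the hyperparameter values and the exact way the tempering factor $e^{-\lambda j}$ multiplies the coefficients are assumptions, not the paper's specification:

```python
import numpy as np
from scipy.special import binom

def tfgd_step(w, grads, alpha=0.6, lam=0.1, lr=0.05):
    """One tempered fractional gradient step (sketch).

    Historical gradients (newest first) are weighted by the fractional
    coefficients |binom(alpha, j)| and damped by exp(-lam * j)."""
    direction = np.zeros_like(w)
    for j, g in enumerate(reversed(grads)):   # j = 0 is the newest gradient
        direction += abs(binom(alpha, j)) * np.exp(-lam * j) * g
    return w - lr * direction

# Toy usage: minimize f(w) = 0.5 * ||w||^2, whose gradient at w is w.
w, history = np.array([3.0, -2.0]), []
for _ in range(50):
    history.append(w.copy())
    w = tfgd_step(w, history)
print(w)  # approaches the minimizer [0, 0]
```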
arXiv Detail & Related papers (2025-04-26T08:26:34Z)
- A Nearly Optimal Single Loop Algorithm for Stochastic Bilevel Optimization under Unbounded Smoothness [15.656614304616006]
This paper studies bilevel optimization problems where the upper-level function is nonconvex with potentially unbounded smoothness and the lower-level function is strongly convex. Existing algorithms rely on nested loops, which require significant tuning effort and are not practical.
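For contrast with the nested-loop approach, here is a minimal sketch of the single-loop idea on a toy quadratic bilevel problem (one lower-level step and one upper-level step per iteration); the problem and step sizes are illustrative, not the paper's algorithm:

```python
# Toy bilevel problem: min_x f(x, y*(x)) with y*(x) = argmin_y g(x, y),
# where g(x, y) = 0.5 * (y - x)**2 (strongly convex in y)
# and   f(x, y) = 0.5 * (y - 1)**2.
def single_loop_bilevel(x=5.0, y=0.0, lr_x=0.05, lr_y=0.5, iters=300):
    for _ in range(iters):
        y -= lr_y * (y - x)       # one lower-level gradient step on g
        # Here y*(x) = x, so dy*/dx = 1 and the hypergradient of
        # f(x, y*(x)) is (y - 1); no inner loop is run to convergence.
        x -= lr_x * (y - 1.0)
    return x, y

print(single_loop_bilevel())  # both coordinates approach 1.0
```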
arXiv Detail & Related papers (2024-12-28T04:40:27Z)
- Beyond likelihood ratio bias: Nested multi-time-scale stochastic approximation for likelihood-free parameter estimation [49.78792404811239]
We study inference in simulation-based models where the analytical form of the likelihood is unknown. We use a ratio-free nested multi-time-scale stochastic approximation (SA) method that simultaneously tracks the score and drives the parameter update. We show that our algorithm can eliminate the original bias $O\big(\sqrt{\frac{1}{N}}\big)$ and accelerate the convergence rate from $O\big(\beta_k+\sqrt{\frac{\alpha_k}{N}}\big)$.
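A minimal illustration of the multi-time-scale pattern on a toy Gaussian simulator: a fast recursion (step size $\alpha_k$) tracks an auxiliary statistic standing in for the score, while a slow recursion (step size $\beta_k$) drives the parameter. Everything here is illustrative rather than the paper's estimator:

```python
import numpy as np

rng = np.random.default_rng(0)
theta, s = 0.0, 0.0     # slow parameter estimate and fast tracked statistic
true_mean = 2.0         # the simulator's unknown parameter

for k in range(1, 20001):
    alpha_k = k ** -0.6   # fast time scale: tracks the statistic
    beta_k = 1.0 / k      # slow time scale: drives the parameter
    x = true_mean + rng.normal()        # one simulator draw (no likelihood used)
    s += alpha_k * ((x - theta) - s)    # track E[x - theta], a score stand-in
    theta += beta_k * s                 # move theta along the tracked signal

print(theta)  # approaches 2.0
```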
arXiv Detail & Related papers (2024-11-20T02:46:15Z)
- From Gradient Clipping to Normalization for Heavy Tailed SGD [19.369399536643773]
Recent empirical evidence indicates that machine learning applications involve heavy-tailed noise, which challenges the standard bounded-variance assumptions in practice. In this paper, we show that it is possible to achieve tight convergence guarantees under heavy-tailed noise.
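For reference, a minimal sketch of the two updates the title contrasts; the threshold and step size are illustrative:

```python
import numpy as np

def clipped_step(w, g, lr=0.1, tau=1.0):
    """Gradient clipping: rescale g only when its norm exceeds tau."""
    n = np.linalg.norm(g)
    scale = min(1.0, tau / n) if n > 0 else 1.0
    return w - lr * scale * g

def normalized_step(w, g, lr=0.1, eps=1e-8):
    """Normalized SGD: always step along g / ||g||, so a single
    heavy-tailed gradient cannot produce an unbounded update."""
    return w - lr * g / (np.linalg.norm(g) + eps)
```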
arXiv Detail & Related papers (2024-10-17T17:59:01Z)
- Theoretical limits of descending $\ell_0$ sparse-regression ML algorithms [0.0]
We develop a generic analytical program for studying the performance of \emph{maximum-likelihood} (ML) decoding.
A key ML performance parameter, the residual \emph{root mean square error} ($\textbf{RMSE}$), is uncovered to exhibit the so-called \emph{phase-transition} (PT) phenomenon.
Concrete implementation and practical relevance of the fl RDT typically rely on the ability to conduct a sizeable set of the underlying numerical evaluations.
arXiv Detail & Related papers (2024-10-10T06:33:41Z)
- An Accelerated Algorithm for Stochastic Bilevel Optimization under Unbounded Smoothness [15.656614304616006]
This paper investigates a class of bilevel optimization problems where the upper-level function is nonconvex and the lower-level problem is strongly convex. These problems have significant applications in machine learning, such as text classification using recurrent neural networks. We propose a new accelerated bilevel optimization algorithm named AccBO.
arXiv Detail & Related papers (2024-09-28T02:30:44Z)
- Adaptive Federated Learning Over the Air [108.62635460744109]
We propose a federated version of adaptive gradient methods, particularly AdaGrad and Adam, within the framework of over-the-air model training.
Our analysis shows that the AdaGrad-based training algorithm converges to a stationary point at the rate of $\mathcal{O}\big(\ln(T) / T^{1 - \frac{1}{\alpha}}\big)$.
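A minimal sketch of the over-the-air aggregation pattern with an AdaGrad-style server step; the channel-noise model, the toy objective, and all constants are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
d, num_clients = 4, 10
w, accum = np.zeros(d), np.zeros(d)
optimum = np.ones(d)

for t in range(500):
    # Each client holds a noisy gradient of 0.5 * ||w - optimum||^2.
    grads = [(w - optimum) + 0.1 * rng.normal(size=d) for _ in range(num_clients)]
    # Over-the-air aggregation: the channel superimposes the transmitted
    # signals and adds receiver noise; individual gradients are never seen.
    g = (np.sum(grads, axis=0) + 0.05 * rng.normal(size=d)) / num_clients
    accum += g ** 2                              # AdaGrad accumulator
    w -= 0.5 * g / (np.sqrt(accum) + 1e-8)       # adaptive server step

print(w)  # approaches the optimum [1, 1, 1, 1]
```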
arXiv Detail & Related papers (2024-03-11T09:10:37Z)
- Stochastic Gradient Succeeds for Bandits [64.17904367852563]
We show that the \emph{stochastic gradient} bandit algorithm converges to a \emph{globally optimal} policy at an $O(1/t)$ rate.
Remarkably, global convergence of the gradient bandit algorithm has not been previously established.
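For concreteness, a minimal sketch of the gradient bandit algorithm the result concerns: softmax action preferences updated with a stochastic policy-gradient step. The 3-arm instance, noise level, and step size are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
means = np.array([0.1, 0.5, 0.9])   # true mean rewards, unknown to the learner
theta = np.zeros(3)                 # action preferences

for t in range(5000):
    pi = np.exp(theta - theta.max())
    pi /= pi.sum()                            # softmax policy over the arms
    a = rng.choice(3, p=pi)                   # sample an arm
    r = means[a] + 0.1 * rng.normal()         # observe a noisy reward
    theta += 0.1 * r * (np.eye(3)[a] - pi)    # unbiased policy-gradient step

print(pi)  # probability mass concentrates on the best arm (index 2)
```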
arXiv Detail & Related papers (2024-02-27T06:05:01Z)
- Asynchronous Training Schemes in Distributed Learning with Time Delay [17.259708772713164]
In the context of distributed deep learning, the issue of stale weights or gradients could result in poor algorithmic performance.
In this paper, we propose a different approach to tackle the issue of stale weights or gradients.
A practical variant of the proposed PC-ASGD algorithm is also obtained by adopting a condition that helps determine the tradeoff parameter.
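A minimal, speculative sketch of the predict/clip idea suggested by the algorithm's name: a delayed gradient is either corrected by a first-order prediction or down-weighted ("clipped"), with a tradeoff parameter mixing the two. The correction rule, the 0.5 damping, and the toy objective are assumptions, not the paper's exact construction:

```python
import numpy as np

def pc_asgd_step(w, g_stale, g_predicted, lr=0.1, tradeoff=0.5):
    """Mix a delay-compensating predicted gradient with a damped
    ("clipped") stale gradient; `tradeoff` balances the two terms."""
    g_clipped = 0.5 * g_stale
    g = tradeoff * g_predicted + (1.0 - tradeoff) * g_clipped
    return w - lr * g

# Toy usage on f(w) = 0.5 * ||w||^2 (gradient = w) with a one-step delay.
w_prev, w = np.array([2.0, 2.0]), np.array([1.8, 1.8])
g_stale = w_prev                        # gradient computed at the old iterate
g_predicted = g_stale + (w - w_prev)    # first-order correction (exact here,
                                        # since the Hessian is the identity)
print(pc_asgd_step(w, g_stale, g_predicted))
```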
arXiv Detail & Related papers (2022-08-28T07:14:59Z)
- Optimal Extragradient-Based Bilinearly-Coupled Saddle-Point Optimization [116.89941263390769]
We consider the smooth convex-concave bilinearly-coupled saddle-point problem $\min_{\mathbf{x}}\max_{\mathbf{y}}\, F(\mathbf{x}) + H(\mathbf{x},\mathbf{y}) - G(\mathbf{y})$, where one has access to first-order oracles for $F$, $G$ as well as the bilinear coupling function $H$.
We present an \emph{accelerated gradient-extragradient} (AG-EG) descent-ascent algorithm that combines extragradient steps with Nesterov's acceleration.
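For orientation, a minimal sketch of the plain extragradient step the method builds on, applied to a bilinearly coupled toy problem; the acceleration component is omitted and all constants are illustrative:

```python
# Toy saddle problem: min_x max_y 0.5*x**2 + x*y - 0.5*y**2 (solution x = y = 0).
def grad_x(x, y): return x + y   # partial derivative in x
def grad_y(x, y): return x - y   # partial derivative in y

x, y, lr = 3.0, -2.0, 0.3
for _ in range(100):
    # Extrapolation: probe the gradients at a look-ahead point.
    xh, yh = x - lr * grad_x(x, y), y + lr * grad_y(x, y)
    # Update: step from the current point using the look-ahead gradients.
    x, y = x - lr * grad_x(xh, yh), y + lr * grad_y(xh, yh)

print(x, y)  # both approach the saddle point at 0
```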
arXiv Detail & Related papers (2022-06-17T06:10:20Z)
- Rate-matching the regret lower-bound in the linear quadratic regulator with unknown dynamics [6.287145010885044]
This paper establishes a novel regret upper bound of $O_p(\sqrt{T}\,\text{polylog}(T))$.
It simultaneously establishes an estimation error bound on the dynamics of $O_p(\sqrt{T}\,\text{polylog}(T))$.
arXiv Detail & Related papers (2022-02-11T17:50:14Z)
- Differentially Quantized Gradient Methods [53.3186247068836]
We show that Differentially Quantized Gradient Descent (DQ-GD) attains a linear contraction factor of $\max\{\sigma_{\mathrm{GD}},\, \rho_n 2^{-R}\}$.
No algorithm within a certain class can converge faster than $\max\{\sigma_{\mathrm{GD}},\, 2^{-R}\}$.
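A minimal sketch of the differential-quantization idea of feeding quantization errors back into the next transmission so they do not accumulate; the uniform $R$-bit quantizer, toy objective, and constants are illustrative assumptions:

```python
import numpy as np

def uniform_quantize(v, rate_bits, dyn_range=10.0):
    """Uniform scalar quantizer with 2**rate_bits levels over [-dyn_range, dyn_range]."""
    step = 2.0 * dyn_range / 2 ** rate_bits
    return np.round(v / step) * step

# Gradient descent on f(w) = 0.5 * ||w||^2 with R-bit quantized gradients.
w, err, lr, R = np.array([8.0, -6.0]), np.zeros(2), 0.4, 4
for _ in range(60):
    g = w                                # exact gradient of f at w
    q = uniform_quantize(g + err, R)     # transmit the error-compensated gradient
    err = (g + err) - q                  # carry the quantization error forward
    w -= lr * q                          # receiver applies the quantized gradient

print(w)  # settles in a small neighborhood of [0, 0] despite 4-bit gradients
```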
arXiv Detail & Related papers (2020-02-06T20:40:53Z)
- Towards Better Understanding of Adaptive Gradient Algorithms in Generative Adversarial Nets [71.05306664267832]
Adaptive algorithms perform gradient updates using the history of gradients and are ubiquitous in training deep neural networks.
In this paper we analyze a variant of the Optimistic Adagrad algorithm for non-concave min-max problems.
Our experiments show that the advantage of adaptive gradient algorithms over non-adaptive ones in GAN training can be observed empirically.
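For reference, a minimal sketch of the optimistic gradient step that such adaptive variants build on: the look-ahead term $2g_t - g_{t-1}$ is what prevents the cycling that plain descent-ascent exhibits on bilinear games. The toy game and constants are illustrative:

```python
# Bilinear game min_x max_y x*y, where plain gradient descent-ascent cycles.
x, y, lr = 1.0, 1.0, 0.1
gx_prev, gy_prev = 0.0, 0.0
for _ in range(2000):
    gx, gy = y, x                    # gradients of x*y w.r.t. x and y
    x -= lr * (2 * gx - gx_prev)     # optimistic descent step on x
    y += lr * (2 * gy - gy_prev)     # optimistic ascent step on y
    gx_prev, gy_prev = gx, gy

print(x, y)  # spirals in toward the equilibrium (0, 0)
```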
arXiv Detail & Related papers (2019-12-26T22:10:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.