Spectral Sentinel: Scalable Byzantine-Robust Decentralized Federated Learning via Sketched Random Matrix Theory on Blockchain
- URL: http://arxiv.org/abs/2512.12617v1
- Date: Sun, 14 Dec 2025 09:43:03 GMT
- Title: Spectral Sentinel: Scalable Byzantine-Robust Decentralized Federated Learning via Sketched Random Matrix Theory on Blockchain
- Authors: Animesh Mishra
- Abstract summary: Byzantine clients poison gradients under heterogeneous (Non-IID) data. We propose Spectral Sentinel, a Byzantine detection and aggregation framework. We implement the full system with blockchain integration on Polygon networks.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Decentralized federated learning (DFL) enables collaborative model training without centralized trust, but it remains vulnerable to Byzantine clients that poison gradients under heterogeneous (Non-IID) data. Existing defenses face a scalability trilemma: distance-based filtering (e.g., Krum) can reject legitimate Non-IID updates, geometric-median methods incur prohibitive $O(n^2 d)$ cost, and many certified defenses are evaluated only on models below 100M parameters. We propose Spectral Sentinel, a Byzantine detection and aggregation framework that leverages a random-matrix-theoretic signature: honest Non-IID gradients produce covariance eigenspectra whose bulk follows the Marchenko-Pastur law, while Byzantine perturbations induce detectable tail anomalies. Our algorithm combines Frequent Directions sketching with data-dependent MP tracking, enabling detection on models up to 1.5B parameters using $O(k^2)$ memory with $k \ll d$. Under a $(\sigma, f)$ threat model with coordinate-wise honest variance bounded by $\sigma^2$ and $f < 1/2$ adversaries, we prove $(\epsilon, \delta)$-Byzantine resilience with convergence rate $O(\sigma f / \sqrt{T} + f^2 / T)$, and we provide a matching information-theoretic lower bound $\Omega(\sigma f / \sqrt{T})$, establishing minimax optimality. We implement the full system with blockchain integration on Polygon networks and validate it across 144 attack-aggregator configurations, achieving 78.4 percent average accuracy versus 48-63 percent for baseline methods.
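The detection pipeline the abstract describes — compress client gradients with Frequent Directions, then flag eigenvalues of the sketched covariance that escape the Marchenko-Pastur bulk — can be illustrated in a few lines of numpy. This is a minimal sketch under an assumed noise variance `sigma2` and sketch size `k`, not the authors' implementation:

```python
import numpy as np

def frequent_directions(G, k):
    """Compress the rows of G (n_clients x d) into a sketch B whose Gram
    matrix B^T B approximates G^T G (Liberty, 2013)."""
    n, d = G.shape
    B = np.zeros((2 * k, d))
    next_row = 0
    for g in G:
        if next_row == 2 * k:                      # sketch full: shrink
            _, s, Vt = np.linalg.svd(B, full_matrices=False)
            delta = s[k - 1] ** 2                  # k-th largest squared s.v.
            B = np.sqrt(np.maximum(s ** 2 - delta, 0.0))[:, None] * Vt
            next_row = k
        B[next_row] = g
        next_row += 1
    return B

def flag_byzantine(G, k, sigma2):
    """Return sketched covariance eigenvalues above the Marchenko-Pastur
    upper edge sigma^2 (1 + sqrt(d/n))^2 -- the tail anomalies."""
    n, d = G.shape
    eigs = np.linalg.svd(frequent_directions(G, k), compute_uv=False) ** 2 / n
    edge = sigma2 * (1.0 + np.sqrt(d / n)) ** 2
    return eigs[eigs > edge]

rng = np.random.default_rng(0)
honest = rng.normal(0.0, 1.0, size=(64, 4096))     # 64 clients, d = 4096
poisoned = honest.copy()
poisoned[:8] += 5.0                                # 8 colluding attackers
print(flag_byzantine(honest, 16, 1.0))             # empty: bulk stays inside MP
print(flag_byzantine(poisoned, 16, 1.0))           # large spike escapes the edge
```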
Related papers
- Not Just How Much, But Where: Decomposing Epistemic Uncertainty into Per-Class Contributions [1.2891210250935148]
In safety-critical classification, the cost of failure is often asymmetric. We decompose MI into a per-class vector $C_k(x) = \sigma_k^2 / (2\mu_k)$, with $\mu_k = \mathbb{E}[p_k]$ and $\sigma_k^2 = \mathrm{Var}[p_k]$ across posterior samples.
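A minimal numpy rendering of this per-class decomposition, assuming `probs` holds softmax outputs across posterior samples (my reading of the summary, not the authors' code):

```python
import numpy as np

def per_class_contributions(probs):
    """probs: (n_samples, n_classes) predictive probabilities from posterior draws."""
    mu = probs.mean(axis=0)        # mu_k = E[p_k]
    var = probs.var(axis=0)        # sigma_k^2 = Var[p_k]
    return var / (2.0 * mu)        # C_k(x) = sigma_k^2 / (2 mu_k)

rng = np.random.default_rng(1)
logits = rng.normal(size=(100, 5)) + np.array([2.0, 0, 0, 0, 0])
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
C = per_class_contributions(probs)
print(C, C.sum())   # per-class contributions; the sum approximates the MI
```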
arXiv Detail & Related papers (2026-02-24T18:05:51Z) - Optimal Unconstrained Self-Distillation in Ridge Regression: Strict Improvements, Precise Asymptotics, and One-Shot Tuning [61.07540493350384]
Self-distillation (SD) is the process of retraining a student on a mixture of ground-truth labels and the teacher's own predictions. We show that for any prediction risk, the optimally mixed student improves upon the ridge teacher at every regularization level. We propose a consistent one-shot tuning method to estimate the optimal mixing weight without grid search, sample splitting, or refitting.
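A hedged sketch of ridge self-distillation as summarized: fit a ridge teacher, then refit a student on a label mixture. The grid of mixing weights below is illustrative; the paper's contribution is a one-shot estimator that avoids such a grid:

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression solution."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

rng = np.random.default_rng(2)
n, d, lam = 200, 50, 10.0
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + rng.normal(size=n)

w_teacher = ridge_fit(X, y, lam)
for alpha in (0.0, 0.3, 0.6):
    y_mix = (1 - alpha) * y + alpha * (X @ w_teacher)   # distillation targets
    w_student = ridge_fit(X, y_mix, lam)
    print(alpha, np.linalg.norm(w_student - w_true))    # estimation error
```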
arXiv Detail & Related papers (2026-02-19T17:21:15Z) - One-Shot Federated Ridge Regression: Exact Recovery via Sufficient Statistic Aggregation [0.7106986689736825]
Federated ridge regression is a distributed estimation problem in which each client computes local sufficient statistics and transmits them once. We establish differential privacy guarantees where noise is injected once per client, eliminating the composition penalty that degrades privacy in multi-round protocols. Experiments on synthetic heterogeneous regression demonstrate that one-shot fusion matches FedAvg accuracy while requiring up to $38\times$ less communication.
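The one-shot mechanism is simple enough to sketch exactly: each client transmits $X_i^\top X_i$ and $X_i^\top y_i$ once, and the server recovers the centralized ridge solution. The toy data below is illustrative, and the DP noise injection is omitted:

```python
import numpy as np

rng = np.random.default_rng(3)
d, lam = 20, 1.0
clients = [rng.normal(size=(50, d)) for _ in range(5)]
w_true = rng.normal(size=d)
ys = [X @ w_true + rng.normal(size=len(X)) for X in clients]

# One round of communication: local Gram matrices and moment vectors.
A = sum(X.T @ X for X in clients)                   # sum of X_i^T X_i
b = sum(X.T @ y for X, y in zip(clients, ys))       # sum of X_i^T y_i
w_fed = np.linalg.solve(A + lam * np.eye(d), b)

# Sanity check: identical to centralized ridge on the pooled data.
X_all, y_all = np.vstack(clients), np.concatenate(ys)
w_central = np.linalg.solve(X_all.T @ X_all + lam * np.eye(d), X_all.T @ y_all)
print(np.allclose(w_fed, w_central))                # True: exact recovery
```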
arXiv Detail & Related papers (2026-01-13T04:47:22Z) - Hierarchical Federated Graph Attention Networks for Scalable and Resilient UAV Collision Avoidance [0.5505634045241287]
Real-time performance, adversarial resiliency, and privacy preservation are the key requirements that must be balanced for practical collision avoidance. We propose an adaptive differential privacy mechanism wherein the noise level ($\epsilon \in [0.1, 1.0]$) is dynamically adapted based on an evaluation of the measured real-time threat. This architecture scales to 500 UAVs with a collision rate below $2.0\%$ and Byzantine fault tolerance of $f < n/3$.
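A hedged sketch of the adaptive-privacy idea, assuming a linear map from a threat score in $[0, 1]$ to $\epsilon \in [0.1, 1.0]$ and Laplace noise scaled as sensitivity$/\epsilon$; the paper's actual threat evaluation and noise mechanism may differ:

```python
import numpy as np

rng = np.random.default_rng(4)

def adaptive_dp_noise(update, threat, sensitivity=1.0):
    """Higher measured threat -> smaller eps -> heavier Laplace noise."""
    eps = 1.0 - 0.9 * float(np.clip(threat, 0.0, 1.0))   # eps in [0.1, 1.0]
    noisy = update + rng.laplace(scale=sensitivity / eps, size=update.shape)
    return noisy, eps

update = np.ones(8)
for threat in (0.0, 0.5, 1.0):
    _, eps = adaptive_dp_noise(update, threat)
    print(f"threat={threat:.1f} -> eps={eps:.2f}, noise scale={1.0 / eps:.2f}")
```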
arXiv Detail & Related papers (2025-11-05T12:01:00Z) - SketchGuard: Scaling Byzantine-Robust Decentralized Federated Learning via Sketch-Based Screening [15.287835378843425]
Decentralized Federated Learning (DFL) enables privacy-preserving collaborative training without centralized servers. DFL is vulnerable to Byzantine attacks in which malicious clients submit corrupted model updates. We propose SketchGuard, which decouples Byzantine filtering from model aggregation through sketch-based neighbor screening.
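A minimal sketch of sketch-based neighbor screening in this spirit: compare neighbors' randomly projected updates to your own and drop outliers before averaging. The Johnson-Lindenstrauss projection and median-distance threshold are stand-ins for SketchGuard's actual design:

```python
import numpy as np

rng = np.random.default_rng(5)
d, k, n_neighbors = 10_000, 64, 12
S = rng.normal(size=(k, d)) / np.sqrt(k)          # shared JL sketch matrix

own = rng.normal(size=d)
neighbors = own + 0.1 * rng.normal(size=(n_neighbors, d))
neighbors[:3] += 3.0                              # 3 Byzantine neighbors

# Distances computed entirely in the k-dimensional sketch space.
dists = np.linalg.norm(neighbors @ S.T - own @ S.T, axis=1)
keep = dists <= 3.0 * np.median(dists)            # screen out the outliers
aggregate = np.vstack([neighbors[keep], own[None]]).mean(axis=0)
print(f"kept {keep.sum()}/{n_neighbors} neighbors")   # drops the 3 attackers
```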
arXiv Detail & Related papers (2025-10-09T08:16:32Z) - Near-Optimal Clustering in Mixture of Markov Chains [74.3828414695655]
We study the problem of clustering $T$ trajectories of length $H$, each generated by one of $K$ unknown ergodic Markov chains over a finite state space of size $S$. We derive an instance-dependent, high-probability lower bound on the clustering error rate, governed by the weighted KL divergence between the transition kernels of the chains. We then present a novel two-stage clustering algorithm.
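The weighted KL divergence governing the bound can be computed directly; here is a minimal version that weights transitions by the stationary distribution of the first chain (a standard choice, not necessarily the paper's exact weighting):

```python
import numpy as np

def stationary(P):
    """Stationary distribution of a row-stochastic matrix P."""
    vals, vecs = np.linalg.eig(P.T)
    pi = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    return pi / pi.sum()

def weighted_kl(P, Q):
    """Sum_s pi(s) * KL(P(.|s) || Q(.|s)), weighted by P's stationary law."""
    pi = stationary(P)
    return float(np.sum(pi[:, None] * P * np.log(P / Q)))

P = np.array([[0.9, 0.1], [0.2, 0.8]])
Q = np.array([[0.7, 0.3], [0.3, 0.7]])
print(weighted_kl(P, Q))   # larger gap -> easier to cluster trajectories
```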
arXiv Detail & Related papers (2025-06-02T05:10:40Z) - Beyond likelihood ratio bias: Nested multi-time-scale stochastic approximation for likelihood-free parameter estimation [49.78792404811239]
We study inference in simulation-based models where the analytical form of the likelihood is unknown. We use a ratio-free nested multi-time-scale stochastic approximation (SA) method that simultaneously tracks the score and drives the parameter update. We show that our algorithm eliminates the original bias of order $O\big(\sqrt{1/N}\big)$ and improves on the convergence rate $O\big(\beta_k + \sqrt{\alpha_k/N}\big)$.
arXiv Detail & Related papers (2024-11-20T02:46:15Z) - Certifiably Robust Model Evaluation in Federated Learning under Meta-Distributional Shifts [8.700087812420687]
We provide guarantees for the model's performance on a different, unseen network "B". We show how the principled vanilla DKW bound enables certification of the model's true performance on unseen clients within the same (source) network.
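The vanilla DKW bound referenced here is a one-liner: with probability $1 - \delta$, the empirical CDF of $n$ observed client accuracies is within $\epsilon = \sqrt{\ln(2/\delta) / (2n)}$ of the true CDF:

```python
import numpy as np

def dkw_eps(n, delta):
    """Dvoretzky-Kiefer-Wolfowitz uniform CDF error at confidence 1 - delta."""
    return np.sqrt(np.log(2.0 / delta) / (2.0 * n))

# e.g. performance measured on 200 clients, certified at 95% confidence:
print(dkw_eps(200, 0.05))   # ~0.096: CDF error bound for unseen clients
```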
arXiv Detail & Related papers (2024-10-26T18:45:15Z) - Contextual Combinatorial Bandits with Probabilistically Triggered Arms [55.9237004478033]
We study contextual combinatorial bandits with probabilistically triggered arms (C$^2$MAB-T) under a variety of smoothness conditions.
Under the triggering probability modulated (TPM) condition, we devise the C$^2$-UCB-T algorithm and derive a regret bound $\tilde{O}(d\sqrt{T})$.
arXiv Detail & Related papers (2023-03-30T02:51:00Z) - Optimal Membership Inference Bounds for Adaptive Composition of Sampled Gaussian Mechanisms [93.44378960676897]
Given a trained model and a data sample, membership-inference (MI) attacks predict whether the sample was in the model's training set.
A common countermeasure against MI attacks is to utilize differential privacy (DP) during model training to mask the presence of individual examples.
In this paper, we derive bounds for the advantage of an adversary mounting an MI attack, and demonstrate tightness for the widely-used Gaussian mechanism.
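For context, the classic baseline is the membership-advantage bound $\mathrm{Adv} \le e^{\epsilon} - 1$ for an $\epsilon$-DP mechanism (Yeom et al.); the paper above derives tighter, mechanism-specific bounds, which this snippet does not reproduce:

```python
import numpy as np

# Generic baseline: an eps-DP training mechanism caps any MI adversary's
# advantage (TPR - FPR) at e^eps - 1.
for eps in (0.1, 0.5, 1.0):
    print(f"eps={eps}: advantage <= {np.expm1(eps):.3f}")
```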
arXiv Detail & Related papers (2022-04-12T22:36:56Z) - Settling the Sample Complexity of Model-Based Offline Reinforcement Learning [50.5790774201146]
Offline reinforcement learning (RL) learns from pre-collected data without further exploration.
Prior algorithms or analyses either suffer from suboptimal sample complexities or incur high burn-in cost to reach sample optimality.
We demonstrate that the model-based (or "plug-in") approach achieves minimax-optimal sample complexity without burn-in cost.
arXiv Detail & Related papers (2022-04-11T17:26:19Z) - An Improved Analysis of Gradient Tracking for Decentralized Machine Learning [34.144764431505486]
We consider decentralized machine learning over a network where the training data is distributed across $n$ agents.
The agents' common goal is to find a model that minimizes the average of all local loss functions.
We improve the dependency on $p$ from $\mathcal{O}(p^{-2})$ to $\mathcal{O}(p^{-1})$ in the noiseless case.
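For intuition, a minimal gradient-tracking loop: each agent maintains a tracker $y_i$ of the network-average gradient alongside its model $x_i$, which is what lets the method handle heterogeneous local data. Quadratic losses and a ring topology are illustrative choices, not the paper's setting:

```python
import numpy as np

rng = np.random.default_rng(6)
n, d, eta = 8, 5, 0.05
A = [np.diag(rng.uniform(0.5, 2.0, d)) for _ in range(n)]  # f_i(x) = 0.5 x'A_i x - b_i'x
b = [rng.normal(size=d) for _ in range(n)]
grad = lambda i, x: A[i] @ x - b[i]

W = np.zeros((n, n))                       # doubly stochastic ring mixing matrix
for i in range(n):
    W[i, i] = 0.5
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.25

x = np.zeros((n, d))
y = np.array([grad(i, x[i]) for i in range(n)])            # initialize tracker
for _ in range(500):
    x_new = W @ (x - eta * y)                               # mix, then step
    y = W @ y + np.array([grad(i, x_new[i]) - grad(i, x[i]) for i in range(n)])
    x = x_new

x_star = np.linalg.solve(sum(A), sum(b))   # minimizer of the average loss
print(np.linalg.norm(x.mean(axis=0) - x_star))   # -> ~0: consensus at optimum
```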
arXiv Detail & Related papers (2022-02-08T12:58:14Z) - Toward Adversarial Robustness via Semi-supervised Robust Training [93.36310070269643]
Adversarial examples have been shown to be a severe threat to deep neural networks (DNNs).
We propose a novel defense method, robust training (RT), which jointly minimizes two separated risks ($R_{\mathrm{stand}}$ and $R_{\mathrm{rob}}$).
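A toy numpy rendering of such a joint objective: each update takes one gradient step on $R_{\mathrm{stand}} + R_{\mathrm{rob}}$, with the robust risk evaluated on FGSM-perturbed inputs. FGSM, logistic regression, and equal weighting are stand-ins for the paper's actual RT formulation:

```python
import numpy as np

def grad_logistic(w, X, y):
    """Gradient of the mean logistic loss."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    return X.T @ (p - y) / len(y)

rng = np.random.default_rng(7)
X = rng.normal(size=(256, 10))
w_true = rng.normal(size=10)
y = (X @ w_true > 0).astype(float)

w, eps, lr = np.zeros(10), 0.1, 0.5
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    X_adv = X + eps * np.sign(np.outer(p - y, w))    # FGSM: sign of dLoss/dX
    # Joint step on R_stand (clean inputs) + R_rob (perturbed inputs).
    w -= lr * (grad_logistic(w, X, y) + grad_logistic(w, X_adv, y))

acc = ((X @ w > 0) == y.astype(bool)).mean()
print(f"clean accuracy after joint training: {acc:.2f}")
```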
arXiv Detail & Related papers (2020-03-16T02:14:08Z)