Angelfish: Consensus with Optimal Throughput and Latency Across the Leader-DAG Spectrum
- URL: http://arxiv.org/abs/2509.15847v1
- Date: Fri, 19 Sep 2025 10:30:25 GMT
- Title: Angelfish: Consensus with Optimal Throughput and Latency Across the Leader-DAG Spectrum
- Authors: Qianyu Yu, Giuliano Losa, Nibesh Shrestha, Xuechao Wang
- Abstract summary: We present Angelfish, a hybrid protocol that adapts smoothly across this design space. Angelfish lets a dynamically-adjusted subset of parties use best-effort broadcast to issue lightweight votes instead of reliably broadcasting DAG vertices. Our empirical evaluation shows that Angelfish attains state-of-the-art peak throughput while matching the latency of leader-based protocols under moderate throughput.
- Score: 3.940687402522194
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To maximize performance, many modern blockchain systems rely on eventually-synchronous, Byzantine fault-tolerant (BFT) consensus protocols. Two protocol designs have emerged in this space: protocols that minimize latency using a leader that drives both data dissemination and consensus, and protocols that maximize throughput using a separate, asynchronous data dissemination layer. Recent protocols such as Partially-Synchronous Bullshark and Sailfish combine elements of both approaches by using a DAG to enable parallel data dissemination and a leader that paces DAG formation. This improves latency while achieving state-of-the-art throughput. Yet the latency of leader-based protocols is still better under moderate loads. We present Angelfish, a hybrid protocol that adapts smoothly across this design space, from leader-based to Sailfish-like DAG-based consensus. Angelfish lets a dynamically-adjusted subset of parties use best-effort broadcast to issue lightweight votes instead of reliably broadcasting costlier DAG vertices. This reduces communication, helps lagging nodes catch up, and lowers latency in practice compared to prior DAG-based protocols. Our empirical evaluation shows that Angelfish attains state-of-the-art peak throughput while matching the latency of leader-based protocols under moderate throughput, delivering the best of both worlds.
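The mechanism described in the abstract, where a party either reliably broadcasts a full DAG vertex or merely best-effort-broadcasts a lightweight vote depending on whether it belongs to a dynamically adjusted subset, can be sketched roughly as follows. This is a minimal illustration only: the types (Vertex, Vote, Network), the round structure, and how the subset is chosen are assumptions made for exposition and are not Angelfish's actual API or implementation.

```python
# Illustrative sketch of the leader/DAG hybrid round described in the abstract.
# All names and the toy network are invented for exposition.
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Vertex:
    """Full DAG vertex: carries transaction data and must be reliably broadcast."""
    rnd: int
    author: str
    payload: bytes
    parents: List[bytes] = field(default_factory=list)


@dataclass
class Vote:
    """Lightweight vote: only acknowledges the round leader, sent best-effort."""
    rnd: int
    author: str
    leader_digest: bytes


class Network:
    """Toy in-memory stand-in for reliable vs. best-effort broadcast."""
    def __init__(self):
        self.reliable: list = []
        self.best_effort: list = []

    def reliable_broadcast(self, msg):      # costlier, certified delivery
        self.reliable.append(msg)

    def best_effort_broadcast(self, msg):   # cheap, plain point-to-point sends
        self.best_effort.append(msg)


def step(party: str, rnd: int, vertex_authors: Set[str], net: Network,
         payload: bytes, leader_digest: bytes, parents: List[bytes]) -> None:
    """One round: parties in `vertex_authors` disseminate data as DAG vertices,
    everyone else issues a lightweight vote instead."""
    if party in vertex_authors:
        net.reliable_broadcast(Vertex(rnd, party, payload, parents))
    else:
        net.best_effort_broadcast(Vote(rnd, party, leader_digest))


# Example: only two of four parties author vertices this round; in the real
# protocol the subset is adjusted dynamically.
net = Network()
for p in ["p0", "p1", "p2", "p3"]:
    step(p, rnd=7, vertex_authors={"p0", "p1"}, net=net,
         payload=b"txs", leader_digest=b"leader-vertex-hash", parents=[])
print(len(net.reliable), "vertices,", len(net.best_effort), "votes")  # 2 vertices, 2 votes
```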
Related papers
- Hierarchical Federated Learning with SignSGD: A Highly Communication-Efficient Approach [16.51305515824504]
Hierarchical federated learning (HFL) has emerged as a key paradigm for large-scale wireless and Internet of Things systems. Methods such as sign-based gradient descent (SignSGD) offer an essential solution, but existing theory and algorithms do not naturally extend to hierarchical settings. We introduce a scalable HFL algorithm, HierSignSGD, and provide a convergence analysis for SignSGD in a hierarchical setting.
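As background for the summary above, here is a minimal NumPy sketch of sign compression with a two-level majority vote (clients to edge servers, edge servers to cloud). The function names and hierarchy sizes are illustrative assumptions, not HierSignSGD itself.

```python
# Two-level sign-vote aggregation: clients -> edge servers -> cloud.
import numpy as np

def sign_compress(grad: np.ndarray) -> np.ndarray:
    """1-bit compression: keep only the sign of each coordinate."""
    return np.sign(grad)

def edge_aggregate(client_signs: list) -> np.ndarray:
    """Edge server: coordinate-wise majority vote over its clients' signs."""
    return np.sign(np.sum(client_signs, axis=0))

def cloud_aggregate(edge_signs: list) -> np.ndarray:
    """Cloud: second majority vote over the edge servers' votes."""
    return np.sign(np.sum(edge_signs, axis=0))

# Toy run: 2 edge servers, 3 clients each, 5-dimensional gradients.
rng = np.random.default_rng(0)
w, lr = np.zeros(5), 0.01
for step in range(3):
    edge_votes = []
    for edge in range(2):
        client_signs = [sign_compress(rng.normal(size=5)) for _ in range(3)]
        edge_votes.append(edge_aggregate(client_signs))
    w -= lr * cloud_aggregate(edge_votes)   # update with the global sign vote
print(w)
```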
arXiv Detail & Related papers (2026-02-02T17:18:03Z) - Nesterov Method for Asynchronous Pipeline Parallel Optimization [59.79227116582264]
We introduce a variant of Nesterov Accelerated Gradient (NAG) for asynchronous optimization in Pipeline Parallelism. Specifically, we modify the look-ahead step in NAG to effectively address the staleness in gradients. We theoretically prove that our approach converges at a sublinear rate in the presence of a fixed delay in gradients.
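The summary above centers on modifying NAG's look-ahead step to cope with stale gradients. The toy loop below only shows where such a modification would act: it runs standard NAG on a quadratic while feeding it look-ahead points that are a fixed number of steps old. The staleness handling here is an assumption made for illustration, not the authors' actual update rule.

```python
# Standard NAG on f(w) = 0.5 * ||w||^2, with the gradient evaluated at a
# look-ahead point that is `delay` iterations old.
import numpy as np

def grad(w):
    return w                                   # gradient of 0.5 * ||w||^2

w = np.array([5.0, -3.0])
v = np.zeros_like(w)
lr, momentum, delay = 0.05, 0.5, 2
lookahead_history = [w.copy()] * delay         # stale look-ahead points

for t in range(200):
    lookahead = w + momentum * v               # NAG look-ahead step
    lookahead_history.append(lookahead)
    stale_point = lookahead_history.pop(0)     # gradient arrives `delay` steps late
    g = grad(stale_point)
    v = momentum * v - lr * g
    w = w + v
print(np.linalg.norm(w))                       # distance to the optimum at w = 0
```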
arXiv Detail & Related papers (2025-05-02T08:23:29Z) - Boosting Asynchronous Decentralized Learning with Model Fragmentation [1.6053176639259055]
DivShare is a novel DL algorithm that achieves fast model convergence in the presence of communication stragglers. We experimentally evaluate DivShare against two state-of-the-art DL baselines, AD-PSGD and Swift. We find that, with communication stragglers, DivShare lowers time-to-accuracy by up to 3.9x compared to AD-PSGD on the CIFAR-10 dataset.
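To make the fragmentation idea concrete, here is a small NumPy sketch in which a model's parameter vector is split into fragments that can arrive from peers independently, so a straggling link delays only part of the model. The fragmenting and merging rules are illustrative assumptions, not DivShare's actual algorithm.

```python
# Fragment a parameter vector and merge only the fragments that have arrived.
import numpy as np

def fragment(params: np.ndarray, n_fragments: int) -> list:
    """Split a flat parameter vector into contiguous fragments."""
    return np.array_split(params, n_fragments)

def merge(local: np.ndarray, received: dict, n_fragments: int) -> np.ndarray:
    """Average each fragment with whatever copies have arrived so far;
    fragments still in flight (missing keys) keep their local values."""
    merged = []
    for i, part in enumerate(fragment(local, n_fragments)):
        copies = [part] + received.get(i, [])
        merged.append(np.mean(copies, axis=0))
    return np.concatenate(merged)

# Toy example: a slow link delivered only fragments 0 and 2 from a peer.
local_model = np.ones(8)
peer_parts = fragment(np.full(8, 3.0), n_fragments=4)
received = {0: [peer_parts[0]], 2: [peer_parts[2]]}
print(merge(local_model, received, n_fragments=4))
# -> fragments 0 and 2 averaged to 2.0, fragments 1 and 3 unchanged at 1.0
```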
arXiv Detail & Related papers (2024-10-16T18:03:52Z) - FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression [55.992528247880685]
Decentralized training faces significant challenges regarding system design and efficiency.
We present FusionLLM, a decentralized training system designed and implemented for training large deep neural networks (DNNs).
We show that our system and method can achieve 1.45 - 9.39x speedup compared to baseline methods while ensuring convergence.
arXiv Detail & Related papers (2024-10-16T16:13:19Z) - Asynchronous Stochastic Gradient Descent with Decoupled Backpropagation and Layer-Wise Updates [1.9241821314180372]
Asynchronous stochastic gradient descent (ASGD) methods can improve training speed, but are sensitive to delays due to both communication and throughput differences. PD-ASGD uses separate threads for the forward and backward passes, decoupling the updates and allowing for a higher ratio of forward to backward threads. Our approach yields close to state-of-the-art results while running up to 5.95x faster than synchronous data parallelism in the presence of delays.
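The summary above hinges on running the forward and backward passes in separate threads. The toy below decouples them for a linear least-squares model: a forward thread computes residuals with whatever (possibly stale) parameters are current, and a backward thread turns queued residuals into gradient updates. The threading layout and hyperparameters are illustrative assumptions rather than PD-ASGD's actual design.

```python
# Decoupled forward/backward threads for linear least squares (toy example).
import numpy as np
import queue
import threading

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 10))
true_w = rng.normal(size=10)
y = X @ true_w

w = np.zeros(10)                      # shared parameters, updated by the backward thread
work = queue.Queue(maxsize=4)         # staleness is bounded by the queue size
STEPS = 500

def forward_worker():
    """Forward pass: compute residuals with the current (possibly stale) w."""
    for _ in range(STEPS):
        idx = rng.integers(0, 256, size=32)
        Xb = X[idx]
        residual = Xb @ w - y[idx]    # may use parameters a few updates old
        work.put((Xb, residual))
    work.put(None)                    # signal the backward thread to stop

def backward_worker():
    """Backward pass: turn queued residuals into gradients and update w."""
    global w
    while True:
        item = work.get()
        if item is None:
            break
        Xb, residual = item
        grad = Xb.T @ residual / len(residual)
        w = w - 0.05 * grad

f = threading.Thread(target=forward_worker)
b = threading.Thread(target=backward_worker)
f.start(); b.start(); f.join(); b.join()
print("error:", np.linalg.norm(w - true_w))   # should be small despite stale forward passes
```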
arXiv Detail & Related papers (2024-10-08T12:32:36Z) - Adelie: Detection and prevention of Byzantine behaviour in DAG-based consensus protocols [0.0]
Recent developments in Byzantine Fault Tolerant consensus protocols have shown the DAG-based protocols to be a very promising technique.
The latest versions of DAG-based protocols such as Mysticeti and Shoal++ show that a latency comparable to that of traditional consensus protocols such as HotStuff can indeed be achieved.
This paper presents bftd, an implementation of the Adelie protocol, which demonstrates a further advance in maximum achieved TPS combined with low latency.
arXiv Detail & Related papers (2024-08-04T11:56:28Z) - OFDM-Standard Compatible SC-NOFS Waveforms for Low-Latency and Jitter-Tolerance Industrial IoT Communications [53.398544571833135]
This work proposes a spectrally efficient irregular Sinc (irSinc) shaping technique, revisiting the traditional Sinc pulse that dates back to 1924.
irSinc yields a signal with increased spectral efficiency without sacrificing error performance.
Our signal achieves faster data transmission within the same spectral bandwidth through 5G standard signal configuration.
arXiv Detail & Related papers (2024-06-07T09:20:30Z) - Client Orchestration and Cost-Efficient Joint Optimization for NOMA-Enabled Hierarchical Federated Learning [55.49099125128281]
We propose a non-orthogonal multiple access (NOMA)-enabled HFL system under semi-synchronous cloud model aggregation.
We show that the proposed scheme outperforms the considered benchmarks regarding HFL performance improvement and total cost reduction.
arXiv Detail & Related papers (2023-11-03T13:34:44Z) - Mysticeti: Reaching the Limits of Latency with Uncertified DAGs [5.328717371685882]
We introduce Mysticeti-C, the first DAG-based Byzantine consensus protocol to achieve the latency lower bound of 3 message rounds.
We extend Mysticeti-C to Mysticeti-FPC, which incorporates a fast commit path that achieves even lower latency for transferring assets.
arXiv Detail & Related papers (2023-10-23T11:40:50Z) - Fair and Efficient Distributed Edge Learning with Hybrid Multipath TCP [62.81300791178381]
The bottleneck of distributed edge learning (DEL) over wireless networks has shifted from computing to communication.
Existing TCP-based data networking schemes for DEL are application-agnostic and fail to deliver adjustments according to application layer requirements.
We develop a hybrid multipath TCP (MP TCP) for DEL by combining model-based and deep reinforcement learning (DRL) based MP TCP.
arXiv Detail & Related papers (2022-11-03T09:08:30Z) - OFedQIT: Communication-Efficient Online Federated Learning via Quantization and Intermittent Transmission [7.6058140480517356]
Online federated learning (OFL) is a promising framework to collaboratively learn a sequence of non-linear functions (or models) from distributed streaming data.
We propose a communication-efficient OFL algorithm (named OFedQIT) by means of quantization and intermittent transmission.
Our analysis reveals that OFedQIT successfully addresses the drawbacks of OFedAvg while maintaining superior learning accuracy.
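As a rough illustration of the two ingredients named above (quantizing updates and transmitting only intermittently), the sketch below accumulates local updates and uploads an unbiased stochastically quantized sum every T rounds. The quantizer and schedule are generic assumptions, not OFedQIT's exact construction.

```python
# Quantized, intermittent uploads of accumulated local updates (toy client loop).
import numpy as np

rng = np.random.default_rng(0)

def stochastic_quantize(v: np.ndarray, levels: int = 4) -> np.ndarray:
    """Unbiased stochastic quantization onto `levels` uniform levels per sign."""
    norm = np.linalg.norm(v)
    if norm == 0:
        return v
    scaled = np.abs(v) / norm * levels
    lower = np.floor(scaled)
    quantized = lower + (rng.random(v.shape) < (scaled - lower))  # round up with prob = fraction
    return np.sign(v) * quantized * norm / levels

T = 5                                    # transmit only every T rounds
accumulated = np.zeros(8)
for rnd in range(20):
    local_update = rng.normal(size=8) * 0.1      # stand-in for a local gradient step
    accumulated += local_update
    if (rnd + 1) % T == 0:
        payload = stochastic_quantize(accumulated)   # few bits per coordinate
        accumulated[:] = 0.0                         # reset after the (simulated) upload
        print(f"round {rnd + 1}: sent {payload.round(3)}")
```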
arXiv Detail & Related papers (2022-05-13T07:46:43Z) - Detached Error Feedback for Distributed SGD with Random Sparsification [98.98236187442258]
The communication bottleneck has been a critical problem in large-scale deep learning.
We propose a new detached error feedback (DEF) algorithm, which shows better convergence than error feedback for non-convex distributed problems.
We also propose DEFA to accelerate the generalization of DEF, which shows better generalization bounds than DEF.
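For context on the summary above, here is a plain error-feedback loop with random-k sparsification in NumPy. It shows the standard error-feedback mechanism that DEF builds on; the detached variant and DEFA are not reproduced here.

```python
# Error feedback with random-k sparsified updates on a toy quadratic.
import numpy as np

rng = np.random.default_rng(0)

def random_k(v: np.ndarray, k: int) -> np.ndarray:
    """Keep k randomly chosen coordinates, zero out the rest."""
    mask = np.zeros_like(v)
    idx = rng.choice(v.size, size=k, replace=False)
    mask[idx] = 1.0
    return v * mask

def grad(w):
    return w - 1.0                     # gradient of f(w) = 0.5 * ||w - 1||^2

w = np.zeros(16)
error = np.zeros(16)                   # residual kept locally by error feedback
lr, k = 0.2, 4
for step in range(300):
    corrected = lr * grad(w) + error   # add back what previous rounds failed to send
    sent = random_k(corrected, k)      # only k of 16 coordinates are communicated
    error = corrected - sent           # remember the dropped part
    w = w - sent
print(np.linalg.norm(w - 1.0))         # small: the residual memory recovers dropped coordinates
```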
arXiv Detail & Related papers (2020-04-11T03:50:59Z)