Angelfish: Consensus with Optimal Throughput and Latency Across the Leader-DAG Spectrum
- URL: http://arxiv.org/abs/2509.15847v1
- Date: Fri, 19 Sep 2025 10:30:25 GMT
- Title: Angelfish: Consensus with Optimal Throughput and Latency Across the Leader-DAG Spectrum
- Authors: Qianyu Yu, Giuliano Losa, Nibesh Shrestha, Xuechao Wang
- Abstract summary: We present Angelfish, a hybrid protocol that adapts smoothly across this design space. Angelfish lets a dynamically-adjusted subset of parties use best-effort broadcast to issue lightweight votes instead of reliably broadcasting DAG vertices. Our empirical evaluation shows that Angelfish attains state-of-the-art peak throughput while matching the latency of leader-based protocols under moderate throughput.
- Score: 3.940687402522194
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To maximize performance, many modern blockchain systems rely on eventually-synchronous, Byzantine fault-tolerant (BFT) consensus protocols. Two protocol designs have emerged in this space: protocols that minimize latency using a leader that drives both data dissemination and consensus, and protocols that maximize throughput using a separate, asynchronous data dissemination layer. Recent protocols such as Partially-Synchronous Bullshark and Sailfish combine elements of both approaches by using a DAG to enable parallel data dissemination and a leader that paces DAG formation. This improves latency while achieving state-of-the-art throughput. Yet the latency of leader-based protocols is still better under moderate loads. We present Angelfish, a hybrid protocol that adapts smoothly across this design space, from leader-based to Sailfish-like DAG-based consensus. Angelfish lets a dynamically-adjusted subset of parties use best-effort broadcast to issue lightweight votes instead of reliably broadcasting costlier DAG vertices. This reduces communication, helps lagging nodes catch up, and lowers latency in practice compared to prior DAG-based protocols. Our empirical evaluation shows that Angelfish attains state-of-the-art peak throughput while matching the latency of leader-based protocols under moderate throughput, delivering the best of both worlds.
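The mechanism described in the abstract, where a party either reliably broadcasts a full DAG vertex or merely best-effort-broadcasts a lightweight vote depending on whether it belongs to a dynamically adjusted subset, can be sketched roughly as follows. This is a minimal illustration only: the types (Vertex, Vote, Network), the round structure, and how the subset is chosen are assumptions made for exposition and are not Angelfish's actual API or implementation.

```python
# Illustrative sketch of the leader/DAG hybrid round described in the abstract.
# All names and the toy network are invented for exposition.
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Vertex:
    """Full DAG vertex: carries transaction data and must be reliably broadcast."""
    rnd: int
    author: str
    payload: bytes
    parents: List[bytes] = field(default_factory=list)


@dataclass
class Vote:
    """Lightweight vote: only acknowledges the round leader, sent best-effort."""
    rnd: int
    author: str
    leader_digest: bytes


class Network:
    """Toy in-memory stand-in for reliable vs. best-effort broadcast."""
    def __init__(self):
        self.reliable: list = []
        self.best_effort: list = []

    def reliable_broadcast(self, msg):      # costlier, certified delivery
        self.reliable.append(msg)

    def best_effort_broadcast(self, msg):   # cheap, plain point-to-point sends
        self.best_effort.append(msg)


def step(party: str, rnd: int, vertex_authors: Set[str], net: Network,
         payload: bytes, leader_digest: bytes, parents: List[bytes]) -> None:
    """One round: parties in `vertex_authors` disseminate data as DAG vertices,
    everyone else issues a lightweight vote instead."""
    if party in vertex_authors:
        net.reliable_broadcast(Vertex(rnd, party, payload, parents))
    else:
        net.best_effort_broadcast(Vote(rnd, party, leader_digest))


# Example: only two of four parties author vertices this round; in the real
# protocol the subset is adjusted dynamically.
net = Network()
for p in ["p0", "p1", "p2", "p3"]:
    step(p, rnd=7, vertex_authors={"p0", "p1"}, net=net,
         payload=b"txs", leader_digest=b"leader-vertex-hash", parents=[])
print(len(net.reliable), "vertices,", len(net.best_effort), "votes")  # 2 vertices, 2 votes
```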
Related papers
- Hierarchical Federated Learning with SignSGD: A Highly Communication-Efficient Approach [16.51305515824504]
Hierarchical federated learning (HFL) has emerged as a key paradigm for large-scale wireless and Internet of Things systems. Methods such as sign-based gradient descent (SignSGD) offer an essential solution, but existing theory and algorithms do not naturally extend to hierarchical settings. We introduce a scalable HFL algorithm, HierSignSGD, and provide a convergence analysis for SignSGD in a hierarchical setting.
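As background for the summary above, here is a minimal NumPy sketch of sign compression with a two-level majority vote (clients to edge servers, edge servers to cloud). The function names and hierarchy sizes are illustrative assumptions, not HierSignSGD itself.

```python
# Two-level sign-vote aggregation: clients -> edge servers -> cloud.
import numpy as np

def sign_compress(grad: np.ndarray) -> np.ndarray:
    """1-bit compression: keep only the sign of each coordinate."""
    return np.sign(grad)

def edge_aggregate(client_signs: list) -> np.ndarray:
    """Edge server: coordinate-wise majority vote over its clients' signs."""
    return np.sign(np.sum(client_signs, axis=0))

def cloud_aggregate(edge_signs: list) -> np.ndarray:
    """Cloud: second majority vote over the edge servers' votes."""
    return np.sign(np.sum(edge_signs, axis=0))

# Toy run: 2 edge servers, 3 clients each, 5-dimensional gradients.
rng = np.random.default_rng(0)
w, lr = np.zeros(5), 0.01
for step in range(3):
    edge_votes = []
    for edge in range(2):
        client_signs = [sign_compress(rng.normal(size=5)) for _ in range(3)]
        edge_votes.append(edge_aggregate(client_signs))
    w -= lr * cloud_aggregate(edge_votes)   # update with the global sign vote
print(w)
```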
arXiv Detail & Related papers (2026-02-02T17:18:03Z) - Nesterov Method for Asynchronous Pipeline Parallel Optimization [59.79227116582264]
We introduce a variant of Nesterov Accelerated Gradient (NAG) for asynchronous optimization in Pipeline Parallelism. Specifically, we modify the look-ahead step in NAG to effectively address the staleness in gradients. We theoretically prove that our approach converges at a sublinear rate in the presence of a fixed delay in gradients.
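The summary above centers on modifying NAG's look-ahead step to cope with stale gradients. The toy loop below only shows where such a modification would act: it runs standard NAG on a quadratic while feeding it look-ahead points that are a fixed number of steps old. The staleness handling here is an assumption made for illustration, not the authors' actual update rule.

```python
# Standard NAG on f(w) = 0.5 * ||w||^2, with the gradient evaluated at a
# look-ahead point that is `delay` iterations old.
import numpy as np

def grad(w):
    return w                                   # gradient of 0.5 * ||w||^2

w = np.array([5.0, -3.0])
v = np.zeros_like(w)
lr, momentum, delay = 0.05, 0.5, 2
lookahead_history = [w.copy()] * delay         # stale look-ahead points

for t in range(200):
    lookahead = w + momentum * v               # NAG look-ahead step
    lookahead_history.append(lookahead)
    stale_point = lookahead_history.pop(0)     # gradient arrives `delay` steps late
    g = grad(stale_point)
    v = momentum * v - lr * g
    w = w + v
print(np.linalg.norm(w))                       # distance to the optimum at w = 0
```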
arXiv Detail & Related papers (2025-05-02T08:23:29Z) - Boosting Asynchronous Decentralized Learning with Model Fragmentation [1.6053176639259055]
DivShare is a novel DL algorithm that achieves fast model convergence in the presence of communication stragglers. We experimentally evaluate DivShare against two state-of-the-art DL baselines, AD-PSGD and Swift. We find that, with communication stragglers, DivShare lowers time-to-accuracy by up to 3.9x compared to AD-PSGD on the CIFAR-10 dataset.
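To make the fragmentation idea concrete, here is a small NumPy sketch in which a model's parameter vector is split into fragments that can arrive from peers independently, so a straggling link delays only part of the model. The fragmenting and merging rules are illustrative assumptions, not DivShare's actual algorithm.

```python
# Fragment a parameter vector and merge only the fragments that have arrived.
import numpy as np

def fragment(params: np.ndarray, n_fragments: int) -> list:
    """Split a flat parameter vector into contiguous fragments."""
    return np.array_split(params, n_fragments)

def merge(local: np.ndarray, received: dict, n_fragments: int) -> np.ndarray:
    """Average each fragment with whatever copies have arrived so far;
    fragments still in flight (missing keys) keep their local values."""
    merged = []
    for i, part in enumerate(fragment(local, n_fragments)):
        copies = [part] + received.get(i, [])
        merged.append(np.mean(copies, axis=0))
    return np.concatenate(merged)

# Toy example: a slow link delivered only fragments 0 and 2 from a peer.
local_model = np.ones(8)
peer_parts = fragment(np.full(8, 3.0), n_fragments=4)
received = {0: [peer_parts[0]], 2: [peer_parts[2]]}
print(merge(local_model, received, n_fragments=4))
# -> fragments 0 and 2 averaged to 2.0, fragments 1 and 3 unchanged at 1.0
```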
arXiv Detail & Related papers (2024-10-16T18:03:52Z) - FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression [55.992528247880685]
Decentralized training faces significant challenges regarding system design and efficiency.
We present FusionLLM, a decentralized training system designed and implemented for training large deep neural networks (DNNs).
We show that our system and method can achieve 1.45 - 9.39x speedup compared to baseline methods while ensuring convergence.
arXiv Detail & Related papers (2024-10-16T16:13:19Z) - Asynchronous Stochastic Gradient Descent with Decoupled Backpropagation and Layer-Wise Updates [1.9241821314180372]
Asynchronous stochastic gradient descent (ASGD) methods can improve training speed, but are sensitive to delays due to both communication and throughput differences. PD-ASGD uses separate threads for the forward and backward passes, decoupling the updates and allowing for a higher ratio of forward to backward threads. Our approach yields close to state-of-the-art results while running up to 5.95x faster than synchronous data parallelism in the presence of delays.
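The summary above hinges on running the forward and backward passes in separate threads. The toy below decouples them for a linear least-squares model: a forward thread computes residuals with whatever (possibly stale) parameters are current, and a backward thread turns queued residuals into gradient updates. The threading layout and hyperparameters are illustrative assumptions rather than PD-ASGD's actual design.

```python
# Decoupled forward/backward threads for linear least squares (toy example).
import numpy as np
import queue
import threading

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 10))
true_w = rng.normal(size=10)
y = X @ true_w

w = np.zeros(10)                      # shared parameters, updated by the backward thread
work = queue.Queue(maxsize=4)         # staleness is bounded by the queue size
STEPS = 500

def forward_worker():
    """Forward pass: compute residuals with the current (possibly stale) w."""
    for _ in range(STEPS):
        idx = rng.integers(0, 256, size=32)
        Xb = X[idx]
        residual = Xb @ w - y[idx]    # may use parameters a few updates old
        work.put((Xb, residual))
    work.put(None)                    # signal the backward thread to stop

def backward_worker():
    """Backward pass: turn queued residuals into gradients and update w."""
    global w
    while True:
        item = work.get()
        if item is None:
            break
        Xb, residual = item
        grad = Xb.T @ residual / len(residual)
        w = w - 0.05 * grad

f = threading.Thread(target=forward_worker)
b = threading.Thread(target=backward_worker)
f.start(); b.start(); f.join(); b.join()
print("error:", np.linalg.norm(w - true_w))   # should be small despite stale forward passes
```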
arXiv Detail & Related papers (2024-10-08T12:32:36Z) - Adelie: Detection and prevention of Byzantine behaviour in DAG-based consensus protocols [0.0]
Recent developments in Byzantine Fault Tolerant consensus protocols have shown the DAG-based protocols to be a very promising technique.
The latest versions of DAG-based protocols such as Mysticeti and Shoal++ show that a latency comparable to that of traditional consensus protocols such as HotStuff can indeed be achieved.
This paper presents bftd, an implementation of the Adelie protocol, which demonstrates a further advance in maximum achieved TPS combined with low latency.
arXiv Detail & Related papers (2024-08-04T11:56:28Z) - OFDM-Standard Compatible SC-NOFS Waveforms for Low-Latency and Jitter-Tolerance Industrial IoT Communications [53.398544571833135]
This work proposes a spectrally efficient irregular Sinc (irSinc) shaping technique, revisiting the traditional Sinc pulse that dates back to 1924.
irSinc yields a signal with increased spectral efficiency without sacrificing error performance.
Our signal achieves faster data transmission within the same spectral bandwidth through 5G standard signal configuration.
arXiv Detail & Related papers (2024-06-07T09:20:30Z) - Client Orchestration and Cost-Efficient Joint Optimization for NOMA-Enabled Hierarchical Federated Learning [55.49099125128281]
We propose a non-orthogonal multiple access (NOMA)-enabled HFL system under semi-synchronous cloud model aggregation.
We show that the proposed scheme outperforms the considered benchmarks regarding HFL performance improvement and total cost reduction.
arXiv Detail & Related papers (2023-11-03T13:34:44Z) - Mysticeti: Reaching the Limits of Latency with Uncertified DAGs [5.328717371685882]
We introduce Mysticeti-C, the first DAG-based Byzantine consensus protocol to achieve the latency lower bound of 3 message rounds.
We extend Mysticeti-C to Mysticeti-FPC, which incorporates a fast commit path that achieves even lower latency for transferring assets.
arXiv Detail & Related papers (2023-10-23T11:40:50Z) - Fair and Efficient Distributed Edge Learning with Hybrid Multipath TCP [62.81300791178381]
The bottleneck of distributed edge learning (DEL) over wireless networks has shifted from computing to communication.
Existing TCP-based data networking schemes for DEL are application-agnostic and fail to deliver adjustments according to application layer requirements.
We develop a hybrid multipath TCP (MP TCP) for DEL by combining model-based and deep reinforcement learning (DRL) based MP TCP.
arXiv Detail & Related papers (2022-11-03T09:08:30Z) - OFedQIT: Communication-Efficient Online Federated Learning via Quantization and Intermittent Transmission [7.6058140480517356]
Online federated learning (OFL) is a promising framework to collaboratively learn a sequence of non-linear functions (or models) from distributed streaming data.
We propose a communication-efficient OFL algorithm (named OFedQIT) by means of quantization and intermittent transmission.
Our analysis reveals that OFedQIT successfully addresses the drawbacks of OFedAvg while maintaining superior learning accuracy.
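As a rough illustration of the two ingredients named above (quantizing updates and transmitting only intermittently), the sketch below accumulates local updates and uploads an unbiased stochastically quantized sum every T rounds. The quantizer and schedule are generic assumptions, not OFedQIT's exact construction.

```python
# Quantized, intermittent uploads of accumulated local updates (toy client loop).
import numpy as np

rng = np.random.default_rng(0)

def stochastic_quantize(v: np.ndarray, levels: int = 4) -> np.ndarray:
    """Unbiased stochastic quantization onto `levels` uniform levels per sign."""
    norm = np.linalg.norm(v)
    if norm == 0:
        return v
    scaled = np.abs(v) / norm * levels
    lower = np.floor(scaled)
    quantized = lower + (rng.random(v.shape) < (scaled - lower))  # round up with prob = fraction
    return np.sign(v) * quantized * norm / levels

T = 5                                    # transmit only every T rounds
accumulated = np.zeros(8)
for rnd in range(20):
    local_update = rng.normal(size=8) * 0.1      # stand-in for a local gradient step
    accumulated += local_update
    if (rnd + 1) % T == 0:
        payload = stochastic_quantize(accumulated)   # few bits per coordinate
        accumulated[:] = 0.0                         # reset after the (simulated) upload
        print(f"round {rnd + 1}: sent {payload.round(3)}")
```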
arXiv Detail & Related papers (2022-05-13T07:46:43Z) - Detached Error Feedback for Distributed SGD with Random Sparsification [98.98236187442258]
The communication bottleneck has been a critical problem in large-scale deep learning.
We propose a new detached error feedback (DEF) algorithm, which shows better convergence than error feedback for non-convex distributed problems.
We also propose DEFA to accelerate the generalization of DEF, which shows better generalization bounds than DEF.
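For context on the summary above, here is a plain error-feedback loop with random-k sparsification in NumPy. It shows the standard error-feedback mechanism that DEF builds on; the detached variant and DEFA are not reproduced here.

```python
# Error feedback with random-k sparsified updates on a toy quadratic.
import numpy as np

rng = np.random.default_rng(0)

def random_k(v: np.ndarray, k: int) -> np.ndarray:
    """Keep k randomly chosen coordinates, zero out the rest."""
    mask = np.zeros_like(v)
    idx = rng.choice(v.size, size=k, replace=False)
    mask[idx] = 1.0
    return v * mask

def grad(w):
    return w - 1.0                     # gradient of f(w) = 0.5 * ||w - 1||^2

w = np.zeros(16)
error = np.zeros(16)                   # residual kept locally by error feedback
lr, k = 0.2, 4
for step in range(300):
    corrected = lr * grad(w) + error   # add back what previous rounds failed to send
    sent = random_k(corrected, k)      # only k of 16 coordinates are communicated
    error = corrected - sent           # remember the dropped part
    w = w - sent
print(np.linalg.norm(w - 1.0))         # small: the residual memory recovers dropped coordinates
```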
arXiv Detail & Related papers (2020-04-11T03:50:59Z)