Enabling Replicated State Machines to Communicate Efficiently

Replicated state machines (RSMs) cannot effectively communicate today as there is no formal framework or efficient protocol to do so. To address this issue, we introduce a new primitive, the Cross-Cluster Consistent Broadcast (C3B) and present SCROOGE, a practical C3B implementation. SCROOGE draws inspiration from networking and TCP to allow two RSMs to communicate with constant metadata overhead in the failure-free case and minimal number of message resends in the case of failures. SCROOGE is flexible and allows both crash fault tolerant and Byzantine fault tolerant protocols to communicate. At the heart of SCROOGE’s good performance and generality lies a novel technique we call QUACKs (quorum acknowledgements) that allow nodes in each RSM to precisely determine when messages have definitely been received, or likely lost. Our results are promising: we obtain up to 24× better performance than prior solutions.


Micah Murray, Suyash Gupta, Ethan Xu, Chawinphat Tankuranand, Junseo Yoo, Natacha Crooks, Manos Kapritsos