Dissertation Talk: Algebraic Approaches to Distributed Data Systems – Conor Power

Speaker: Conor Power
Advisor: Joseph Hellerstein
Date: Thursday, May 15, 2025
Time: 10:00AM – 11:00AM PDT

Location: Woz Lounge

Abstract:
With the rise of cloud computing, software systems have become increasingly distributed. Distributed systems offer myriad benefits such as scalability, availability, and fault tolerance. However, they also make it harder for programmers to ensure correctness and to hide non-determinism from the end user. To address this challenge of programming cloud-scale systems, the Hydro project at Berkeley explores bringing declarative programming to the distributed systems space. Declarative programming has had enormous success in the field of databases in the form of SQL. Its benefit is that it allows developers to specify their goals at a high level and leave complex implementation decisions to the database system.

In this talk, we explore the marriage of these two worlds: distributed systems programming and declarative database systems. In pursuit of this marriage, we study independent trends toward algebraic models in distributed systems and database systems. In particular, we study four lines of research on algebraic properties for distributed data systems: conflict-free replicated data types (CRDTs), algebraic models of incremental view maintenance (IVM), parallel database aggregates, and the CALM Theorem. While these topics have been studied under different formalisms in different research communities, we are able to build bridges between them. Specifically, we bring the system model and mathematical model of CRDTs, studied in the distributed systems and programming languages communities, to the other three topics, which have been studied entirely within the database research community. The result is a foundation on which to support the benefits of declarativity in distributed systems programming.