Sky Seminar Series: Atul Adya (Databricks) – Pulling distributed caches out of the Dark Ages

Speaker: Atul Adya
Location: Soda 510
Date: February 9, 2024
Time: 12-1pm PST

Title:
Pulling distributed caches out of the Dark Ages

Abstract:

Caching is a fundamental building block in computer systems. Distributed caching in data centers, however, is far from a solved problem. Even the most advanced systems today suffer from myriad issues, including persistent hotspotting due to popular keys, unavailability and state loss during restarts, and inefficiency from polling the database to keep data fresh. In this talk, I hope to convince you that this need not be the case, and that low latency, low cost, and high scale are all achievable in a reliable manner. I will present two critical components, missing in most data center environments, that make this possible: an auto-sharder, Dicer, and a cache freshness system, Snappy. Together, these abstractions allow application writers to build remote caches and stateful services that are resilient to failures and highly performant. Finally, I will present several important unsolved problems in distributed caching where deep research is needed.

Bio:
Atul Adya is currently a software engineer at Databricks. Previously, at Google, he was responsible for the distributed caching infrastructure systems used across the company, which handle billions of requests per second. Prior to Google, he worked at Microsoft Research and in Microsoft product groups in the areas of distributed file systems, wireless diagnostics, object-relational mappings, and distributed caching. He received his S.M. and Ph.D. degrees from the Massachusetts Institute of Technology, where he worked in the area of transaction management and weak isolation levels.