Speaker: Raghu Ramakrishnan
Location: Soda 510
Date: April 19, 2024
Time: 12-1pm PT
Title:
Open Lakes, Converged Platforms: Data Foundations for the Era of AI
Abstract:
Web companies were at the forefront of observing core processes (search ranking, ad placement, personalized browsing) through telemetry, and optimizing these processes through the lens of that data. This, together with the emergence of cloud computing, led to a revolution in scale-out data platforms. The pattern of aggregating all relevant data and using insights to drive decisions, well-known in the database world, is now the key to unlocking the power of Gen AI for essentially every enterprise, whether commercial, social, or scientific. We have seen the evolution of data marts, warehouses, and lakes as models for implementing this pattern. All aim to aggregate data and support specific tools; we are at a point where ALL tools must be brought to bear easily, without unnecessary data copying or format wrestling.
To this end, we must make our data aggregation, or lake, truly open. We must simplify the mechanics of data aggregation, and bring together the wide range of complex data platforms, if we are to make all relevant data accessible at scale. And we must do it in a way that supports governance across the entire data estate. The emergence of Gen AI has raised the stakes. The value of data has never been higher. The potential for democratizing access has never been higher. In this talk, I will discuss how the data landscape is changing, from the perspective of data platforms, data governance, and of course, Gen AI.
Bio:
Raghu Ramakrishnan is CTO for Data and a Technical Fellow at Microsoft. Previously, he was a professor at the University of Wisconsin-Madison, where he wrote the widely used text “Database Management Systems” with Johannes Gehrke, and Chief Scientist at Yahoo! He has received the Innovation Award from both ACM SIGMOD and SIGKDD, multiple 10-year paper awards, and the ACM SIGMOD Contributions Award.