Sky Computing
Towards Utility Computing for the Cloud

News
October 23, 2023
Professors Alvin Cheung, Joseph Gonzalez, and Joseph Hellerstein win awards at VLDB Conference 2023
CS Associate Professor Alvin Cheung has won the 2023 Very Large Data Bases (VLDB) Early Career Research Contribution Award. The award, which includes a $2,000 prize, recognizes researchers who have made a significant impact through a specific contribution to the field since completing their Ph.D.
August 21, 2023
Come see us at SOSP ’23!
Papers by Micah Murray, Matei Zaharia, Woosuk Kwon, Zhuohan Li, Siyuan Zhuang, Lianmin Zheng, Joseph Gonzalez, Hao Zhang, Ion Stoica, and Emma Dauterman have been accepted at the 29th ACM Symposium on Operating Systems Principles.
August 18, 2023
‘Unlocking the new next frontier’: UC Berkeley researchers develop innovative AI ‘Gorilla’
Researchers from the Sky Computing lab and the Berkeley AI Research, or BAIR, recently released Gorilla, a large language model, or LLM, designed to revolutionize the way AI algorithms function. The researchers behind Gorilla include Shishir Patil, Tianjun Zhang, Prof. Joseph Gonzalez, and Xin Wang.
July 7, 2023
Come see us at OSDI ’23!
Ion Stoica will be giving the Keynote Address on Sky Computing at the 17th USENIX Symposium on Operating Systems Design and Implementation. Emma Dauterman, Siyuan Zhuang, Audrey Cheng, Romil Bhardwaj, and Zhuohan Li will be presenting at the technical sessions.
April 20, 2023
Ion Stoica and Joseph Gonzalez receive AWS AI Amazon Research Award on “A Unified Platform for Training and Serving Large Models”
Amazon Research Awards (ARA) provides unrestricted funds and AWS Promotional Credits to academic researchers investigating various research topics in multiple disciplines. Awardees, who represent 54 universities in 14 countries, have access to Amazon public datasets, along with AWS AI/ML services and tools.
Events
December 13, 2023
Database Seminar: Nikolaos (Nikos) Tziavelis (Northeastern University) – “Efficient Ranked Access over Joins”
Join queries over multiple tables can produce a huge output that is infeasible to compute. Even when it is feasible, it is often not efficient when users have particular preferences over the answers in the output and are interested in accessing only a small subset according to that ranking; either t…
December 8, 2023
Sky Systems Seminar: Zhihao Jia (CMU) – Building Systems for Fast, Efficient, and Affordable Large Language Models
The high computational and memory requirements of large language models (LLMs) make it challenging to train and serve them cheaply and efficiently. For example, serving a LLAMA-2-70B on NVIDIA A100 GPUs can only utilize 2% of the available compute resources. In this talk, I will present two systems …
December 6, 2023
Sky Security Seminar: Alin Tomescu (Aptos Labs) – UTT: Sensibly-Anonymous Decentralized Payments without zkSNARKs
We present UTT, a system for decentralized e-cash with accountable privacy….
December 6, 2023
Database Seminar: Mira Mezini (Technical University Darmstadt) – “Programming Abstractions for Safe and Secure Local-First Software”
Today’s computing infrastructure is massively distributed across back-end geo-replicated clouds and millions of increasingly powerful front-end devices. …
Publications
October 2023
Efficient Memory Management for Large Language Model Serving with PagedAttention.
October 2023
The Story of GraphLab – From Scaling Machine Learning to Shaping Graph Systems Research.
October 2023
Multiversion Hindsight Logging for Continuous Training.
August 2023
Energy-based Predictive Representations for Partially Observed Reinforcement Learning.
August 2023
Optimizing the cloud? Don’t train models. Build oracles!
August 2023
Efficient Data Sharing across Trust Domains.
August 2023
HOLMES: Efficient Distribution Testing for Secure Collaborative Learning.
August 2023
Test Accuracy vs. Generalization Gap: Model Selection in NLP without Accessing Training or Testing Data.

Ion Stoica – Featured Projects
SkyPilot
To comply with the increasing number of government regulations about data placement and processing, and to protect themselves against major cloud outages, many users want the ability to easily migrate their workloads between clouds. We propose doing so not by imposing uniform and comprehensive standards, but by creating a fine-grained two-sided market via intercloud brokers. SkyPilot (https://github.com/skypilot-org/skypilot) is an intercloud broker that treats the cloud ecosystem not just as a collection of individual and largely incompatible clouds but as a more integrated Sky of Computing. SkyPilot enables users to run Machine Learning and Data Science batch jobs seamlessly on any cloud, reduce cloud costs substantially, tap into best-in-class hardware on different clouds, and enjoy higher resource availability.

Skyplane
Cloud applications are increasingly distributing data across multiple regions and cloud providers in response to privacy regulations, to take advantage of specialized hardware, and to prevent vendor lock-in. Unfortunately, wide-area bulk data transfers are often slow and expensive due to egress fees. This work aims to reduce both the latency and the cost of inter-cloud bulk transfer by using a variety of techniques, including overlay routing, multiple instances, multiple TCP connections, and taking advantage of different network tiers. Together, these techniques allow Skyplane (https://skyplane.org/en/latest/) to significantly improve object transfer throughput and lower the costs.

Natacha Crooks – Featured Project
Basil
Basil (https://arxiv.org/abs/2109.12443) explores the design of SQL databases with high integrity and decentralized trust. How can traditional functionality like ACID transactions and SQL queries be efficiently implemented when trust is decentralized among n distinct parties, a subset of which can misbehave?

Joseph Gonzalez – Featured Project
ralf
We are exploring the design of feature stores: the emerging class of data systems that bridge model development, training, and inference. Feature stores compute, store, and manage the data and derived features at the heart of ML-powered applications. Ralf (https://github.com/feature-store/ralf) is a feature store for rapidly changing data. Ralf incrementally propagates raw data changes to derived feature tables, which are queryable by downstream applications such as model training and inference.

Raluca Ada Popa – Featured Project
MC²
MC² (https://mc2-project.github.io/) is a platform for running secure analytics and machine learning on encrypted data. With MC², organizations can safely upload their confidential data to the cloud in encrypted form and securely compute analytics and machine learning without exposing the unencrypted data to the cloud provider. MC² also enables secure collaboration among multiple organizations, where the data owners can use the platform to jointly analyze their collective data without revealing their individual data to each other.

Koushik Sen – Featured Project
FuzzFactory
FuzzFactory (https://github.com/rohanpadhye/FuzzFactory) is a domain-specific fuzz testing tool that generalizes coverage-guided fuzzing to domain-specific testing goals. FuzzFactory allows users to guide the fuzzer’s search process without having to modify the core search algorithm.
Sky Computing Story
Berkeley’s computer science division has an ongoing tradition of 5-year collaborative research labs. Recent labs included the AMPLab (ended in 2016) and the RISELab. These labs have had significant impact in both academia and industry. Past labs published their research at top conferences in systems, databases, and machine learning. On the industrial side, AMPLab and RISELab fostered several successful startups (Databricks, Opaque, Ponder, and Anyscale, to name a few). We are excited to announce the Berkeley Sky Computing Lab, where we will strive to make cloud computing a true commodity.
Context
The Sky Computing Lab represents the next chapter of data-intensive systems research at Berkeley. Recent years have seen the explosion of cloud computing. Applications are moving their data and computation to the cloud; on-premise services are dying. In doing so, companies have to make difficult choices between the myriad of cloud providers, each with different services or hardware. Lock-in, whether through artificial migration costs, legal constraints, or engineering baggage, is real. In the Sky Computing Lab, we will leverage distributed systems, programming languages, security, and machine learning to decouple the services that a company wants to implement from the choice of a specific cloud. Much like the Internet today, cloud computing should be an undifferentiated commodity. Applications should run seamlessly on any or multiple clouds.

Mission
Our mission in the Sky Computing Lab is to transform the cloud into an undifferentiated commodity and ease application burden. As in previous labs, we’re all in, working on everything from basic research to software development, all in the Berkeley tradition of open publication and open source software. Our founding team consists of experts in distributed systems, machine learning, security and programming languages. We’ll use this space to lay out our ideas and progress as we go.
Commitment to Diversity
Sky Computing is guided by Berkeley’s Principles of Community and is committed to providing a safe and caring research environment for every member of our community. We believe that a diverse student body, faculty, and staff are essential to the open exchange of ideas that Sky Computing Lab is founded on.

Our head is in the cloud. We are heading for the SKY.
News
October 23, 2023
Professors Alvin Cheung, Joseph Gonzalez, and Joseph Hellerstein win awards at VLDB Conference 2023
CS Associate Professor Alvin Cheung has won the 2023 Very Large Data Bases (VLDB) Early Career Research Contribution Award. The award, which includes a $2,000 prize, recognizes researchers who have made a significant impact through a specific contribution to the field since completing their Ph.D.
Separately, a paper by CS Professors Joseph Gonzalez and Joseph Hellerstein, co-authored by Yucheng Low, Aapo Kyrola, Danny Bickson, and Carlos Guestrin, received the 2023 VLDB Test of Time Award. Their paper, “Distributed GraphLab: A framework for machine learning in the cloud,” was published at VLDB 2012. The authors were nominated for this award by the research community, and the winner was selected based on the paper’s impact through its consequent products and services, and follow-through research by the community.
August 21, 2023
Come see us at SOSP ’23!
Papers by Micah Murray, Matei Zaharia, Woosuk Kwon, Zhuohan Li, Siyuan Zhuang, Lianmin Zheng, Joseph Gonzalez, Hao Zhang, Ion Stoica, and Emma Dauterman have been accepted at the 29th ACM Symposium on Operating Systems Principles:
Cornflakes: Zero-Copy Serialization for Microsecond-Scale Networking by Deepti Raghavan (Stanford University), Shreya Ravi (Stanford University), Gina Yuan (Stanford University), Pratiksha Thaker (Carnegie Mellon University), Sanjari Srivastava (Stanford University), Micah Murray (UC Berkeley), Pedro Henrique Penna (Microsoft Research), Amy Ousterhout (UC San Diego), Philip Levis (Stanford University and Google), Matei Zaharia (UC Berkeley) and Irene Zhang (Microsoft Research)
Efficient Memory Management for Large Language Model Serving with PagedAttention by Woosuk Kwon (UC Berkeley), Zhuohan Li (UC Berkeley), Siyuan Zhuang (UC Berkeley), Ying Sheng (Stanford University), Lianmin Zheng (UC Berkeley), Cody Hao Yu (Independent Researcher), Joseph Gonzalez (UC Berkeley), Hao Zhang (UC Berkeley and UC San Diego) and Ion Stoica (UC Berkeley)
Private Web Search with Tiptoe by Alexandra Henzinger (MIT), Emma Dauterman (UC Berkeley), Henry Corrigan-Gibbs (MIT) and Nickolai Zeldovich (MIT)
August 18, 2023
‘Unlocking the new next frontier’: UC Berkeley researchers develop innovative AI ‘Gorilla’
Researchers from the Sky Computing lab and the Berkeley AI Research, or BAIR, recently released Gorilla, a large language model, or LLM, designed to revolutionize the way AI algorithms function. The researchers behind Gorilla include Shishir Patil, Tianjun Zhang, Prof. Joseph Gonzalez, and Xin Wang.
The full story appeared in the Daily Californian.