Sky Computing

Towards Utility Computing for the Cloud

The Sky Above the Clouds

Video: Sky Computing

DARE: Diversifying Access to Research in Engineering

News

March 6, 2024

Natacha Crooks wins IEEE TCDE Rising Star Award

Professor Natacha Crooks awarded the IEEE TCDE Rising Star Award for contributions to distributed data management, and its applications to blockchain technology, security, and cloud computing.

February 20, 2024

Prof. Matei Zaharia and his students and collaborators talk about compound AI systems and their research on them

AI caught everyone’s attention in 2023 with Large Language Models (LLMs) that can be instructed to perform general tasks, such as translation or coding, just by prompting. This naturally led to an intense focus on models as the primary ingredient in AI application development, with everyone wondering what capabilities new LLMs will bring. As more developers begin to build using LLMs, however, we believe that this focus is rapidly changing: state-of-the-art AI results are increasingly obtained by compound systems with multiple components, not just monolithic models.

February 7, 2024

Ion Stoica elected into the National Academy of Engineering

The National Academy of Engineering (NAE) announced today that three UC Berkeley faculty members — Arpad Horvath, Ravi Prasher and Ion Stoica — have been elected to its ranks.

February 6, 2024

Salk Institute scientists scale brain research on Google Cloud with SkyPilot

The cloud, with its large numbers of the latest processors and scalable storage systems, is becoming indispensable to modern biomedical research organizations, who use it to generate and analyze vast amounts of data. However, research by its very nature is not a linear process, and so cloud resources must be flexible enough to rapidly scale up and down in response to changing demands. Additionally, research always has limited funding, and so cost- and time-efficiency are critical to achieving research insights.

October 23, 2023

Professors Alvin Cheung, Joseph Gonzalez, and Joseph Hellerstein win awards at VLDB Conference 2023

CS Associate Professor Alvin Cheung has won the 2023 Very Large Data Bases (VLDB) Early Career Research Contribution Award. The award, which includes a $2,000 prize, recognizes researchers who have made a significant impact through a specific contribution to the field since completing their Ph.D.

Events

April 26, 2024

Sky Seminar Series: Ken Goldberg (UC Berkeley) – Big Robot Data: Streamlining Data Collection for Large Robot Action Models

The Robot Transformer (RT-X) project, led by Google with 21 University partners, has an ambitious goal: creating Generative Pretrained Transformers (GPT) aka Large Robot Action Models for robotics: https://robotics-transformer-x.github.io/ Collaborators are sharing terabytes of video and robot comma…

April 19, 2024

Sky Seminar Series: Raghu Ramakrishnan (Microsoft) – Open Lakes, Converged Platforms: Data Foundations for the Era of AI

Web companies were at the forefront of observing core processes (search ranking, ad placement, personalized browsing) through telemetry, and optimizing these processes through the lens of that data. This, together with the emergence of cloud computing, led to a revolution in scale-out data platforms…

April 12, 2024

Sky Seminar Series: Abhinav Venigalla (Databricks) – Systems and Optimizations for training MoEs at Scale

DBRX is a new Mixture-of-Experts (MoE) open model trained from scratch in ~3 months by the Databricks Mosaic research team. This talk will cover how DBRX was built using both open-source tools (StreamingDataset, Composer, Megablocks, etc.) and internal Databricks infrastructure. It will cover system…

March 22, 2024

Sky Seminar Series: Adrian Cockcroft (AWS) – Thanks for the Memory – LLM training as architectures change

I’m interested in the workload that LLM training presents and have figured out a bit about what it looks like and how people optimize with current generation (H100) systems. The new model (GH200 and GB200) is using much larger single system memory domains in the hundreds of Terabytes and that is b…

Publications

April 2024

ZKML: An Optimizing System for ML Inference in Zero-Knowledge Proofs.

April 2024

Can’t Be Late: Optimizing Spot Instance Savings under Deadlines.

April 2024

Cloudcast: High-Throughput, Cost-Aware Overlay Multicast in the Cloud.

April 2024

Composing MPC With LQR and Neural Network for Amortized Efficiency and Stable Control.

April 2024

Optimizing the cloud? Don’t train models. Build oracles!

March 2024

RAFT: Adapting Language Model to Domain Specific RAG.

March 2024

depyf: Open the Opaque Box of PyTorch Compiler for Machine Learning Researchers.

March 2024

LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code.

Ion Stoica – Featured Projects

SkyPilot

To comply with the increasing number of government regulations about data placement and processing, and to protect themselves against major cloud outages, many users want the ability to easily migrate their workloads between clouds. We propose doing so not by imposing uniform and comprehensive standards, but by creating a fine-grained two-sided market via intercloud brokers. SkyPilot is an intercloud broker that treats the cloud ecosystem not just as a collection of individual and largely incompatible clouds but as a more integrated Sky of Computing. SkyPilot enables users to run Machine Learning and Data Science batch jobs seamlessly on any cloud, reduce cloud costs substantially, tap into best-in-class hardware on different clouds, and enjoy higher resource availability.

Skyplane

Cloud applications are increasingly distributing data across multiple regions and cloud providers in response to privacy regulations, to take advantage of specialized hardware, and to prevent vendor lock-in. Unfortunately, wide-area bulk data transfers are often slow and expensive due to egress fees. This work aims to reduce both the latency and the cost of inter-cloud bulk transfer by using a variety of techniques, including overlay routing, multiple instances, multiple TCP connections, and taking advantage of different network tiers. Together, these techniques allow Skyplane to significantly improve object transfer throughput and lower the costs.
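As a rough illustration of the overlay-routing idea, the sketch below picks the cheapest path between two regions, either direct or through one relay region, subject to a minimum throughput. It is not Skyplane's planner: the function name `best_path`, the region labels, and every throughput and price figure are hypothetical values chosen only for the example.

```python
def best_path(src, dst, links, min_gbps):
    """Pick the cheapest direct or one-relay path that meets min_gbps.

    links maps (region_a, region_b) -> (throughput_gbps, egress_usd_per_gb).
    Returns (usd_per_gb, path) or None if no candidate is fast enough.
    """
    candidates = []

    # Candidate 1: the direct link, if one exists.
    if (src, dst) in links:
        gbps, price = links[(src, dst)]
        candidates.append((price, gbps, [src, dst]))

    # Candidate 2..n: one-hop overlay paths through a relay region.
    relays = {b for (a, b) in links if a == src and b != dst}
    for relay in relays:
        if (relay, dst) in links:
            gbps_1, price_1 = links[(src, relay)]
            gbps_2, price_2 = links[(relay, dst)]
            # Egress is paid on both hops; throughput is limited by the slower hop.
            candidates.append(
                (price_1 + price_2, min(gbps_1, gbps_2), [src, relay, dst])
            )

    feasible = [(price, path) for price, gbps, path in candidates if gbps >= min_gbps]
    return min(feasible) if feasible else None


# Hypothetical regions and figures, for illustration only.
links = {
    ("A", "B"): (1.0, 0.09),  # slow direct link
    ("A", "C"): (5.0, 0.02),  # fast, cheap hop to a relay
    ("C", "B"): (5.0, 0.09),  # fast hop from the relay
}
slow_ok = best_path("A", "B", links, min_gbps=0.5)
fast_needed = best_path("A", "B", links, min_gbps=4.0)
```

With these made-up numbers the direct link wins when a low throughput is acceptable, while a higher throughput requirement forces the transfer through the relay at a higher per-GB price, which is the cost and speed trade-off the description refers to.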

Natacha Crooks – Featured Project

Basil

Basil explores the design of SQL databases with high integrity and decentralized trust. How can traditional functionality like ACID transactions and SQL queries be efficiently implemented when trust is decentralized among n distinct parties, a subset of which can misbehave?

Joseph Gonzalez – Featured Project

Gorilla

The Gorilla project is designed to connect large language models (LLMs) with a wide range of services and applications exposed through APIs. Imagine if ChatGPT could interact with thousands of services, ranging from Instagram and Doordash to tools like Google Calendar and Stripe, to help you accomplish tasks. This may be how we interact with computers and even the web in the future. Gorilla is an LLM that we train using a concept we call retriever-aware training (RAT), which picks the right API to perform a task that a user can specify in natural language. Gorilla also introduces an Abstract Syntax Tree (AST) based sub-tree matching algorithm, which for the first time allows us to measure hallucination of LLMs!
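To give a feel for what AST-based matching of API calls can look like, here is a minimal sketch using Python's standard `ast` module. It is an assumption-laden simplification, not Gorilla's evaluation code: the helper names `call_signature` and `matches_reference` are hypothetical, and the rule used here (same function name, and every keyword argument of the reference present with the same value) is only one plausible way to define a match.

```python
import ast


def call_signature(node):
    """Return (dotted function name, {keyword: source text of value}) for an ast.Call."""
    name = ast.unparse(node.func)
    kwargs = {kw.arg: ast.unparse(kw.value) for kw in node.keywords if kw.arg}
    return name, kwargs


def matches_reference(generated_code, reference_call):
    """True if some call in generated_code has the reference call's function name
    and contains all of the reference call's keyword arguments with equal values."""
    ref_node = ast.parse(reference_call, mode="eval").body
    ref_name, ref_kwargs = call_signature(ref_node)
    # Walk every node of the generated program and compare each call site.
    for node in ast.walk(ast.parse(generated_code)):
        if isinstance(node, ast.Call):
            name, kwargs = call_signature(node)
            if name == ref_name and all(
                kwargs.get(key) == value for key, value in ref_kwargs.items()
            ):
                return True
    return False


# Hypothetical reference entry and model outputs, for illustration only.
reference = "torch.hub.load(repo_or_dir='pytorch/vision', model='resnet18')"
good = 'm = torch.hub.load(repo_or_dir="pytorch/vision", model="resnet18", pretrained=True)'
bad = 'm = torch.hub.load(repo_or_dir="pytorch/vision", model="made_up_net")'
```

Under this rule `good` matches the reference even though it adds an extra argument, while `bad` does not, because it names a model the reference does not contain; a generated call that matches no known reference is the kind of output one would count as a hallucination. The code is only parsed, never executed, so `torch` does not need to be installed.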

Raluca Ada Popa – Featured Project

MC²

MC² is a platform for running secure analytics and machine learning on encrypted data. With MC², organizations can safely upload their confidential data to the cloud in encrypted form and securely compute analytics and machine learning without exposing the unencrypted data to the cloud provider. MC² also enables secure collaboration among multiple organizations, where the data owners can use the platform to jointly analyze their collective data without revealing their individual data to each other.

Koushik Sen – Featured Project

FuzzFactory

FuzzFactory is a domain-specific fuzz testing tool that generalizes coverage-guided fuzzing to domain-specific testing goals. FuzzFactory allows users to guide the fuzzer’s search process without having to modify the core search algorithm.

Sky Computing Story

Berkeley’s computer science division has an ongoing tradition of 5-year collaborative research labs. Recent labs included the AMPLab (ended in 2016) and the RISELab. These labs have had significant impact in both academia and industry. On the academic side, they published their research at top conferences in systems, databases, and machine learning. On the industrial side, AMPLab and RISELab fostered several successful startups (Databricks, Opaque, Ponder, Anyscale, to name a few). We are excited to announce the Berkeley Sky Computing Lab, where we will strive to make cloud computing a true commodity.

Context

The Sky Computing Lab represents the next chapter of data-intensive systems research at Berkeley. Recent years have seen the explosion of cloud computing. Applications are moving their data and computation to the cloud; on-premises services are dying. In doing so, companies have to make difficult choices among a myriad of cloud providers, each with different services or hardware. Lock-in, whether through artificial migration costs, legal constraints, or engineering baggage, is real. In the Sky Computing Lab, we will leverage distributed systems, programming languages, security, and machine learning to decouple the services that a company wants to implement from the choice of a specific cloud. Much like the Internet today, cloud computing should be an undifferentiated commodity. Applications should run seamlessly on any or multiple clouds.

Mission

Our mission in the Sky Computing Lab is to transform the cloud into an undifferentiated commodity and ease application burden. As in previous labs, we’re all in — working on everything from basic research to software development, all in the Berkeley tradition of open publication and open source software. Our founding team consists of experts in distributed systems, machine learning, security and programming languages. We’ll use this space to lay out our ideas and progress as we go.

Commitment to Diversity

Sky Computing is guided by Berkeley’s Principles of Community and is committed to providing a safe and caring research environment for every member of our community. We believe that a diverse student body, faculty, and staff are essential to the open exchange of ideas that Sky Computing Lab is founded on.

Our head is in the cloud. We are heading for the SKY.
