Projects

Arena Hard

An Automatic Pipeline to Build High-Quality LLM Benchmarks with High Separability and Agreement to Human Preference from Live Data

Auto-Whittaker

Automatically Rewriting Distributed Protocols for Scalability

Berkeley Function-Calling Leaderboard

Function-Calling Capability of Different LLMs

Chatbot Arena

An Open Platform for Evaluating LLMs by Human Preference

Compass

Encrypted Semantic Search with High Accuracy

DSPy

The Framework for Programming, Not Prompting, Foundation Models

Embarcadero

A Totally Ordered, High Throughput, Pub/Sub System with Disaggregated Memory

Flock

A Framework for Deploying On-Demand Distributed Trust

GoEx

A Runtime for LLM-Generated Actions like Code, API Calls, and More.

Gorilla

Gorilla is an open-source, state-of-the-art LLM that invokes API calls to interact with services!

Gorilla OpenFunctions

Elevating LLM Function Calling with Versatile API Integration

Hydro

Pioneering Cloud-Native Programming for Scalable Distributed Applications

LiveCodeBench

Holistic and Contamination Free Evaluation of Large Language Models for Code

LOTUS

Easily Build Knowledge-Intensive LLM Applications That Reason Over Your Data With LOTUS!

MemGPT

Teach LLMs to manage their own memory for unbounded context!

POET

Training Neural Networks on Tiny Devices with Integrated Rematerialization and Paging

R2E

A Dynamic Framework for Evaluating AI Coding Systems

RAFT

“Retrieval-Augmented Fine-Tuning” combines the benefits of Retrieval-Augmented Generation and Fine-Tuning for better domain adaptation

Rollbaccine

A General Solution to Rollback Attacks in TEEs

RouteLLM

A Framework for Serving and Evaluating LLM Routers – Save LLM Costs Without Compromising Quality!

S-LoRA

Serving Thousands of Concurrent LoRA Adapters

Scrooge

Enabling Replicated State Machines to Communicate Efficiently

SGLang

A Fast Serving Framework For Large Language Models and Vision Language Models

Skydentity

Let orchestrators run your workloads on your cloud resources without handing over your cloud credentials and data.

SkyPIE

A Fast & Accurate Oracle for Object Placement

SkyPilot

SkyPilot is a framework for running LLMs, AI, and batch jobs on any infrastructure, offering maximum cost savings, highest GPU availability, and managed execution.

Skyplane

Blazing Fast Bulk Data Transfers Between Any Cloud

Stylus

We introduce Stylus, which efficiently selects and automatically composes task-specific adapters based on a prompt’s keywords.

SVR3

The SVR3 project aims to store client-side secrets on the server, protected by a human-remembered (and thus low-entropy) PIN.

Vicuna

An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality