Dissertation Talk: Scaling Environments and Verifiers for Software Engineering Agents – Manish Shetty

Title: Scaling Environments and Verifiers for Software Engineering Agents
Speaker: Manish Shetty
Advisor: Koushik Sen

Date: Wednesday, March 4
Time: 3:30-4:30 PM PT

This is a hybrid event held in person and virtually over Zoom.
Location (In-person): Soda 373

Abstract: Modern language models, trained primarily on static code corpora, are increasingly capable of writing code, yet they struggle with real-world software engineering over long horizons. Software, however, is one of the few domains where AI systems can interact with reality cheaply: environments instantiate executable tasks, and verifiers (compilers, tests, program analyzers) turn execution outcomes into learning signals. How do we scale this interaction and feedback to build better agents and understand their capabilities? In this talk, I will present methods for synthesizing executable environments and verifiers for agents across the software lifecycle. I’ll start with R2E, a framework for turning a codebase into an agent environment for code generation, and show how execution feedback can transform model performance. Next, I’ll describe how these environments can be scaled synthetically for training bug-fixing agents, and what we learned about the limits of verification quality along the way. Finally, I’ll present two challenging environments, long-horizon C-to-Rust translation and code optimization, where strong verifiers both unlock agent capabilities and reveal their frontiers.