Sky Seminar Series: Adrian Cockcroft (AWS) – Thanks for the Memory – LLM training as architectures change

Speaker: Adrian Cockcroft

Location: Soda 510

Date: March 22, 2024

Time: 12-1pm PT

Title: Thanks for the Memory – LLM training as architectures change

Abstract:

I’m interested in the workload that LLM training presents, and I have figured out a bit about what it looks like and how people optimize it on current-generation (H100) systems. The new generation (GH200 and GB200) uses much larger single-system memory domains, in the hundreds of terabytes, and that brings some new challenges. Attending NVIDIA GTC for the first time has also given me some new perspectives. I’ve been thinking for many years about more general use cases for very large memory systems, an approach I call the Petalith architecture pattern.

Bio: Adrian Cockcroft is a technologist and strategist with broad experience from the bits to the boardroom, in both enterprise and consumer-oriented businesses, from startups to some of the largest companies in the world, equally at home with hardware and software, development and operations. He’s best known as the cloud architect for Netflix during their trailblazing migration to AWS and was a very early practitioner and advocate of DevOps, microservices, and chaos engineering, helping bring these concepts to the wider audience they have today. He spent the last few years as a VP at Amazon deeply immersed in the dual challenges of helping Amazon itself – one of the largest companies in the world – become more sustainable, and via AWS – one of the largest technology suppliers in the world – helping its enterprise and public sector customers become more sustainable. Adrian has a BSc in Applied Physics and Electronics from The City University, London, UK.