PyTorch Conference 2024: Full Schedule

September 18-19, 2024
San Francisco, California
View More Details & Registration
Note: The schedule is subject to change.

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for PyTorch Conference 2024 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

This schedule is automatically displayed in Pacific Daylight Time (UTC-7). To see the schedule in your preferred timezone, please select from the drop-down located at the bottom of the menu to the right.

IMPORTANT NOTE: Timing of sessions and room locations are subject to change.

4:00pm PDT

[HALIDE] A Halide Backend for TorchInductor - Jason Ansel, Meta

Wednesday September 18, 2024 4:00pm - 4:10pm PDT

Festival Pavilion - Breakout Room B

This talk will focus on a new Halide backend for TorchInductor, which is in addition to the existing Triton and C++ backends. The Halide backend is meant to serve as a reference backend to make it easier to extend TorchInductor to support new backend compilers and hardware devices. Halide has been the inspiration (either in ideas or through forking) of numerous other compiler projects, so it is a good starting point for adding new backends that follow a Halide-like model.

Speakers

Jason Ansel

Research Scientist, Meta

Jason Ansel is a Research Scientist at Meta AI and a technical lead for PyTorch compilers. He started the TorchDynamo and TorchInductor projects, which bring flexible graph capture and a high performance compiler to PyTorch 2. He received a Ph.D. from MIT CSAIL in 2014 with research... Read More →

PT2 + Halide @ PTC 2024 pdf

Wednesday September 18, 2024 4:00pm - 4:10pm PDT
Festival Pavilion - Breakout Room B

DL Compiler Mini-Summit

Slides Attached Yes

4:10pm PDT

[MLIR] Enabling Composition of Kernels and Compilers - Jacques Pienaar, Google

Wednesday September 18, 2024 4:10pm - 4:20pm PDT

Festival Pavilion - Breakout Room B

Hand written kernels and compilers have been part of the toolbox to provide efficient and broad coverage. These approaches have often been positioned as being at odds with one another - and indeed the software solutions either side have sometimes made it such. MLIR, since inception, aimed to enable general, beneficial composition instead. Rather than treating kernels as a black box escape hatch, treat it as a peer in solving the serving needs. This is not magic and requires consideration of how best to combine. In this talk I'll present the approach and effect of this both in IREE and OpenXLA.

Speakers

Jacques Pienaar

SWE, Google

Jacques Pienaar is a lead of the ML Compiler Systems Research team at Google Deepmind. In this role he focuses on accelerating and simplifying machine learning for high-performance model deployment across various architectures. He is one of the founders of MLIR, a founding member... Read More →

Enabling Composition of Kernels and Compilers pdf

Wednesday September 18, 2024 4:10pm - 4:20pm PDT
Festival Pavilion - Breakout Room B

DL Compiler Mini-Summit

4:20pm PDT

[TRITON] Maximizing Kernel Development Productivity Under Performance Constraints - Philip Tillet, OpenAI

Wednesday September 18, 2024 4:20pm - 4:30pm PDT

Festival Pavilion - Breakout Room B

Machine Learning research workflows are often bottlenecked by the development of compute kernels for new algorithms and GPU architectures. This process can be daunting, and often requires a careful trade-off between productivity and performance. In this talk, we will discuss how Triton -- a mid-level programming language for kernel development -- approaches this multi-objective optimization problem, and the design decisions that were made to that effect.

Speakers

Phil Tillet

Member Of Technical Staff, OpenAI

Phil first began working with GPUs in 2011 as a contributor to the ViennaCL library. He then received his B.S. from Telecom SudParis (France) in 2012, his M.S. from NCTU (Taiwan) in 2014, and his Ph.D. from Harvard University in 2020. He joined OpenAI full time in 2020 to pursue his... Read More →

Wednesday September 18, 2024 4:20pm - 4:30pm PDT
Festival Pavilion - Breakout Room B

DL Compiler Mini-Summit

4:30pm PDT

[TVM] Universally Deploy Large-language Models via ML Compilation - Tianqi Chen, CMU & OctoAI

Wednesday September 18, 2024 4:30pm - 4:40pm PDT

Festival Pavilion - Breakout Room B

Deploying deep learning models on various devices has become an important topic. Machine learning compilation is an emerging field that leverages compiler and automatic search techniques to accelerate AI models. ML compilation brings a unique set of challenges: emerging machine learning models; increasing hardware specialization brings a diverse set of acceleration primitives; growing tension between flexibility and performance. In this talk. I then discuss our experience in bringing foundational models to a variety of devices and hardware environments through machine learning compilation.

Speakers

Tianqi Chen

Assistant Professor, CMU

Tianqi Chen is currently an Assistant Professor at the Machine Learning Department and Computer Science Department of Carnegie Mellon University. He is also the Chief Technologist of OctoAI. He received his PhD. from the Paul G. Allen School of Computer Science & Engineering at the... Read More →

Wednesday September 18, 2024 4:30pm - 4:40pm PDT
Festival Pavilion - Breakout Room B

DL Compiler Mini-Summit

4:40pm PDT

[MOJO] Lifting PT to New Heights with MAX and Mojo - Mikhail Zolotukhin, Modular

Wednesday September 18, 2024 4:40pm - 4:50pm PDT

Festival Pavilion - Breakout Room B

In this talk we'll peek into Modular's inference engine: how it builds on and works with PyTorch and what is unique about it. We will look into how Mojo language can be used to define performant kernels and what optimizations the inference engine can perform. We will also talk briefly about our experience of developing a third party backend for torch.compile.

Speakers

Mikhail Zolotukhin

Software Engineering Manager, Modular

Mikhail is an open source enthusiast with contributions ranging from GCC and LLVM to PyTorch. Currently he is at Modular leading a team working on integration of Modular's inference stack with PyTorch.

[MOJO] Lifting PT to New Heights with MAX and Mojo pdf

Wednesday September 18, 2024 4:40pm - 4:50pm PDT
Festival Pavilion - Breakout Room B

DL Compiler Mini-Summit

4:50pm PDT

Together Goes Brrr: Threading Research & Production with Torch Compile - Pragaash Ponnusamy, together.ai

Wednesday September 18, 2024 4:50pm - 5:00pm PDT

Festival Pavilion - Breakout Room B

The deployment of large language models for inference at scale is inherently complex, often requiring intricate optimizations across compute-bound and memory-bound regimes. This talk explores how PyTorch's torch.compile has revolutionized the optimization landscape for LLM serving at Together AI. Through its sophisticated Dynamo tracer and Inductor backend, torch.compile has transformed the approach to critical performance bottlenecks in both prefill and decode phases of inference. We examine how automatic vertical fusion, epilogue optimization, and adaptive kernel generation across batch sizes for GEMV and GEMM workloads, addressing key efficiency concerns, from CUDA graph captures and optimized all-reduce strategies to custom kernel registrations. The presentation highlights Together AI's journey in leveraging torch.compile to streamline the transition from research to production, significantly simplifying the deployment process for even custom architectures. By automating many performance-critical optimizations, torch.compile has not only enhanced inference efficiency but also democratized high-performance LLM deployment. We'll conclude by sharing key lessons learned and best practices gleaned from Together AI's experience in deploying torch.compile to production, serving billions of user queries and navigating the complexities of large-scale LLM inference.

Speakers

Pragaash Ponnusamy

Senior Staff AI/ML Researcher, Together AI

Wednesday September 18, 2024 4:50pm - 5:00pm PDT
Festival Pavilion - Breakout Room B

DL Compiler Mini-Summit

5:00pm PDT

DL Compiler Panel Discussion - Philip Tillet, OpenAI; Jason Ansel, Meta; Jacques Pienaar, Google; Tianqi Chen, CMU & OctoAI; Mikhail Zolotukhin, Modular; Peng Wu, Meta

Wednesday September 18, 2024 5:00pm - 5:30pm PDT

Festival Pavilion - Breakout Room B

Since the release of PyTorch 2 in 2023, torch.compile() has spurred significant new thinking around DL compiler designs at the framework level. In this session, we invite leaders in this space to share their insights based on real experiences of building DL compilers – Triton, TorchInductor, Halide, TVM, OpenXLA, and Mojo – and growing their ecosystems. We also invite a ‘compiler user representative,’ together.ai, to share their recent journey of redesigning the LLM inference stack around torch.compile(). Each leader will give a 10-minute lightning talk and an engaging panel discussion.

Speakers

Peng Wu

Engineering Manager, Meta

Dr. Peng Wu is the engineering manager of the PyTorch Compiler team at Meta. Dr. Wu spent over a decade at IBM research, working on many aspects of programming systems. She then founded the Programming Technologies Lab at Huawei and led its growth for six years. At Meta, she... Read More →

DL Compiler Mini-Summit