September 18-19, 2024
San Francisco, California
Note: The schedule is subject to change.

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for PyTorch Conference 2024 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

This schedule is displayed in Pacific Daylight Time (UTC-7).

IMPORTANT NOTE: Timing of sessions and room locations are subject to change.

DL Compiler Mini-Summit
Wednesday, September 18
 

4:00pm PDT

[HALIDE] A Halide Backend for TorchInductor - Jason Ansel, Meta
Wednesday September 18, 2024 4:00pm - 4:10pm PDT
This talk will focus on a new Halide backend for TorchInductor, which complements the existing Triton and C++ backends. The Halide backend is meant to serve as a reference backend that makes it easier to extend TorchInductor to support new backend compilers and hardware devices. Halide has been the inspiration (either in ideas or through forking) for numerous other compiler projects, so it is a good starting point for adding new backends that follow a Halide-like model.
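For readers unfamiliar with how Inductor backends are selected, the sketch below shows the general shape: torch.compile with the default "inductor" backend, plus Inductor config switches pointing codegen at Halide. The cpu_backend/cuda_backend attribute names are assumptions based on recent PyTorch releases and may differ in your version.

```python
# Minimal sketch: compile a function with TorchInductor and (hypothetically)
# point Inductor's codegen at Halide instead of the default C++/Triton backends.
# The config attribute names below are assumptions; check torch._inductor.config
# in your PyTorch version for the exact knobs.
import torch
import torch._inductor.config as inductor_config

inductor_config.cpu_backend = "halide"    # assumed name; default is "cpp"
# inductor_config.cuda_backend = "halide" # assumed name; default is "triton"

def f(x, y):
    return torch.nn.functional.relu(x @ y) + y

compiled_f = torch.compile(f, backend="inductor")  # "inductor" is the default

x, y = torch.randn(64, 64), torch.randn(64, 64)
print(torch.allclose(compiled_f(x, y), f(x, y), atol=1e-5))
```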
Speakers

Jason Ansel

Research Scientist, Meta
Jason Ansel is a Research Scientist at Meta AI and a technical lead for PyTorch compilers. He started the TorchDynamo and TorchInductor projects, which bring flexible graph capture and a high performance compiler to PyTorch 2. He received a Ph.D. from MIT CSAIL in 2014 with research...
Wednesday September 18, 2024 4:00pm - 4:10pm PDT
Festival Pavilion - Breakout Room B

4:10pm PDT

[MLIR] Enabling Composition of Kernels and Compilers - Jacques Pienaar, Google
Wednesday September 18, 2024 4:10pm - 4:20pm PDT
Hand-written kernels and compilers have both long been part of the toolbox for providing efficient, broad coverage. The two approaches have often been positioned as being at odds with one another, and indeed the software solutions on either side have sometimes made it so. MLIR, since its inception, has aimed to enable general, beneficial composition instead: rather than treating kernels as a black-box escape hatch, it treats them as peers in solving serving needs. This is not magic, and it requires careful consideration of how best to combine the two. In this talk I'll present the approach and its effect in both IREE and OpenXLA.
Speakers

Jacques Pienaar

SWE, Google
Jacques Pienaar is a lead of the ML Compiler Systems Research team at Google DeepMind. In this role he focuses on accelerating and simplifying machine learning for high-performance model deployment across various architectures. He is one of the founders of MLIR, a founding member...
Wednesday September 18, 2024 4:10pm - 4:20pm PDT
Festival Pavilion - Breakout Room B

4:20pm PDT

[TRITON] Maximizing Kernel Development Productivity Under Performance Constraints - Philip Tillet, OpenAI
Wednesday September 18, 2024 4:20pm - 4:30pm PDT
Machine Learning research workflows are often bottlenecked by the development of compute kernels for new algorithms and GPU architectures. This process can be daunting, and often requires a careful trade-off between productivity and performance. In this talk, we will discuss how Triton -- a mid-level programming language for kernel development -- approaches this multi-objective optimization problem, and the design decisions that were made to that effect.
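To make "mid-level" concrete, the sketch below is the canonical Triton vector-add kernel: the programmer reasons in blocks, pointers, and masks rather than individual threads, while Triton handles the lower-level scheduling and code generation.

```python
# A minimal Triton kernel sketch (the standard vector-add example):
# block-level programming with explicit masking for the ragged last block.
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)                       # which block this program handles
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements                       # guard out-of-bounds lanes
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = out.numel()
    grid = (triton.cdiv(n, 1024),)                    # one program instance per 1024 elements
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out
```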
Speakers

Phil Tillet

Member Of Technical Staff, OpenAI
Phil first began working with GPUs in 2011 as a contributor to the ViennaCL library. He then received his B.S. from Telecom SudParis (France) in 2012, his M.S. from NCTU (Taiwan) in 2014, and his Ph.D. from Harvard University in 2020. He joined OpenAI full time in 2020 to pursue his...
Wednesday September 18, 2024 4:20pm - 4:30pm PDT
Festival Pavilion - Breakout Room B

4:30pm PDT

[TVM] Universally Deploy Large Language Models via ML Compilation - Tianqi Chen, CMU & OctoAI
Wednesday September 18, 2024 4:30pm - 4:40pm PDT
Deploying deep learning models on a wide range of devices has become an important topic. Machine learning compilation is an emerging field that leverages compiler and automatic search techniques to accelerate AI models. ML compilation brings a unique set of challenges: rapidly emerging machine learning models; increasing hardware specialization, which brings a diverse set of acceleration primitives; and a growing tension between flexibility and performance. In this talk, I will discuss our experience bringing foundational models to a variety of devices and hardware environments through machine learning compilation.
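For a sense of what universal deployment looks like from the user's side, the sketch below follows the general shape of the MLC-LLM Python API (an ML-compilation-based LLM deployment project from the speaker's group). The module path, class name, model identifier, and method names are assumptions recalled from its quickstart and may not match the current release; consult the MLC-LLM documentation.

```python
# Hypothetical sketch: serving a compiled LLM through MLC-LLM's OpenAI-style
# Python API. All names below are assumptions; check the MLC-LLM docs.
from mlc_llm import MLCEngine  # assumed import path

model = "HF://mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC"  # assumed model id
engine = MLCEngine(model)  # loads weights and compiled kernels for the local device

for chunk in engine.chat.completions.create(
    messages=[{"role": "user", "content": "What is machine learning compilation?"}],
    model=model,
    stream=True,
):
    for choice in chunk.choices:
        print(choice.delta.content or "", end="", flush=True)

engine.terminate()
```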
Speakers

Tianqi Chen

Assistant Professor, CMU
Tianqi Chen is currently an Assistant Professor at the Machine Learning Department and Computer Science Department of Carnegie Mellon University. He is also the Chief Technologist of OctoAI. He received his Ph.D. from the Paul G. Allen School of Computer Science & Engineering at the...
Wednesday September 18, 2024 4:30pm - 4:40pm PDT
Festival Pavilion - Breakout Room B

4:40pm PDT

[MOJO] Lifting PT to New Heights with MAX and Mojo - Mikhail Zolotukhin, Modular
Wednesday September 18, 2024 4:40pm - 4:50pm PDT
In this talk we'll peek into Modular's inference engine: how it builds on and works with PyTorch, and what is unique about it. We will look at how the Mojo language can be used to define performant kernels and what optimizations the inference engine can perform. We will also talk briefly about our experience developing a third-party backend for torch.compile.
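For context on the torch.compile integration point mentioned at the end, a custom backend is simply a callable that receives the captured FX graph plus example inputs and returns something callable. A minimal sketch, with a trivial pass-through standing in for a real compiler such as Modular's:

```python
# Minimal sketch of torch.compile's custom-backend protocol:
# backend(gm: GraphModule, example_inputs) -> callable.
# Here the "compiler" just inspects the graph and runs it unchanged;
# a real third-party backend would lower and compile it instead.
import torch
from typing import List

def my_backend(gm: torch.fx.GraphModule, example_inputs: List[torch.Tensor]):
    print(gm.graph)    # inspect the captured FX graph
    return gm.forward  # return a callable in place of compiled code

@torch.compile(backend=my_backend)
def f(x):
    return torch.sin(x) + torch.cos(x)

print(f(torch.randn(8)))
```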
Speakers

Mikhail Zolotukhin

Software Engineering Manager, Modular
Mikhail is an open source enthusiast with contributions ranging from GCC and LLVM to PyTorch. Currently he is at Modular leading a team working on integration of Modular's inference stack with PyTorch.
Wednesday September 18, 2024 4:40pm - 4:50pm PDT
Festival Pavilion - Breakout Room B

4:50pm PDT

Together Goes Brrr: Threading Research & Production with Torch Compile - Pragaash Ponnusamy, together.ai
Wednesday September 18, 2024 4:50pm - 5:00pm PDT
The deployment of large language models for inference at scale is inherently complex, often requiring intricate optimizations across compute-bound and memory-bound regimes. This talk explores how PyTorch's torch.compile has revolutionized the optimization landscape for LLM serving at Together AI. Through its sophisticated Dynamo tracer and Inductor backend, torch.compile has transformed how critical performance bottlenecks are addressed in both the prefill and decode phases of inference. We examine how automatic vertical fusion, epilogue optimization, and adaptive kernel generation across batch sizes for GEMV and GEMM workloads address key efficiency concerns, from CUDA graph capture and optimized all-reduce strategies to custom kernel registrations. The presentation highlights Together AI's journey in leveraging torch.compile to streamline the transition from research to production, significantly simplifying the deployment process even for custom architectures. By automating many performance-critical optimizations, torch.compile has not only enhanced inference efficiency but also democratized high-performance LLM deployment. We'll conclude by sharing key lessons learned and best practices from Together AI's experience deploying torch.compile in production, serving billions of user queries and navigating the complexities of large-scale LLM inference.
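For orientation, several of the optimizations mentioned above are reachable through torch.compile's documented modes; which mode suits prefill versus decode is workload-dependent, and the model below is just a stand-in.

```python
# Sketch of the torch.compile entry points referenced in the abstract.
import torch

model = torch.nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True)

# "reduce-overhead" uses CUDA graph capture to cut per-step launch overhead
# (typically helpful for small-batch decode); "max-autotune" spends extra
# compile time searching for faster GEMM/GEMV kernels (helpful for prefill).
decode_model = torch.compile(model, mode="reduce-overhead")
prefill_model = torch.compile(model, mode="max-autotune")

x = torch.randn(1, 128, 512)  # (batch, sequence, features)
out = prefill_model(x)
```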
Speakers

Pragaash Ponnusamy

Senior Staff AI/ML Researcher, Together AI
Wednesday September 18, 2024 4:50pm - 5:00pm PDT
Festival Pavilion - Breakout Room B

5:00pm PDT

DL Compiler Panel Discussion - Philip Tillet, OpenAI; Jason Ansel, Meta; Jacques Pienaar, Google; Tianqi Chen, CMU & OctoAI; Mikhail Zolotukhin, Modular; Peng Wu, Meta
Wednesday September 18, 2024 5:00pm - 5:30pm PDT
Since the release of PyTorch 2 in 2023, torch.compile() has spurred significant new thinking around DL compiler design at the framework level. In this session, we invite leaders in this space to share insights drawn from real experience building DL compilers – Triton, TorchInductor, Halide, TVM, OpenXLA, and Mojo – and growing their ecosystems. We also invite a ‘compiler user representative,’ together.ai, to share their recent journey of redesigning their LLM inference stack around torch.compile(). Each leader will give a 10-minute lightning talk, followed by an engaging panel discussion.
Speakers

Peng Wu

Engineering Manager, Meta
Dr. Peng Wu is the engineering manager of the PyTorch Compiler team at Meta. Dr. Wu spent over a decade at IBM Research, working on many aspects of programming systems. She then founded the Programming Technologies Lab at Huawei and led its growth for six years. At Meta, she...

Phil Tillet

Member Of Technical Staff, OpenAI
Phil first began working with GPUs in 2011 as a contributor to the ViennaCL library. He then received his B.S. from Telecom SudParis (France) in 2012, his M.S. from NCTU (Taiwan) in 2014, and his Ph.D. from Harvard University in 2020. He joined OpenAI full time in 2020 to pursue his...

Mikhail Zolotukhin

Software Engineering Manager, Modular
Mikhail is an open source enthusiast with contributions ranging from GCC and LLVM to PyTorch. Currently he is at Modular leading a team working on integration of Modular's inference stack with PyTorch.

Tianqi Chen

Assistant Professor, CMU
Tianqi Chen is currently an Assistant Professor at the Machine Learning Department and Computer Science Department of Carnegie Mellon University. He is also the Chief Technologist of OctoAI. He received his Ph.D. from the Paul G. Allen School of Computer Science & Engineering at the...

Jacques Pienaar

SWE, Google
Jacques Pienaar is a lead of the ML Compiler Systems Research team at Google DeepMind. In this role he focuses on accelerating and simplifying machine learning for high-performance model deployment across various architectures. He is one of the founders of MLIR, a founding member...

Jason Ansel

Research Scientist, Meta
Jason Ansel is a Research Scientist at Meta AI and a technical lead for PyTorch compilers. He started the TorchDynamo and TorchInductor projects, which bring flexible graph capture and a high performance compiler to PyTorch 2. He received a Ph.D. from MIT CSAIL in 2014 with research...
Wednesday September 18, 2024 5:00pm - 5:30pm PDT
Festival Pavilion - Breakout Room B
 