September 18-19, 2024
San Francisco, California
Note: The schedule is subject to change.

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for PyTorch Conference 2024 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

This schedule is automatically displayed in Pacific Daylight Time (UTC-7). To see the schedule in your preferred timezone, please select from the drop-down located at the bottom of the menu to the right.

IMPORTANT NOTE: Timing of sessions and room locations are subject to change.

Wednesday, September 18
 

9:12am PDT

Keynote: PyTorch Technical Deep Dive - Piotr Bialecki, NVIDIA; Peng Wu, Will Constable, Kartikay Khandelwal & Mengtao (Martin) Yuan, Meta
Wednesday September 18, 2024 9:12am - 10:12am PDT
This deep dive provides an update on PyTorch development since the last conference and dives into the key new features coming in PyTorch 2.5 and beyond. We will explore how advancements across a range of PyTorch features combine to better support the full model development lifecycle across training, fine-tuning, and deployment.
Speakers
Piotr Bialecki

Director of Engineering, Deep Learning Frameworks, NVIDIA
Piotr joined the PyTorch team at NVIDIA in 2019 and currently manages the team. He drives NVIDIA's effort in maintaining and advancing PyTorch's CUDA backend, and he received the PyTorch SUPERHERO award in 2023 for his community contributions, especially on the PyTorch discussion board...
Peng Wu

Engineering Manager, Meta
Dr. Peng Wu is the engineering manager of the PyTorch Compiler team at Meta. She spent over a decade at IBM Research, working on many aspects of programming systems, then founded the Programming Technologies Lab at Huawei and led its growth for six years. At Meta, she...
Will Constable

Engineer, Meta
Will Constable works on PyTorch distributed algorithms and infrastructure at Meta as an IC and tech lead. Previously, he worked at Intel and Nervana Systems on different parts of the deep learning software stack, including compiler frontends, integrations with TensorFlow and PyTorch, and distributed...
Kartikay Khandelwal

Software Engineer, PyTorch, Meta
Kartikay Khandelwal is a software engineer in the PyTorch and AI Infra team at Meta where he leads the development of the PyTorch ecosystem for Generative AI, including open-source libraries like torchtune for LLM fine-tuning and torchchat for LLM inference. Prior to PyTorch, he worked...
Mengtao (Martin) Yuan

Tech Lead Manager, Meta
Mengtao (Martin) Yuan is a Tech Lead Manager on Meta’s PyTorch Edge team. With multiple years of experience in the AI industry, Mengtao focuses on building software systems that help AI researchers and engineers deploy their models on edge devices such as mobile phones and AR/VR...
Wednesday September 18, 2024 9:12am - 10:12am PDT
Festival Pavilion - Keynote Room
  Keynote Sessions
  • Slides Attached Yes

11:10am PDT

Meta Llama 3 and the Future of Responsible AI Development - Spencer Whitman & Vincent Gonguet, Meta
Wednesday September 18, 2024 11:10am - 11:35am PDT
As AI models become increasingly powerful and pervasive, trust and safety have become top priorities. Join us for a timely talk on Llama 3, our latest foundation model, and the cutting-edge trust and safety models and tools we've developed to ensure responsible AI development. In this talk, we'll dive into:
• The advancements of Llama 3 and its applications
• Our innovative trust and safety approaches, including toxicity detection and mitigation
• The open-source tools and resources we're sharing to empower the community
Discover how Meta is pushing the boundaries of trust and safety and learn how you can integrate these solutions into your own projects. Let's build a safer, more responsible AI future together!
Speakers
Spencer Whitman

Product Manager (AI Security), Meta
Vincent Gonguet

Director, GenAI Trust & Safety, Meta
Wednesday September 18, 2024 11:10am - 11:35am PDT
Gateway Pavilion - Cowell Theater
  Breakout Sessions
  • Audience Any
  • Slides Attached Yes

11:25am PDT

Lightning Talk: Low Precision Dtypes in PyTorch - Vasiliy Kuznetsov, Meta
Wednesday September 18, 2024 11:25am - 11:35am PDT
This talk dives deep into the new native PyTorch float8 training library and previews PyTorch's strategy for supporting upcoming low-precision dtypes such as float6, float4, and MX for efficient training and inference.
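For context on what "scaled" low-precision training means, here is a minimal, library-free sketch of the core idea behind float8-style casting: scale a tensor so its absolute maximum maps to the format's representable maximum (448 is the real float8 E4M3 max), round to a coarse grid, and carry the scale alongside the data. The helper names are ours, and rounding to integers stands in for the real non-uniform float8 grid; this is not the PyTorch float8 API.

```python
# Illustrative sketch (not the torchao/PyTorch float8 API): scaled casting
# as used in float8 training. Values are scaled so the absolute max maps
# to the format's representable max (448 for float8 E4M3), then rounded.
E4M3_MAX = 448.0

def quantize_fp8_e4m3(values):
    """Return (quantized values on an integer grid, scale)."""
    amax = max(abs(v) for v in values)
    scale = E4M3_MAX / amax if amax > 0 else 1.0
    # Real float8 rounds to a non-uniform exponent/mantissa grid;
    # rounding to integers after scaling is a simplified stand-in.
    return [round(v * scale) for v in values], scale

def dequantize(quants, scale):
    return [q / scale for q in quants]

vals = [0.1, -2.5, 3.75, 0.0]
q, s = quantize_fp8_e4m3(vals)
deq = dequantize(q, s)
max_err = max(abs(a - b) for a, b in zip(vals, deq))
```

The scale is recomputed per tensor (in practice, per tensor or per block from an amax history), which is why dynamic range, not just bit width, determines the round-trip error.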
Speakers
Vasiliy Kuznetsov

Software Engineer, Meta
Software Engineer, PyTorch Core
Wednesday September 18, 2024 11:25am - 11:35am PDT
Festival Pavilion - Breakout Room A
  Lightning Talks

2:40pm PDT

Running State-of-Art Gen AI Models on-Device with NPU Acceleration - Felix Baum, Qualcomm
Wednesday September 18, 2024 2:40pm - 3:05pm PDT
Since the boom of generative AI, the industry has been moving toward on-device AI inferencing: no longer just a trend but a necessity for saving costs and achieving the best inference performance with ultra-low latency at the lowest possible power. In this session we go over the new features added to the Qualcomm AI Stack and how it works with the public release of ExecuTorch 1.0. We will discuss how to run traditional workloads as well as GenAI use cases, including the latest version of Llama, on mobile devices using the Qualcomm Hexagon NPU.
Speakers
Felix Baum

Senior Director of Product Management, Qualcomm
Felix Baum has an extensive background of over two decades in the embedded industry, where he has excelled both as an embedded developer and a product manager. Currently he is responsible for AI Software Products at Qualcomm. Prior to that, he led efforts for various real-time operating...
Wednesday September 18, 2024 2:40pm - 3:05pm PDT
Festival Pavilion - Breakout Room B
  Breakout Sessions

2:40pm PDT

Sponsored Session: Accelerating AI Innovation: High Performance PyTorch at AMD - Robert Suderman & Ian Nordeng, AMD
Wednesday September 18, 2024 2:40pm - 3:05pm PDT
Explore the powerful collaboration between AMD and PyTorch, driving advancements in AI and machine learning. Learn how AMD’s Day-0 PyTorch support delivers cutting-edge performance and seamless compatibility.

This session will highlight the technical synergies that make AMD hardware an ideal choice for PyTorch, with real-world examples of accelerated workflows and breakthrough AI applications. Attendees will gain insights into how this dynamic partnership is enabling researchers, developers, and data scientists to push the boundaries of innovation and achieve unprecedented results in AI projects.

Speakers
Robert Suderman

Engineering Manager, AMD
Rob Suderman manages front-end support with AMD’s SHARK AI group with a goal of pushing tier one support for as many ML compute languages as possible. This has included core work on Torch-mlir, JAX, TOSA, and StableHLO, including being a founding team member on the IREE project...
Ian Nordeng

Manager Software Development, AMD
Ian Nordeng is a manager within AMD’s AIG-Sharks group where he spearheads machine learning model development for IREE’s compiler consumption to enable AI workloads to efficiently run across AMD’s hardware portfolio. He has been working in the AI compiler space for the past...
Wednesday September 18, 2024 2:40pm - 3:05pm PDT
Festival Pavilion - Breakout Room A
  Breakout Sessions
  • Slides Attached Yes

3:10pm PDT

TorchInductor CPU Backend Advancements: New Features and Performance Improvements - Jiong Gong & Leslie Fang, Intel
Wednesday September 18, 2024 3:10pm - 3:35pm PDT
This presentation provides an update on the latest advancements in the TorchInductor CPU backend since the last conference, bringing best-in-class CPU performance to a broad range of DL workloads. We will discuss new features and performance enhancements, including:
• Max-autotune support with codegen for GEMMs, boosting performance for GEMM-related operations
• Enhanced vectorized codegen support, now covering all data types beyond floating point, with flexible vector factors and optimized loop scheduling
• Comprehensive quantization support, including weight-only quantization (WoQ), and optimizations for dynamic quantization and quantization-aware training
• Improved attention support, featuring attention masks and optimized softmax via flash attention v2, etc.
• AOTInductor support, enabling high-performance inference with frozen weights
• Native Windows support, with improved vectorization capabilities
These advancements, combined with ongoing optimizations, have resulted in significant performance improvements since PyTorch 2.1, demonstrated through extensive benchmarks and large language models (LLMs).
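Of the features above, weight-only quantization (WoQ) is easy to illustrate: weights are stored as int8 with a per-row scale and dequantized on the fly inside the matmul, while activations stay in floating point. Below is a minimal library-free sketch of the idea; the helper names and per-row symmetric scheme are ours for illustration, not Inductor's generated code.

```python
# Minimal weight-only-quantization (WoQ) sketch: weights stored as int8
# with one scale per output row; activations remain float. This mirrors
# the idea behind Inductor's WoQ support, not its actual codegen.

def quantize_rows(weight):
    """Per-row symmetric int8 quantization: returns (int8 rows, scales)."""
    q_rows, scales = [], []
    for row in weight:
        amax = max(abs(w) for w in row) or 1.0
        scale = amax / 127.0
        q_rows.append([round(w / scale) for w in row])
        scales.append(scale)
    return q_rows, scales

def woq_matvec(q_rows, scales, x):
    """y = W @ x, dequantizing each int8 row on the fly."""
    return [scale * sum(q * xi for q, xi in zip(row, x))
            for row, scale in zip(q_rows, scales)]

W = [[0.5, -1.0], [2.0, 0.25]]
x = [1.0, 2.0]
q_rows, scales = quantize_rows(W)
y = woq_matvec(q_rows, scales, x)   # close to the exact [-1.5, 2.5]
```

Storing only int8 weights plus one float scale per row cuts weight memory roughly 4x versus float32, which is the main win for memory-bound inference.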
Speakers
Leslie Fang

Software Engineer, Intel
Leslie is a software engineer at Intel who has worked on PyTorch performance optimization on x86 servers for the past four years. Currently, he mainly focuses on the feature domains of quantization, Autocast, and the Inductor CPP/OpenMP backend in stock PyTorch.
Jiong Gong

Principal Engineer, Intel
Jiong is a software architect from Intel who works on PyTorch framework optimizations. He is the PyTorch module maintainer for CPU and compiler.
Wednesday September 18, 2024 3:10pm - 3:35pm PDT
Festival Pavilion - Breakout Room B
  Breakout Sessions

3:25pm PDT

Lightning Talk: Extending PyTorch with Custom Python/C++/CUDA Operators - Richard Zou, Meta
Wednesday September 18, 2024 3:25pm - 3:35pm PDT
In this talk, we'll go over the new recommended APIs to extend PyTorch with custom Python/C++/CUDA operators. Users have been able to extend PyTorch with custom operators for years, but we have updated our guidance for creating custom operators that compose with torch.compile, autograd, and other PyTorch subsystems.
Speakers
Richard Zou

Software Engineer, Meta
I'm a software engineer at Meta working on PyTorch. I'm one of the creators of functorch, JAX-like composable function transforms for PyTorch. Nowadays I spend my time working on torch.compile, figuring out how to add infra changes to make it easier for PyTorch features like custom...
Wednesday September 18, 2024 3:25pm - 3:35pm PDT
Festival Pavilion - Breakout Room A
  Lightning Talks

3:25pm PDT

Lightning Talk: Introduction to Torch.Distributed.Pipelining - Howard Huang & Ke Wen, Meta
Wednesday September 18, 2024 3:25pm - 3:35pm PDT
Pipeline parallelism is a technique employed in distributed deep learning that enhances model execution by dividing the model into distinct segments, or "stages." As large language models and other memory-intensive models become more common, pipeline parallelism has grown increasingly important for several key areas:
- Executing large-scale training jobs
- Enhancing performance in bandwidth-limited clusters
- Supporting large model inference
In this talk, we will introduce the `torch.distributed.pipelining` package, which provides users a seamless way of applying pipeline parallelism. We will demonstrate the following features:
- Splitting of model code based on a simple specification
- Support for pipeline schedules, including GPipe, 1F1B, Interleaved 1F1B and Looped BFS, and the infrastructure for writing customized schedules
- Composability with other PyTorch parallel techniques such as data parallel (DDP, FSDP) or tensor parallel
- Out-of-the-box integration with Hugging Face models for efficient inference
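The scheduling idea behind GPipe-style pipelining can be simulated in a few lines of plain Python: stage s runs microbatch m at timestep s + m, so M microbatches clear S stages in S + M - 1 steps rather than S * M. This is a timing sketch only, not the `torch.distributed.pipelining` API.

```python
# Simulate a GPipe-style forward schedule: with S stages and M
# microbatches, stage s processes microbatch m at timestep s + m.
# All forwards finish in S + M - 1 steps, versus S * M if each batch
# traversed the whole pipeline alone.

def gpipe_forward_schedule(num_stages, num_microbatches):
    """Map each timestep to the (stage, microbatch) pairs active then."""
    schedule = {}
    for stage in range(num_stages):
        for mb in range(num_microbatches):
            schedule.setdefault(stage + mb, []).append((stage, mb))
    return schedule

sched = gpipe_forward_schedule(num_stages=3, num_microbatches=4)
total_steps = max(sched) + 1   # 3 + 4 - 1 = 6
```

Schedules like 1F1B reorder backward passes into this same grid to cap the number of in-flight activations, which is where the memory savings over plain GPipe come from.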
Speakers
Howard Huang

Software Engineer, Meta
Howard Huang is a software engineer at Meta. He has been working on PyTorch and the PyTorch distributed team for the past 4 years.
Ke Wen

Software Engineer, Meta
Ke Wen is a software engineer at Meta. He works on PyTorch Distributed features, including pipeline parallelism, distributed inference, and graph-based analysis.
Wednesday September 18, 2024 3:25pm - 3:35pm PDT
Gateway Pavilion - Cowell Theater
  Lightning Talks

4:00pm PDT

[HALIDE] A Halide Backend for TorchInductor - Jason Ansel, Meta
Wednesday September 18, 2024 4:00pm - 4:10pm PDT
This talk will focus on a new Halide backend for TorchInductor, which is in addition to the existing Triton and C++ backends.  The Halide backend is meant to serve as a reference backend to make it easier to extend TorchInductor to support new backend compilers and hardware devices.  Halide has been the inspiration (either in ideas or through forking) of numerous other compiler projects, so it is a good starting point for adding new backends that follow a Halide-like model.
Speakers
Jason Ansel

Research Scientist, Meta
Jason Ansel is a Research Scientist at Meta AI and a technical lead for PyTorch compilers. He started the TorchDynamo and TorchInductor projects, which bring flexible graph capture and a high performance compiler to PyTorch 2. He received a Ph.D. from MIT CSAIL in 2014 with research...
Wednesday September 18, 2024 4:00pm - 4:10pm PDT
Festival Pavilion - Breakout Room B

4:35pm PDT

torchtune: Easy and Accessible Finetuning in Native PyTorch - Evan Smothers, Meta
Wednesday September 18, 2024 4:35pm - 4:45pm PDT
As open-source LLMs have become more capable, a substantial ecosystem has developed around the fine-tuning of these models. A thriving community of researchers, developers, practitioners and hobbyists has emerged which focuses on topics ranging from memory efficiency, parameter-efficient fine-tuning and quantization to performance at scale and reproducible evaluations. The goal of this mini-summit is to bring this community together to discuss ideas, share knowledge and build connections.

The agenda features a keynote from Joe Spisak on the state of the Llama ecosystem, followed by invited talks from the founders of Axolotl, Unsloth and torchtune. We conclude the summit with a riveting discussion on what’s next for LLMs, fine-tuning and the PyTorch ecosystem with a fabulous panel of experts: Tim Dettmers (author of bitsandbytes and QLoRA), Hailey Schoelkopf (maintainer of LM Eval Harness at EleutherAI), Aakanksha Chowdhery (lead author on PaLM and Gemini) and Alexis Conneau (Research Lead at OpenAI).
Speakers
Evan Smothers

Software Engineer, Meta
Evan is a software engineer on the PyTorch Domains team at Meta. He currently works on torchtune, a PyTorch library for memory-efficient fine-tuning of large language models. Prior to joining Meta, Evan worked as a data scientist at Uber and received his Ph.D. in mathematics from...
Wednesday September 18, 2024 4:35pm - 4:45pm PDT
Festival Pavilion - Breakout Room A

5:30pm PDT

Poster Presentations
Wednesday September 18, 2024 5:30pm - 8:30pm PDT
  • Purge the GIL: Improved Torch.DataLoader - Michal Szolucha & Rostan Tabet, NVIDIA
  • XFormers - Daniel Haziza, Meta AI 
  • TritonCC: AOT Triton Workflow for TorchScript C++ Runtime - Sijia Chen & Huamin Li, Meta
  • The PyTorch 2.0 Inference Story - Angela Yi, Bin Bao, Sheng Qin & Sherlock Huang, Meta
  • Tensor Subclasses with PT2 - Brian Hirsh, Meta
  • Streamlining PyTorch Eager Mode Support on New Hardware Backends Through Torch.Compile - Eikan Wang, Intel
  • Sparsifying Vision Transformers with Minimal Accuracy Loss - Jesse Cai, Meta
  • Real-Time Art Creation: Stable Diffusion Fine-Tuning Techniques on Gaudi with PyTorch - Alex Sin & Louie Tsai, Intel Corporation
  • Quantization via AI Edge Torch - Pauline Sho, Google LLC
  • PyTorch Korea User Group: Introduction & Encourage - Junghwan Park, PyTorch Korea User Group & Hyoyoung Chang, Freelancer
  • PyTorch + MAX + Mojo - Nick Kreeger & Jack Clayton, Modular 
  • PT2 Torch.Compile and CPython - William Wen, Meta
  • PT2 Cold and Warm Compile Time Improvements in Torch.Compile - Oguz Ulgen & Animesh Jain, Meta
  • Pre-Train Llama3 Models Using Meta's Torchtitan on Amazon SageMaker - Less Wright, Meta & Roy Allela, AWS
  • Optimizing Memory and Compilation with While_loop - Manfei Bai, Google
  • Non-Linear Quantization Functions for Machine Learning Models - Diogo Emanuel da Costa Venâncio, INESC-ID 
  • Nested Tensors for Ragged Data Handling - Joel Schlosser, Meta
  • `Torch.Tensor.Module_load` and Tensor Subclass Serialization - Mikayla Gawarecki, Meta Platforms
  • Accelerating Generative AI on Ubiquitous CPU Instances with Native PyTorch - Mingfei Ma, Intel
  • Addressing Reverse Kinematics Challenges and Geometric Optimization in Robotics with PyTorch - Blair Birdsell, PhD. Student at University of Alberta 
  • Blazingly Fast LLM Inference with Native PyTorch: Update from the Past Year - Yanbo Liang & Horace He, Meta
  • Boosting in-Browser ML: Accelerate PyTorch Generative Models for the Web - Emma Ning & Kshama Pawar, Microsoft; Joshua Lochner, Hugging Face
  • Democratizing AI, One Byte at a Time: The Bitsandbytes Open-Source Saga, Ft. FSDP+QLoRA Fine-Tuning - Titus von Koeller, Hugging Face
  • Depyf: A Tool to Help Write Code in a Torch.Compile-Friendly Way Through Decompilation - Kaichao You, Tsinghua University/UC Berkeley
  • Exploiting on-Chip AI Accelerator for High-Performance LLM Inference - Hiroshi Inoue & Tabari Alexander, IBM Research - Tokyo
  • ExecuTorch Android and IOS on-Device Demo Poster - Hansong Zhang, Meta
  • Fault Tolerance for Large Scale Training - Tristan Rice & Chirag Pandya, Meta
  • FP8 State of the Art Inference Performance with Pytorch - Chih-Chieh Yang & Adnan Hoque, IBM; Antoni Viros i Martin, IBM Research
  • From FSDP to DeepSpeed and Back Again - Yu Chin Fabian Lim, IBM Research, Singapore
  • Large Scale Transformer Model Training with PyTorch Tensor Parallel API - Tianyu Liu, Meta
  • Model Explorer - Visualizing Pytorch Models - Na Li & Eric Yang, Google
  • PT-D Zero Overhead Checkpointing - Lucas Pasqualin, Meta / PyTorch; Chien-Chin Huang & Iris Zhang, Meta
  • PyTorch Performance Debugging in N-Dimensional Parallelism - Wei Sun & Sreen Tallam, Meta
  • Unlock Up to 5x Faster Inference in PyTorch: Recent Innovations in Torch-TensorRT - Laikh Tewari, NVIDIA
  • Torch-Monitor: A Comprehensive Call Path Profiling Tool for PyTorch - Qidong Zhao, North Carolina State University & Hao Wu, George Mason University
Speakers
Hao Wu

PhD, George Mason University
I am interested in deep learning profilers.
Yanbo Liang

software engineer, Meta
I'm a software engineer on the PyTorch team at Meta, working on torch.compile and LLMs.
Titus von Koeller

ML engineer / lead maintainer bitsandbytes, Hugging Face
Titus, lead maintainer of the independent non-profit bitsandbytes (sponsored by Hugging Face), works on co-engineering the democratization of AI and in his free time cherishes electronic music, queer culture and ski mountaineering. With degrees in Psychology and Computer Science...
Angela Yi

Software Engineer, Meta
I've been working on the PyTorch Compilers team for the past 2 years, mainly working on torch.export!
Animesh Jain

Software Engineer, Meta
Animesh Jain works on PyTorch compilers.
Antoni Viros i Martin

Research Scientist, IBM Research
Antoni is currently a Research Scientist at IBM Research, investigating optimization approaches for ML inference and training, with a focus on open-source technologies such as PyTorch. He holds a PhD in Aerospace Engineering from Texas A&M University, and has previously worked at...
Bin Bao

Software Engineer, Meta
Bin Bao is a software engineer working with the PyTorch Compiler team at Meta. He focuses on developing AOTInductor, an Ahead-of-Time compiler for the PyTorch2 export path.
Daniel Haziza

Research Engineer, Meta AI
Daniel is a Research Engineer at FAIR Paris, working on workload efficiency and developing the xFormers library.
Diogo Venâncio

Researcher, University of Lisbon | INESC-ID
My name is Diogo and I am a Master's student at IST in Lisbon, Portugal, and also an ML Engineer at an early-stage AI startup. I grew up in the suburbs of Lisbon and always strived to have a positive impact on the lives of others. At the age of 20, I built my own company, called OutGoing...
Eikan Wang

AI Frameworks Engineer, Intel
Eikan is a staff engineer at Intel and a DL framework tech lead with full-stack experience in DL, from various AI applications to framework, library, and DL compiler. He is actively optimizing the torch.compile stack for Intel platforms, including the Inductor C++/OpenMP...
Emma Ning

Principal PM, Microsoft
Emma Ning is a Principal PM in the Microsoft AI Framework team, focusing on AI model operationalization and acceleration with ONNX Runtime/Olive for open and interoperable AI. She has more than five years of product experience in search engines taking advantage of machine learning...
Iris Zhang

Software Engineer, Meta
PyTorch Distributed @ Meta
Junghwan Park

Lead Maintainer, PyTorch Korea User Group
Data engineer at a telecommunications company in Korea; lead maintainer of the PyTorch Korea User Group; interested in open source, community, and time-series forecasting.
Kshama Pawar

Principal Program Manager, Microsoft Corporation
Kshama Pawar is a Program Manager on the AI Platform team at Microsoft. She helps drive Training initiatives for both large language models and on-device training through optimization engines like ONNX Runtime. She is also involved in the Triton community effort to improve developer...
Laikh Tewari

Deep Learning Software Product Manager, NVIDIA
Laikh Tewari manages products for inference in deep learning frameworks at NVIDIA and focuses on the usability of performance optimization tools across data center, consumer, and embedded segments. Laikh received his B.S. and M.S. in computer science from Stanford University where...
Mingfei Ma

Senior Software Engineer, Intel
Mingfei Ma is a senior deep learning software engineer at Intel and the maintainer of the CPU performance module in PyTorch. Mingfei holds a Master's degree from Harbin Institute of Technology, where he majored in Control Science and Technology, and has 12 years' experience...
Chien-Chin Huang

Software Engineer, Meta
Software Engineer, PyTorch Distributed, Meta
Mikayla Gawarecki

Software Engineer, Meta Platforms
Software Engineer at Meta on PyTorch Core Team
Baihan Huang

Software Engineer, Meta
Working on PyTorch
Kaichao You

Ph.D. student, Tsinghua University/UC Berkeley
Kaichao You is a fourth-year Ph.D. student at Tsinghua University. He is currently visiting UC Berkeley, working on the vLLM project, a high-throughput and memory-efficient inference and serving engine for LLMs. He is an open-source contributor to PyTorch/Triton, and he leads the...
Brian Hirsh

Software Engineer, Meta
Brian is a software engineer at Meta working on PyTorch core and compilers.
Jesse Cai

Software Engineer, Meta
Jesse is a software engineer on the PyTorch Core Performance team, where he works on accelerating models with sparsity. Before joining Meta, he worked at several startups, focusing on natural language processing.
Pauline Sho

Software Engineer, Google
Software engineer at Google LLC, currently focused on improving the quantization infrastructure for edge devices.
Alex Sin

AI Software Solutions Engineer, Intel
Louie Tsai

AI SW Engineer, Intel
Horace He

Software Engineer, Meta
Adnan Hoque

Research Engineer, IBM
I am a Research Engineer at IBM. I have a Bachelor of Science degree in Electrical Engineering from the University of Alberta. I have worked on machine learning applications in various domains such as computer vision, network security and most recently have been developing kernels...
Blair Birdsell

Data Scientist, Surespan Construction
Blair Birdsell has a MASc in Civil Engineering from the University of Victoria. This background integrates his design and engineering expertise with data science. Over 9 years, Blair has contributed to 4.86 million sq. ft. of building projects and now develops data-driven software...
Chih-Chieh Yang

Research Scientist, IBM
Performance optimization of AI workloads
Chirag Pandya

Software Engineer, Meta
Chirag is a backend engineer who has worked for over 20 years in the software industry. His expertise includes networks, storage, security, and distributed systems, with an emphasis on building fast, secure, and performant systems.
Hansong Zhang

Software Engineer, Meta Platforms
Software Engineer at Meta. Worked on integrating the ExecuTorch framework into Android apps via Java and a JNI library.
Hiroshi Inoue

Research Staff Member, IBM Research - Tokyo
Hiroshi Inoue is a research staff member at IBM Research - Tokyo, where he works on performance optimization of system software. He has a PhD from the University of Tokyo.
Huamin Li

Software Engineer, Meta
Software engineer from Meta PyTorch, focusing on GPU and CPU inference for Meta internal workloads
Hyoyoung Chang

Lead maintainer, PyTorch Korea User Group
Data Engineer
Jack Clayton

AI Developer Advocate, Modular
Jack started his career optimizing autonomous truck software for leading mining companies, including BHP and Caterpillar. Most recently he was designing computer vision software, putting AI inference pipelines into production for IDVerse. He is passionate about the developer community...
Joel Schlosser

Software Engineer, Meta
Engineer with a decade's worth of ML experience across the research, industry, and framework perspectives.
Joshua Lochner

Machine Learning Engineer, Hugging Face
Bringing the power of machine learning to the web. Currently working on Transformers.js (@huggingface 🤗)
Less Wright

PyTorch Partner Engineer, Meta
PyTorch Distributed and CUDA/Triton kernels
Lucas Pasqualin

ML Engineer, PyTorch (Meta)
Lucas has been developing Machine Learning Applications and Machine Learning infrastructure at scale for years, and has recently been focused on extending the product offering of PyTorch's Distributed Checkpointing stack.
Manfei Bai

Software Engineer, Google LLC
Manfei Bai is a software engineer at Google.
Michał Szołucha

Deep Learning Software Engineer, NVIDIA
During his work at NVIDIA, Michał gained vast experience in deep learning software development. He tackled challenges in training and inference, ranging from small-scale to large-scale applications, as well as user-facing tasks and highly optimized benchmarks like MLPerf...
Na Li

Software Engineer, Google
Tech Lead Manager at Google Cloud, leading on-device ML developer tools.
Nick Kreeger

Frameworks Engineering Director, Modular
Software Engineering lead with over 15 years of experience working at Google, Microsoft and a handful of startups. Nick has contributed to many technologies in Machine Learning such as TensorFlow.js, TensorFlow Lite/Micro, and ONNX/ONNXRuntime. Nick enjoys spending his free time with...
Oguz Ulgen

Software Engineer, Meta
I'm a software engineer at Meta where I used to work on the Hack programming language and now work on PyTorch.
Rostan Tabet

Software Engineer, NVIDIA
I am a Computer Science student with a passion for Python and deep learning. During my end-of-studies internship, I focused on leveraging free-threaded Python in the context of NVIDIA's deep learning libraries suite. My work aims to improve data handling efficiency in machine learning...
Roy Allela

Sr AI/ML Specialist Architect, AWS
Roy Allela is a Senior AI/ML Specialist Architect at AWS. Roy helps customers, from small startups to large enterprises, train and deploy large language models efficiently on AWS. He previously spent 8 years at Intel as a Senior AI Software Engineer working on low-level ML framework...
Sheng Qin

Software Engineer, Meta Inc.
Sheng Qin is a software engineer in the PyTorch Accelerator Enablement org at Meta.
Sijia Chen

Software Engineer, Meta / PyTorch
Sijia is a software engineer on the Meta PyTorch Acceleration team, focusing on GPU inference.
Tianyu Liu

Research Scientist, Meta
Tianyu Liu is a Research Scientist on the PyTorch team at Meta, currently working on distributed training. Prior to this, he was a postdoc at Stanford University and has worked on the Ads Core Machine Learning team at Meta. He obtained his PhD degree at the University of Wisconsin-Madison...
Tristan Rice

Software Engineer, Meta
Software engineer working on PyTorch Distributed and large scale training.
Wei Sun

Research Scientist, Meta Platform
Wei Sun supports the Meta AI Infrastructure organization. He brings deep expertise in analyzing ML model execution during training and serving and identifies efficiency/performance bottlenecks across model and system architecture. This has led him to build some of the most comprehensive...
William Wen

Software Engineer, Meta Platforms, Inc.
William works on the torch.compile team, specializing in TorchDynamo.
Yu Chin Fabian Lim

Research Staff Member, IBM Research, Singapore
Fabian Lim is currently at IBM Research, Singapore. During 2013-2016, he worked at Avago Technologies (now Broadcom), then SK Hynix Memory Systems, in San Jose, CA. From 2010 to 2013, he was a postdoc at the Massachusetts Institute of Technology, Cambridge, MA. Dr Lim received the...
Tabari Alexander

STSM, IBM Z AI and Analytics, IBM
Eric Yang

Software Engineer, Google
Sreen Tallam

Software Engineering Manager - AI Performance & Efficiency, Meta
I am a SW Engineering Manager at Meta helping all ML Training & Serving models (RecSys, Content Understanding, GenAI) run optimally and efficiently through various optimization techniques, including scaling them across the entire Meta fleet.
Qidong Zhao

PHD Student, North Carolina State University
Research interests: profiling techniques for different workloads and architectures.
Wednesday September 18, 2024 5:30pm - 8:30pm PDT
Gateway Pavilion - Sponsor Showcase
  Poster Presentations
 
Thursday, September 19
 

10:50am PDT

Lightning Talk: d-Matrix LLM Compression Flow Based on Torch.Fx: Simplifying PTQ/QAT - Zifei Xu & Tristan Webb, d-Matrix Corporation
Thursday September 19, 2024 10:50am - 11:00am PDT
We introduce dmx-compressor, d-Matrix's open-source LLM compression toolkit that is modular, robust, efficient, and user-friendly. It uses symbolic tracing and fx.Transformer for network compression while keeping the model a first-class citizen in PyTorch for the user, despite the graph dynamism prevalent in LLMs. It achieves this by maintaining both the original nn.Module and a just-in-time (JIT) traced and transformed fx.GraphModule representation behind the scenes, together with an abstraction that cleanly decouples network compression from the original model graph definition. This design allows the FXIR to adapt dynamically to diverse forward-call signatures and flow-control arguments throughout quantization-aware training and post-training quantization written in plain PyTorch, yielding a compressed FXIR fully compatible with application-level APIs like the Hugging Face pipeline. We also provide a graph visualizer based on fx.Interpreter for ease of debugging. We believe this project will empower the community to build efficient LLMs for deployment on custom hardware accelerators and contribute to the PyTorch ecosystem.
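The nn.Module-plus-fx.GraphModule pattern described in the abstract can be sketched with plain torch.fx. This is a minimal illustration, not dmx-compressor's actual API: the `FakeQuant` transform, `TinyModel`, and the 4-bit fractional rounding grid are all invented for the example.

```python
import torch
import torch.nn as nn
import torch.fx as fx


class TinyModel(nn.Module):
    """Stand-in for a real network; dmx-compressor targets LLMs."""
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 4)

    def forward(self, x):
        return torch.relu(self.linear(x))


class FakeQuant(fx.Transformer):
    """Toy compression pass: round every submodule output to a fixed-point grid."""
    def call_module(self, target, args, kwargs):
        out = super().call_module(target, args, kwargs)
        # Operations on the returned Proxy are recorded into the new graph.
        return torch.round(out * 2**4) / 2**4


model = TinyModel()                    # original nn.Module, left untouched
gm = fx.symbolic_trace(model)          # traced fx.GraphModule representation
quantized = FakeQuant(gm).transform()  # transformed graph, decoupled from `model`

x = torch.randn(2, 4)
ref = torch.relu(torch.round(model.linear(x) * 16) / 16)
print(torch.allclose(quantized(x), ref))  # True
```

Keeping `model` and `quantized` side by side mirrors the decoupling the abstract describes: the user keeps interacting with the original module while the compressed graph evolves behind the scenes.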
Speakers
avatar for Zifei Xu

Zifei Xu

Senior Machine Learning Research Engineer, d-Matrix Corporation
Zifei is a Senior Machine Learning Research Engineer at d-Matrix. Her current work focuses on developing model quantization pipelines and efficient quantization algorithms. She graduated from Stanford University with a Master's degree in Computational & Mathematical Engineering and... Read More →
avatar for Tristan Webb

Tristan Webb

ML Engineer, d-Matrix
Tristan's background is primarily in computer science and mathematics, which led him to a PhD in Complexity Science at the University of Warwick, where he worked with large computational neuroscience models of spiking neural networks using simulators written in C... Read More →
Thursday September 19, 2024 10:50am - 11:00am PDT
Festival Pavilion - Breakout Room A
  Lightning Talks

11:05am PDT

Lightning Talk: LLMs on Edge with AI Accelerators - Chen Lai, Kimish Patel & Cemal Bilgin, Meta
Thursday September 19, 2024 11:05am - 11:15am PDT
LLMs are compute-heavy and consume a large share of a device's resources (nearly all resources on phones), including memory and power. A natural approach is to leverage AI hardware accelerators, for example the Apple Neural Engine (ANE) on Apple devices and the HTP on Qualcomm SoCs, to run them quickly and efficiently. Only by optimizing model latency, memory consumption, and power usage to a certain level will users be willing to install these models on their devices. In this session, we introduce how we leverage these AI accelerators within the PyTorch ecosystem to achieve state-of-the-art on-device performance for llama3, via ExecuTorch and partnerships with Apple and Qualcomm. Hardware companies usually have their own AI accelerators, and these often have different characteristics: one may support a different set of operators than another, and some support only static shapes (like the HTP). Transformer-based optimizations, however, can be generic. We will discuss in more detail how we apply both the generic and the backend-specific optimizations. The techniques described are not limited to LLMs and can be applied to other transformer-based models.
Speakers
avatar for Kimish Patel

Kimish Patel

Software Engineer, Meta Platforms
Kimish has worked on enabling PyTorch on Meta's family of apps, primarily focusing on performance optimizations. His past experiences include hardware/software co-design, CPU architecture, and CPU/GPU performance optimization.
avatar for Chen Lai

Chen Lai

Software Engineer, Meta
Software engineer focusing on bringing up accelerators on devices.
avatar for Cemal Bilgin

Cemal Bilgin

Engineering Manager, Meta
Engineering Manager, PyTorch Edge Acceleration
Thursday September 19, 2024 11:05am - 11:15am PDT
Festival Pavilion - Breakout Room A
  Lightning Talks

11:20am PDT

Lightning Talk: Building and Supporting the Chinese PyTorch Community: Resources, Tutorials, and Engagement - Zong Zesheng, Huawei
Thursday September 19, 2024 11:20am - 11:30am PDT
This session provides a comprehensive introduction to the Chinese PyTorch community, with the goal of inspiring more users to join and contribute, fostering a vibrant and inclusive environment for PyTorch enthusiasts in China. Chinese PyTorch homepage: an introduction to the official Chinese version of the PyTorch website, highlighting its features, with navigation tips and key sections such as documentation, tutorials, and community events, strengthening the connection between users in China and the PyTorch community. Localized tutorials and documentation: the 2.x releases had no translated documentation, making it hard for beginners who are not fluent in English to keep up with the latest PyTorch features; we translated the official documents and tutorials, covering everything from basic PyTorch concepts to advanced applications. Interactive tutorials: there were previously no interactive tutorials (like Google Colab) for Chinese students and beginners, who had to set up an environment before getting started with PyTorch; an online notebook with tutorials is now available so beginners can practice and tune each step.
Speakers
avatar for Zong Zesheng

Zong Zesheng

Software Engineer, Huawei
Currently working to give Chinese users easier access to PyTorch resources and to create a friendly experience for beginners.
Thursday September 19, 2024 11:20am - 11:30am PDT
Gateway Pavilion - Cowell Theater
  Lightning Talks

11:20am PDT

Sponsored Session: Torchchat: A Showcase of PyTorch LLM Ubiquity - Jack Khuu & Jesse White, Meta
Thursday September 19, 2024 11:20am - 11:45am PDT
This talk explores the journey of enabling LLMs in the PyTorch ecosystem, as well as how the teams behind AOT Inductor, ExecuTorch, and torchao collaborated to create torchchat, a showcase of PyTorch’s ability to run LLM inference everywhere.

Torchchat demonstrates the ubiquity, simplicity, and quality of PyTorch's LLM support through performant, reproducible implementations not only for Python environments but also for desktop, server, and on-device deployment.

All of our work is open source and available on GitHub.
Speakers
avatar for Jack Khuu

Jack Khuu

Software Engineer, Meta
Software Engineer @ Meta working on the PyTorch Edge team. TL for torchchat, which is PyTorch's showcase of LLM inference ubiquity (Python, Desktops, Mobile, etc.). More broadly, I focus on the "Experience" of PyTorch Edge, encompassing User, Developer, and Community Experience.Ex-Lecturer... Read More →
avatar for Jesse White

Jesse White

Software Engineering Manager, Meta
Jesse is an engineering manager at PyTorch @ Meta, where he supports the Edge Experience team in improving the experience for on-device inference and training, including mobile, laptops, and embedded devices. With nearly 20 years of experience in startups, Jesse is passionate about... Read More →
Thursday September 19, 2024 11:20am - 11:45am PDT
Festival Pavilion - Breakout Room A
  Breakout Sessions

12:00pm PDT

Lightning Talk: Optimized PyTorch Inference on aarch64 Linux CPUs - Sunita Nadampalli, Amazon (AWS)
Thursday September 19, 2024 12:00pm - 12:10pm PDT
Over the last two years we have optimized the performance of PyTorch on Arm processors. The optimizations include changes to ATen, C10, MKLDNN operators, the GEMM backend, and TorchInductor. In many cases, instead of writing our own kernels, we integrated the Arm Compute Library, used fastmath kernels with formats like bf16, implemented operator caching, and selected the optimal backend based on the input context. Through these optimizations we improved performance by over 2x. In this presentation we will first describe how we went through this process, what the optimizations are, performance numbers on AWS Graviton3 processors for around 75 models, and CI/CD workflow details. Next, we will walk through a sample PyTorch application showing basic usage, how to tune the runtime, and the resulting speedup. Attendees will leave knowing what PyTorch performance optimizations exist for Arm processors, how to use them, and where they can collaborate to further improve PyTorch for aarch64 CPUs.
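A hedged illustration of the runtime-tuning step mentioned above. The oneDNN fastmath environment variable is real but must be set before the first oneDNN-backed operator runs, and which knobs are effective depends on your PyTorch build and CPU; treat this as a sketch based on published Graviton guidance, not an authoritative recipe.

```python
import os

# Ask oneDNN to use bf16 fastmath kernels for eligible fp32 GEMMs on Arm
# CPUs that support them. Must be set before PyTorch runs its first
# oneDNN-backed operator, hence before the import below.
os.environ["DNNL_DEFAULT_FPMATH_MODE"] = "BF16"

import torch

# Use all cores for intra-op parallelism (a reasonable default, not a rule).
torch.set_num_threads(os.cpu_count() or 4)

with torch.inference_mode():
    a = torch.randn(256, 256)
    b = torch.randn(256, 256)
    out = a @ b  # dispatched to the optimized GEMM backend where available
print(out.shape)
```

The fastmath mode trades a small amount of fp32 precision for bf16 throughput, so it is worth validating model accuracy after enabling it.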
Speakers
avatar for Sunita Nadampalli

Sunita Nadampalli

Software Development Manager, Amazon/AWS
Sunita Nadampalli is a Software Development Manager at AWS. She leads Graviton software performance optimizations for AI/ML and HPC workloads. She is passionate about open source software development and delivering high-performance and sustainable software solutions with Arm SoCs... Read More →
Thursday September 19, 2024 12:00pm - 12:10pm PDT
Festival Pavilion - Breakout Room B
  Lightning Talks
  • Audience Any
  • Slides Attached Yes

12:10pm PDT

Lightning Talk: Implementing and Using Iterable Datasets: What Could Go Wrong? - Nicolas Hug, Meta
Thursday September 19, 2024 12:10pm - 12:20pm PDT
PyTorch supports two kinds of datasets: Iterable datasets and indexable "map-style" datasets. Iterable datasets can be more flexible and potentially faster than their indexable cousins. They are also much harder to use correctly, and can easily lead to silently wrong results. This talk is a quick and fun intro to some of the traps that Iterable datasets lay out for you, with some tips to help you avoid them.
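A minimal sketch of the classic multi-worker trap this kind of talk covers (the datasets and numbers are invented for illustration): a naive IterableDataset replays its full stream in every DataLoader worker, so `num_workers=2` silently doubles the data, while sharding with `get_worker_info` keeps the workers disjoint.

```python
import torch
from torch.utils.data import DataLoader, IterableDataset, get_worker_info


class NaiveRange(IterableDataset):
    """Trap: every worker replays the whole stream -> duplicated samples."""
    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return iter(range(self.n))


class ShardedRange(IterableDataset):
    """Fix: each worker yields a disjoint slice of the stream."""
    def __init__(self, n):
        self.n = n

    def __iter__(self):
        info = get_worker_info()
        if info is None:            # single-process data loading
            start, step = 0, 1
        else:                        # worker i takes every num_workers-th item
            start, step = info.id, info.num_workers
        return iter(range(start, self.n, step))


# With num_workers=2, NaiveRange(4) would yield 8 items (each one twice),
# while ShardedRange(4) would yield exactly 4. Single-process, both agree:
loader = DataLoader(ShardedRange(4), batch_size=None, num_workers=0)
print(sorted(int(x) for x in loader))  # [0, 1, 2, 3]
```

The failure is silent because nothing errors: the loader just yields more (duplicated) samples per epoch, which quietly skews training statistics.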
Speakers
avatar for Nicolas Hug

Nicolas Hug

Research Engineer, Meta
Nicolas is a software engineer in the PyTorch team at Meta, where he mainly contributes to the torchvision library. Prior to that, Nicolas was a research scientist at Columbia University, where he became part of the scikit-learn core development team. Nicolas holds a PhD in machine... Read More →
Thursday September 19, 2024 12:10pm - 12:20pm PDT
Gateway Pavilion - Cowell Theater
  Lightning Talks

4:35pm PDT

Unlocking the Enigma: Crafting Unbiased, Transparent, and Explainable Large Language Models - Rashmi Nagpal, Patchstack
Thursday September 19, 2024 4:35pm - 5:00pm PDT
In an era where artificial intelligence reigns supreme, the statistics are both perplexing and thought-provoking: a mere 13% of large language models manage to transcend the realm of research and enter the practical world of production. Who bears the responsibility when these models err, producing biased or discriminatory outputs? It's time to demystify the complex landscape of machine learning ethics and carve a path towards a brighter, more accountable future! In this talk, we will first navigate the profound impacts of large language models across diverse domains, from lifesaving advances in medicine to safeguarding nations through enhanced security protocols. Secondly, as we marvel at the data-driven decisions made by these models, we will confront the darker shadows they cast: the looming spectre of bias in the data. Finally, we will delve into the art of building interpretable models and navigating the maze of ethical considerations. Through a live demonstration in PyTorch, we will see how to craft unbiased, transparent, and explainable models.
Speakers
avatar for Rashmi Nagpal

Rashmi Nagpal

Machine Learning Engineer, Patchstack
Rashmi, a passionate researcher at the MIT CSAIL and machine learning engineer at Patchstack, is dedicated to crafting beautiful AI applications. With nearly 5 years of industrial experience, she has brought ideas to life at pre-seed startups and contributed to impactful redesigns... Read More →
Thursday September 19, 2024 4:35pm - 5:00pm PDT
Festival Pavilion - Breakout Room A
  Breakout Sessions
 