September 18-19, 2024
San Francisco, California
Note: The schedule is subject to change.

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for PyTorch Conference 2024 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

This schedule is automatically displayed in Pacific Daylight Time (UTC-7). To see the schedule in your preferred timezone, please select from the drop-down located at the bottom of the menu to the right.

IMPORTANT NOTE: Timing of sessions and room locations are subject to change.

Wednesday, September 18
 

9:12am PDT

Keynote: PyTorch Technical Deep Dive - Piotr Bialecki, NVIDIA; Peng Wu, Will Constable, Kartikay Khandelwal & Mengtao (Martin) Yuan, Meta
Wednesday September 18, 2024 9:12am - 10:12am PDT
This deep dive provides an update on PyTorch development since the last conference and dives into the key new features coming in PyTorch 2.5 and beyond. We will explore how advancements across a range of PyTorch features combine to better support the full model development lifecycle across training, fine-tuning, and deployment.
Speakers
Piotr Bialecki

Director of Engineering, Deep Learning Frameworks, NVIDIA
Piotr joined the PyTorch team at NVIDIA in 2019 and currently manages the team. He drives NVIDIA's effort in maintaining and advancing PyTorch's CUDA backend, and he received the PyTorch SUPERHERO award in 2023 for his community contributions, especially on the PyTorch discussion board...
Peng Wu

Engineering Manager, Meta
Dr. Peng Wu is the engineering manager of the PyTorch Compiler team at Meta. She spent over a decade at IBM Research, working on many aspects of programming systems, then founded the Programming Technologies Lab at Huawei and led its growth for six years. At Meta, she...
Will Constable

Engineer, Meta
Will Constable works on PyTorch distributed algorithms and infrastructure at Meta as an IC and tech lead. Previously, he worked at Intel and Nervana Systems on different parts of the deep learning software stack, including compiler frontends, integrations with TensorFlow and PyTorch, and distributed...
Kartikay Khandelwal

Software Engineer, PyTorch, Meta
Kartikay Khandelwal is a software engineer in the PyTorch and AI Infra team at Meta where he leads the development of the PyTorch ecosystem for Generative AI, including open-source libraries like torchtune for LLM fine-tuning and torchchat for LLM inference. Prior to PyTorch, he worked...
Mengtao (Martin) Yuan

Tech Lead Manager, Meta
Mengtao (Martin) Yuan is a Tech Lead Manager on Meta’s PyTorch Edge team. With multiple years of experience in the AI industry, Mengtao focuses on building software systems that help AI researchers and engineers deploy their models on edge devices such as mobile phones and AR/VR...
Wednesday September 18, 2024 9:12am - 10:12am PDT
Festival Pavilion - Keynote Room
  Keynote Sessions
  • Slides Attached Yes

11:10am PDT

Meta Llama 3 and the Future of Responsible AI Development - Spencer Whitman & Vincent Gonguet, Meta
Wednesday September 18, 2024 11:10am - 11:35am PDT
As AI models become increasingly powerful and pervasive, trust and safety have become top priorities. Join us for a timely talk on Llama 3, our latest foundation model, and the cutting-edge trust and safety models and tools we've developed to ensure responsible AI development. In this talk, we'll dive into:
• The advancements of Llama 3 and its applications
• Our innovative trust and safety approaches, including toxicity detection and mitigation
• The open-source tools and resources we're sharing to empower the community
Discover how Meta is pushing the boundaries of trust and safety and learn how you can integrate these solutions into your own projects. Let's build a safer, more responsible AI future together!
Speakers
Spencer Whitman

Product Manager (AI Security), Meta
Vincent Gonguet

Director, GenAI Trust & Safety, Meta
Wednesday September 18, 2024 11:10am - 11:35am PDT
Gateway Pavilion - Cowell Theater
  Breakout Sessions
  • Audience Any
  • Slides Attached Yes

11:25am PDT

Lightning Talk: Low Precision Dtypes in PyTorch - Vasiliy Kuznetsov, Meta
Wednesday September 18, 2024 11:25am - 11:35am PDT
This talk dives deep into the new native PyTorch float8 training library and previews PyTorch's strategy for supporting upcoming low-precision dtypes such as float6, float4, and MX for efficient training and inference.
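For context on what "scaled" low-precision training means, here is a minimal, library-free sketch of the core idea behind float8-style casting: scale a tensor so its absolute maximum maps to the format's representable maximum (448 is the real float8 E4M3 max), round to a coarse grid, and carry the scale alongside the data. The helper names are ours, and rounding to integers stands in for the real non-uniform float8 grid; this is not the PyTorch float8 API.

```python
# Illustrative sketch (not the torchao/PyTorch float8 API): scaled casting
# as used in float8 training. Values are scaled so the absolute max maps
# to the format's representable max (448 for float8 E4M3), then rounded.
E4M3_MAX = 448.0

def quantize_fp8_e4m3(values):
    """Return (quantized values on an integer grid, scale)."""
    amax = max(abs(v) for v in values)
    scale = E4M3_MAX / amax if amax > 0 else 1.0
    # Real float8 rounds to a non-uniform exponent/mantissa grid;
    # rounding to integers after scaling is a simplified stand-in.
    return [round(v * scale) for v in values], scale

def dequantize(quants, scale):
    return [q / scale for q in quants]

vals = [0.1, -2.5, 3.75, 0.0]
q, s = quantize_fp8_e4m3(vals)
deq = dequantize(q, s)
max_err = max(abs(a - b) for a, b in zip(vals, deq))
```

The scale is recomputed per tensor (in practice, per tensor or per block from an amax history), which is why dynamic range, not just bit width, determines the round-trip error.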
Speakers
Vasiliy Kuznetsov

Software Engineer, Meta
Software Engineer, PyTorch Core
Wednesday September 18, 2024 11:25am - 11:35am PDT
Festival Pavilion - Breakout Room A
  Lightning Talks

2:40pm PDT

Running State-of-Art Gen AI Models on-Device with NPU Acceleration - Felix Baum, Qualcomm
Wednesday September 18, 2024 2:40pm - 3:05pm PDT
Since the boom of generative AI, the industry has been moving toward on-device AI inferencing: no longer just a trend but a necessity for saving costs and achieving the best inference performance with ultra-low latency at the lowest possible power. In this session we go over the new features added to the Qualcomm AI Stack and how it works with the public release of ExecuTorch 1.0. We will discuss how to run traditional workloads as well as GenAI use cases, including the latest version of Llama, on mobile devices using the Qualcomm Hexagon NPU.
Speakers
Felix Baum

Senior Director of Product Management, Qualcomm
Felix Baum has an extensive background of over two decades in the embedded industry, where he has excelled both as an embedded developer and a product manager. Currently he is responsible for AI Software Products at Qualcomm. Prior to that, he led efforts for various real-time operating...
Wednesday September 18, 2024 2:40pm - 3:05pm PDT
Festival Pavilion - Breakout Room B
  Breakout Sessions

2:40pm PDT

Sponsored Session: Accelerating AI Innovation: High Performance PyTorch at AMD - Robert Suderman & Ian Nordeng, AMD
Wednesday September 18, 2024 2:40pm - 3:05pm PDT
Explore the powerful collaboration between AMD and PyTorch, driving advancements in AI and machine learning. Learn how AMD’s Day-0 PyTorch support delivers cutting-edge performance and seamless compatibility.

This session will highlight the technical synergies that make AMD hardware an ideal choice for PyTorch, with real-world examples of accelerated workflows and breakthrough AI applications. Attendees will gain insights into how this dynamic partnership is enabling researchers, developers, and data scientists to push the boundaries of innovation and achieve unprecedented results in AI projects.

Speakers
Robert Suderman

Engineering Manager, AMD
Rob Suderman manages front-end support with AMD’s SHARK AI group with a goal of pushing tier one support for as many ML compute languages as possible. This has included core work on Torch-mlir, JAX, TOSA, and StableHLO, including being a founding team member on the IREE project...
Ian Nordeng

Manager Software Development, AMD
Ian Nordeng is a manager within AMD’s AIG-Sharks group where he spearheads machine learning model development for IREE’s compiler consumption to enable AI workloads to efficiently run across AMD’s hardware portfolio. He has been working in the AI compiler space for the past...
Wednesday September 18, 2024 2:40pm - 3:05pm PDT
Festival Pavilion - Breakout Room A
  Breakout Sessions
  • Slides Attached Yes

3:10pm PDT

TorchInductor CPU Backend Advancements: New Features and Performance Improvements - Jiong Gong & Leslie Fang, Intel
Wednesday September 18, 2024 3:10pm - 3:35pm PDT
This presentation provides an update on the latest advancements in the TorchInductor CPU backend since the last conference, bringing best-in-class CPU performance to a broad range of DL workloads. We will discuss new features and performance enhancements, including:
• Max-autotune support with codegen for GEMMs, boosting performance for GEMM-related operations
• Enhanced vectorized codegen support, now covering all data types beyond floating point, with flexible vector factors and optimized loop scheduling
• Comprehensive quantization support, including weight-only quantization (WoQ), and optimizations for dynamic quantization and quantization-aware training
• Improved attention support, featuring attention masks and optimized softmax via flash attention v2, etc.
• AOTInductor support, enabling high-performance inference with frozen weights
• Native Windows support, with improved vectorization capabilities
These advancements, combined with ongoing optimizations, have resulted in significant performance improvements since PyTorch 2.1, demonstrated through extensive benchmarks and large language models (LLMs).
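Of the features above, weight-only quantization (WoQ) is easy to illustrate: weights are stored as int8 with a per-row scale and dequantized on the fly inside the matmul, while activations stay in floating point. Below is a minimal library-free sketch of the idea; the helper names and per-row symmetric scheme are ours for illustration, not Inductor's generated code.

```python
# Minimal weight-only-quantization (WoQ) sketch: weights stored as int8
# with one scale per output row; activations remain float. This mirrors
# the idea behind Inductor's WoQ support, not its actual codegen.

def quantize_rows(weight):
    """Per-row symmetric int8 quantization: returns (int8 rows, scales)."""
    q_rows, scales = [], []
    for row in weight:
        amax = max(abs(w) for w in row) or 1.0
        scale = amax / 127.0
        q_rows.append([round(w / scale) for w in row])
        scales.append(scale)
    return q_rows, scales

def woq_matvec(q_rows, scales, x):
    """y = W @ x, dequantizing each int8 row on the fly."""
    return [scale * sum(q * xi for q, xi in zip(row, x))
            for row, scale in zip(q_rows, scales)]

W = [[0.5, -1.0], [2.0, 0.25]]
x = [1.0, 2.0]
q_rows, scales = quantize_rows(W)
y = woq_matvec(q_rows, scales, x)   # close to the exact [-1.5, 2.5]
```

Storing only int8 weights plus one float scale per row cuts weight memory roughly 4x versus float32, which is the main win for memory-bound inference.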
Speakers
Leslie Fang

Software Engineer, Intel
Leslie is a software engineer at Intel who has worked on PyTorch performance optimization on x86 servers for the past four years. Currently, he mainly focuses on the feature domains of quantization, Autocast, and the Inductor CPP/OpenMP backend in stock PyTorch.
Jiong Gong

Principal Engineer, Intel
Jiong is a software architect from Intel who works on PyTorch framework optimizations. He is the PyTorch module maintainer for CPU and compiler.
Wednesday September 18, 2024 3:10pm - 3:35pm PDT
Festival Pavilion - Breakout Room B
  Breakout Sessions

3:25pm PDT

Lightning Talk: Extending PyTorch with Custom Python/C++/CUDA Operators - Richard Zou, Meta
Wednesday September 18, 2024 3:25pm - 3:35pm PDT
In this talk, we'll go over the new recommended APIs to extend PyTorch with custom Python/C++/CUDA operators. Users have been able to extend PyTorch with custom operators for years, but we have updated our guidance for creating custom operators that compose with torch.compile, autograd, and other PyTorch subsystems.
Speakers
Richard Zou

Software Engineer, Meta
I'm a software engineer at Meta working on PyTorch. I'm one of the creators of functorch, JAX-like composable function transforms for PyTorch. Nowadays I spend my time working on torch.compile, figuring out how to add infra changes to make it easier for PyTorch features like custom...
Wednesday September 18, 2024 3:25pm - 3:35pm PDT
Festival Pavilion - Breakout Room A
  Lightning Talks

3:25pm PDT

Lightning Talk: Introduction to Torch.Distributed.Pipelining - Howard Huang & Ke Wen, Meta
Wednesday September 18, 2024 3:25pm - 3:35pm PDT
Pipeline parallelism is a technique employed in distributed deep learning that enhances model execution by dividing the model into distinct segments, or "stages." As large language models and other memory-intensive models become more common, pipeline parallelism has grown increasingly important for several key areas:
- Executing large-scale training jobs
- Enhancing performance in bandwidth-limited clusters
- Supporting large model inference
In this talk, we will introduce the `torch.distributed.pipelining` package, which provides users a seamless way of applying pipeline parallelism. We will demonstrate the following features:
- Splitting of model code based on a simple specification
- Support for pipeline schedules, including GPipe, 1F1B, Interleaved 1F1B and Looped BFS, and the infrastructure for writing customized schedules
- Composability with other PyTorch parallel techniques such as data parallel (DDP, FSDP) or tensor parallel
- Out-of-the-box integration with Hugging Face models for efficient inference
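The scheduling idea behind GPipe-style pipelining can be simulated in a few lines of plain Python: stage s runs microbatch m at timestep s + m, so M microbatches clear S stages in S + M - 1 steps rather than S * M. This is a timing sketch only, not the `torch.distributed.pipelining` API.

```python
# Simulate a GPipe-style forward schedule: with S stages and M
# microbatches, stage s processes microbatch m at timestep s + m.
# All forwards finish in S + M - 1 steps, versus S * M if each batch
# traversed the whole pipeline alone.

def gpipe_forward_schedule(num_stages, num_microbatches):
    """Map each timestep to the (stage, microbatch) pairs active then."""
    schedule = {}
    for stage in range(num_stages):
        for mb in range(num_microbatches):
            schedule.setdefault(stage + mb, []).append((stage, mb))
    return schedule

sched = gpipe_forward_schedule(num_stages=3, num_microbatches=4)
total_steps = max(sched) + 1   # 3 + 4 - 1 = 6
```

Schedules like 1F1B reorder backward passes into this same grid to cap the number of in-flight activations, which is where the memory savings over plain GPipe come from.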
Speakers
Howard Huang

Software Engineer, Meta
Howard Huang is a software engineer at Meta. He has been working on PyTorch and the PyTorch distributed team for the past 4 years.
Ke Wen

Software Engineer, Meta
Ke Wen is a software engineer at Meta. He works on PyTorch Distributed features, including pipeline parallelism, distributed inference, and graph-based analysis.
Wednesday September 18, 2024 3:25pm - 3:35pm PDT
Gateway Pavilion - Cowell Theater
  Lightning Talks

4:00pm PDT

[HALIDE] A Halide Backend for TorchInductor - Jason Ansel, Meta
Wednesday September 18, 2024 4:00pm - 4:10pm PDT
This talk will focus on a new Halide backend for TorchInductor, which is in addition to the existing Triton and C++ backends.  The Halide backend is meant to serve as a reference backend to make it easier to extend TorchInductor to support new backend compilers and hardware devices.  Halide has been the inspiration (either in ideas or through forking) of numerous other compiler projects, so it is a good starting point for adding new backends that follow a Halide-like model.
Speakers
Jason Ansel

Research Scientist, Meta
Jason Ansel is a Research Scientist at Meta AI and a technical lead for PyTorch compilers. He started the TorchDynamo and TorchInductor projects, which bring flexible graph capture and a high performance compiler to PyTorch 2. He received a Ph.D. from MIT CSAIL in 2014 with research...
Wednesday September 18, 2024 4:00pm - 4:10pm PDT
Festival Pavilion - Breakout Room B

4:35pm PDT

torchtune: Easy and Accessible Finetuning in Native PyTorch - Evan Smothers, Meta
Wednesday September 18, 2024 4:35pm - 4:45pm PDT
As open-source LLMs have become more capable, a substantial ecosystem has developed around the fine-tuning of these models. A thriving community of researchers, developers, practitioners and hobbyists has emerged which focuses on topics ranging from memory efficiency, parameter-efficient fine-tuning and quantization to performance at scale and reproducible evaluations. The goal of this mini-summit is to bring this community together to discuss ideas, share knowledge and build connections.

The agenda features a keynote from Joe Spisak on the state of the Llama ecosystem, followed by invited talks from the founders of Axolotl, Unsloth and torchtune. We conclude the summit with a riveting discussion on what’s next for LLMs, fine-tuning and the PyTorch ecosystem with a fabulous panel of experts: Tim Dettmers (author of bitsandbytes and QLoRA), Hailey Schoelkopf (maintainer of LM Eval Harness at EleutherAI), Aakanksha Chowdhery (lead author on PaLM and Gemini) and Alexis Conneau (Research Lead at OpenAI).
Speakers
Evan Smothers

Software Engineer, Meta
Evan is a software engineer on the PyTorch Domains team at Meta. He currently works on torchtune, a PyTorch library for memory-efficient fine-tuning of large language models. Prior to joining Meta, Evan worked as a data scientist at Uber and received his Ph.D. in mathematics from...
Wednesday September 18, 2024 4:35pm - 4:45pm PDT
Festival Pavilion - Breakout Room A

5:30pm PDT

Poster Presentations
Wednesday September 18, 2024 5:30pm - 8:30pm PDT
  • Purge the GIL: Improved Torch.DataLoader - Michal Szolucha & Rostan Tabet, NVIDIA
  • XFormers - Daniel Haziza, Meta AI 
  • TritonCC: AOT Triton Workflow for TorchScript C++ Runtime - Sijia Chen & Huamin Li, Meta
  • The PyTorch 2.0 Inference Story - Angela Yi, Bin Bao, Sheng Qin & Sherlock Huang, Meta
  • Tensor Subclasses with PT2 - Brian Hirsh, Meta
  • Streamlining PyTorch Eager Mode Support on New Hardware Backends Through Torch.Compile - Eikan Wang, Intel
  • Sparsifying Vision Transformers with Minimal Accuracy Loss - Jesse Cai, Meta
  • Real-Time Art Creation: Stable Diffusion Fine-Tuning Techniques on Gaudi with PyTorch - Alex Sin & Louie Tsai, Intel Corporation
  • Quantization via AI Edge Torch - Pauline Sho, Google LLC
  • PyTorch Korea User Group: Introduction & Encourage - Junghwan Park, PyTorch Korea User Group & Hyoyoung Chang, Freelancer
  • PyTorch + MAX + Mojo - Nick Kreeger & Jack Clayton, Modular 
  • PT2 Torch.Compile and CPython - William Wen, Meta
  • PT2 Cold and Warm Compile Time Improvements in Torch.Compile - Oguz Ulgen & Animesh Jain, Meta
  • Pre-Train Llama3 Models Using Meta's Torchtitan on Amazon SageMaker - Less Wright, Meta & Roy Allela, AWS
  • Optimizing Memory and Compilation with While_loop - Manfei Bai, Google
  • Non-Linear Quantization Functions for Machine Learning Models - Diogo Emanuel da Costa Venâncio, INESC-ID 
  • Nested Tensors for Ragged Data Handling - Joel Schlosser, Meta
  • `Torch.Tensor.Module_load` and Tensor Subclass Serialization - Mikayla Gawarecki, Meta Platforms
  • Accelerating Generative AI on Ubiquitous CPU Instances with Native PyTorch - Mingfei Ma, Intel
  • Addressing Reverse Kinematics Challenges and Geometric Optimization in Robotics with PyTorch - Blair Birdsell, PhD. Student at University of Alberta 
  • Blazingly Fast LLM Inference with Native PyTorch: Update from the Past Year - Yanbo Liang & Horace He, Meta
  • Boosting in-Browser ML: Accelerate PyTorch Generative Models for the Web - Emma Ning & Kshama Pawar, Microsoft; Joshua Lochner, Hugging Face
  • Democratizing AI, One Byte at a Time: The Bitsandbytes Open-Source Saga, Ft. FSDP+QLoRA Fine-Tuning - Titus von Koeller, Hugging Face
  • Depyf: A Tool to Help Write Code in a Torch.Compile-Friendly Way Through Decompilation - Kaichao You, Tsinghua University/UC Berkeley
  • Exploiting on-Chip AI Accelerator for High-Performance LLM Inference - Hiroshi Inoue & Tabari Alexander, IBM Research - Tokyo
  • ExecuTorch Android and IOS on-Device Demo Poster - Hansong Zhang, Meta
  • Fault Tolerance for Large Scale Training - Tristan Rice & Chirag Pandya, Meta
  • FP8 State of the Art Inference Performance with Pytorch - Chih-Chieh Yang & Adnan Hoque, IBM; Antoni Viros i Martin, IBM Research
  • From FSDP to DeepSpeed and Back Again - Yu Chin Fabian Lim, IBM Research, Singapore
  • Large Scale Transformer Model Training with PyTorch Tensor Parallel API - Tianyu Liu, Meta
  • Model Explorer - Visualizing Pytorch Models - Na Li & Eric Yang, Google
  • PT-D Zero Overhead Checkpointing - Lucas Pasqualin, Meta / PyTorch; Chien-Chin Huang & Iris Zhang, Meta
  • PyTorch Performance Debugging in N-Dimensional Parallelism - Wei Sun & Sreen Tallam, Meta
  • Unlock Up to 5x Faster Inference in PyTorch: Recent Innovations in Torch-TensorRT - Laikh Tewari, NVIDIA
  • Torch-Monitor: A Comprehensive Call Path Profiling Tool for PyTorch - Qidong Zhao, North Carolina State University & Hao Wu, George Mason University
Speakers
Hao Wu

PhD, George Mason University
I am interested in deep learning profilers.
Yanbo Liang

software engineer, Meta
I'm a software engineer on the PyTorch team at Meta, working on torch.compile and LLMs.
Titus von Koeller

ML engineer / lead maintainer bitsandbytes, Hugging Face
Titus, lead maintainer of the independent non-profit bitsandbytes (sponsored by Hugging Face), works on co-engineering the democratization of AI and in his free time cherishes electronic music, queer culture and ski mountaineering. With degrees in Psychology and Computer Science...
Angela Yi

Software Engineer, Meta
I've been working on the PyTorch Compilers team for the past 2 years, mainly working on torch.export!
Animesh Jain

Software Engineer, Meta
Animesh Jain works on PyTorch compilers.
Antoni Viros i Martin

Research Scientist, IBM Research
Antoni is currently a Research Scientist at IBM Research, investigating optimization approaches for ML inference and training, with a focus on open-source technologies such as PyTorch. He holds a PhD in Aerospace Engineering from Texas A&M University, and has previously worked at...
Bin Bao

Software Engineer, Meta
Bin Bao is a software engineer working with the PyTorch Compiler team at Meta. He focuses on developing AOTInductor, an Ahead-of-Time compiler for the PyTorch2 export path.
Daniel Haziza

Research Engineer, Meta AI
Daniel is a Research Engineer at FAIR Paris, working on workload efficiency and developing the xFormers library.
Diogo Venâncio

Researcher, University of Lisbon | INESC-ID
My name is Diogo and I am a Master's student at IST in Lisbon, Portugal, and also an ML Engineer at an early-stage AI startup. I grew up in the suburbs of Lisbon and always strived to have a positive impact on the lives of others. At the age of 20, I built my own company, called OutGoing...
Eikan Wang

AI Frameworks Engineer, Intel
Eikan is a staff engineer at Intel and a DL framework tech lead with full-stack experience in DL, from various AI applications to framework, library, and DL compiler. He is actively optimizing the torch.compile stack for Intel platforms, including the Inductor C++/OpenMP...
Emma Ning

Principal PM, Microsoft
Emma Ning is a Principal PM in the Microsoft AI Framework team, focusing on AI model operationalization and acceleration with ONNX Runtime/Olive for open and interoperable AI. She has more than five years of product experience in search engines taking advantage of machine learning...
Iris Zhang

Software Engineer, Meta
PyTorch Distributed @ Meta
Junghwan Park

Lead Maintainer, PyTorch Korea User Group
Data engineer at a telecommunications company in Korea; lead maintainer of the PyTorch Korea User Group; interested in open source, community, and time-series forecasting.
Kshama Pawar

Principal Program Manager, Microsoft Corporation
Kshama Pawar is a Program Manager on the AI Platform team at Microsoft. She helps drive Training initiatives for both large language models and on-device training through optimization engines like ONNX Runtime. She is also involved in the Triton community effort to improve developer...
Laikh Tewari

Deep Learning Software Product Manager, NVIDIA
Laikh Tewari manages products for inference in deep learning frameworks at NVIDIA and focuses on the usability of performance optimization tools across data center, consumer, and embedded segments. Laikh received his B.S. and M.S. in computer science from Stanford University where...
Mingfei Ma

Senior Software Engineer, Intel
Mingfei Ma is a senior deep learning software engineer at Intel and the maintainer of the CPU performance module in PyTorch. Mingfei holds a Master's degree from Harbin Institute of Technology, where he majored in Control Science and Technology, and has 12 years' experience...
Chien-Chin Huang

Software Engineer, Meta
Software Engineer, PyTorch Distributed, Meta
Mikayla Gawarecki

Software Engineer, Meta Platforms
Software Engineer at Meta on PyTorch Core Team
Baihan Huang

Software Engineer, Meta
Working on PyTorch
Kaichao You

Ph.D. student, Tsinghua University/UC Berkeley
Kaichao You is a fourth-year Ph.D. student at Tsinghua University. He is currently visiting UC Berkeley, working on the vLLM project, a high-throughput and memory-efficient inference and serving engine for LLMs. He is an open-source contributor to PyTorch/Triton, and he leads the...
Brian Hirsh

Software Engineer, Meta
Brian is a software engineer at Meta working on PyTorch core and compilers.
Jesse Cai

Software Engineer, Meta
Jesse is a software engineer on the PyTorch Core Performance team, where he works on accelerating models with sparsity. Before joining Meta, he worked at several startups, focusing on natural language processing.
Pauline Sho

Software Engineer, Google
Software engineer at Google LLC, currently focused on improving the quantization infrastructure for edge devices.
Alex Sin

AI Software Solutions Engineer, Intel
Louie Tsai

AI SW Engineer, Intel
Horace He

Software Engineer, Meta
Adnan Hoque

Research Engineer, IBM
I am a Research Engineer at IBM. I have a Bachelor of Science degree in Electrical Engineering from the University of Alberta. I have worked on machine learning applications in various domains such as computer vision, network security and most recently have been developing kernels...
Blair Birdsell

Data Scientist, Surespan Construction
Blair Birdsell has a MASc in Civil Engineering from the University of Victoria. This background integrates his design and engineering expertise with data science. Over 9 years, Blair has contributed to 4.86 million sq. ft. of building projects and now develops data-driven software...
Chih-Chieh Yang

Research Scientist, IBM
Performance optimization of AI workloads
Chirag Pandya

Software Engineer, Meta
Chirag is a backend engineer who has worked for over 20 years in the software industry. His expertise includes networks, storage, security, and distributed systems, with an emphasis on building fast, secure, and performant systems.
Hansong Zhang

Software Engineer, Meta Platforms
Software Engineer at Meta. Worked on integrating the ExecuTorch framework into Android apps via Java and a JNI library.
Hiroshi Inoue

Research Staff Member, IBM Research - Tokyo
Hiroshi Inoue is a research staff member at IBM Research - Tokyo, where he works on performance optimization of system software. He has a PhD from the University of Tokyo.
Huamin Li

Software Engineer, Meta
Software engineer from Meta PyTorch, focusing on GPU and CPU inference for Meta internal workloads
Hyoyoung Chang

Lead maintainer, PyTorch Korea User Group
Data Engineer
Jack Clayton

AI Developer Advocate, Modular
Jack started his career optimizing autonomous truck software for leading mining companies, including BHP and Caterpillar. Most recently he was designing computer vision software, putting AI inference pipelines into production for IDVerse. He is passionate about the developer community...
Joel Schlosser

Software Engineer, Meta
Engineer with a decade's worth of ML experience across the research, industry, and framework perspectives.
Joshua Lochner

Machine Learning Engineer, Hugging Face
Bringing the power of machine learning to the web. Currently working on Transformers.js (@huggingface 🤗)
Less Wright

PyTorch Partner Engineer, Meta
PyTorch Distributed and CUDA/Triton kernels
Lucas Pasqualin

ML Engineer, PyTorch (Meta)
Lucas has been developing Machine Learning Applications and Machine Learning infrastructure at scale for years, and has recently been focused on extending the product offering of PyTorch's Distributed Checkpointing stack.
Manfei Bai

Software Engineer, Google LLC
Manfei Bai is a software engineer at Google.
Michał Szołucha

Deep Learning Software Engineer, NVIDIA
During his work at NVIDIA, Michał gained vast experience in deep learning software development. He tackled challenges in training and inference, ranging from small-scale to large-scale applications, as well as user-facing tasks and highly optimized benchmarks like MLPerf...
Na Li

Software Engineer, Google
Tech Lead Manager at Google Cloud, leading on-device ML developer tools.
Nick Kreeger

Frameworks Engineering Director, Modular
Software Engineering lead with over 15 years of experience working at Google, Microsoft and a handful of startups. Nick has contributed to many technologies in Machine Learning such as TensorFlow.js, TensorFlow Lite/Micro, and ONNX/ONNXRuntime. Nick enjoys spending his free time with...
Oguz Ulgen

Software Engineer, Meta
I'm a software engineer at Meta where I used to work on the Hack programming language and now work on PyTorch.
Rostan Tabet

Software Engineer, NVIDIA
I am a Computer Science student with a passion for Python and deep learning. During my end-of-studies internship, I focused on leveraging free-threaded Python in the context of NVIDIA's deep learning libraries suite. My work aims to improve data handling efficiency in machine learning...
Roy Allela

Sr AI/ML Specialist Architect, AWS
Roy Allela is a Senior AI/ML Specialist Architect at AWS. Roy helps customers, from small startups to large enterprises, train and deploy large language models efficiently on AWS. He previously spent 8 years at Intel as a Senior AI Software Engineer working on low-level ML framework...
Sheng Qin

Software Engineer, Meta Inc.
Sheng Qin is a software engineer in the PyTorch Accelerator Enablement org at Meta.
Sijia Chen

Software Engineer, Meta / PyTorch
Sijia is a software engineer on the Meta PyTorch Acceleration team, focusing on GPU inference.
Tianyu Liu

Research Scientist, Meta
Tianyu Liu is a Research Scientist on the PyTorch team at Meta, currently working on distributed training. Prior to this, he was a postdoc at Stanford University and has worked on the Ads Core Machine Learning team at Meta. He obtained his PhD degree at the University of Wisconsin-Madison...
Tristan Rice

Software Engineer, Meta
Software engineer working on PyTorch Distributed and large scale training.
Wei Sun

Research Scientist, Meta Platform
Wei Sun supports the Meta AI Infrastructure organization. He brings deep expertise in analyzing ML model execution during training and serving and identifies efficiency/performance bottlenecks across model and system architecture. This has led him to build some of the most comprehensive...
William Wen

Software Engineer, Meta Platforms, Inc.
William works on the torch.compile team, specializing in TorchDynamo.
Yu Chin Fabian Lim

Research Staff Member, IBM Research, Singapore
Fabian Lim is currently at IBM Research, Singapore. During 2013-2016, he worked at Avago Technologies (now Broadcom), then SK Hynix Memory Systems, in San Jose, CA. From 2010 to 2013, he was a postdoc at the Massachusetts Institute of Technology, Cambridge, MA. Dr Lim received the...
Tabari Alexander

STSM, IBM Z AI and Analytics, IBM
Eric Yang

Software Engineer, Google
Sreen Tallam

Software Engineering Manager - AI Performance & Efficiency, Meta
I am a SW Engineering Manager at Meta helping all ML Training & Serving models (RecSys, Content Understanding, GenAI) run optimally and efficiently through various optimization techniques, including scaling them across the entire Meta fleet.
Qidong Zhao

PHD Student, North Carolina State University
Research interests: profiling techniques for different workloads and architectures.
Wednesday September 18, 2024 5:30pm - 8:30pm PDT
Gateway Pavilion - Sponsor Showcase
  Poster Presentations
 
Thursday, September 19
 

10:50am PDT

Lightning Talk: d-Matrix LLM Compression Flow Based on Torch.Fx: Simplifying PTQ/QAT - Zifei Xu & Tristan Webb, d-Matrix Corporation
Thursday September 19, 2024 10:50am - 11:00am PDT
We introduce dmx-compressor, d-Matrix's open-source LLM compression toolkit that is modular, robust, efficient, and user-friendly. It uses symbolic tracing and fx.Transformer for network compression while keeping the model a first-class citizen in PyTorch for the user, despite the graph dynamism prevalent in LLMs. It achieves this by maintaining both the original nn.Module and a just-in-time (JIT) traced and transformed fx.GraphModule representation behind the scenes, together with an abstraction that cleanly decouples network compression from the original model graph definition. This design allows the FXIR to adapt dynamically to diverse forward-call signatures and flow-control arguments throughout quantization-aware training and post-training quantization written in plain PyTorch, yielding a compressed FXIR fully compatible with application-level APIs like the Hugging Face pipeline. We also provide a graph visualizer based on fx.Interpreter for ease of debugging. We believe this project will empower the community to build efficient LLMs for deployment on custom hardware accelerators and contribute to the PyTorch ecosystem.
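The nn.Module-plus-fx.GraphModule pattern described in the abstract can be sketched with plain torch.fx. This is a minimal illustration, not dmx-compressor's actual API: the `FakeQuant` transform, `TinyModel`, and the 4-bit fractional rounding grid are all invented for the example.

```python
import torch
import torch.nn as nn
import torch.fx as fx


class TinyModel(nn.Module):
    """Stand-in for a real network; dmx-compressor targets LLMs."""
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 4)

    def forward(self, x):
        return torch.relu(self.linear(x))


class FakeQuant(fx.Transformer):
    """Toy compression pass: round every submodule output to a fixed-point grid."""
    def call_module(self, target, args, kwargs):
        out = super().call_module(target, args, kwargs)
        # Operations on the returned Proxy are recorded into the new graph.
        return torch.round(out * 2**4) / 2**4


model = TinyModel()                    # original nn.Module, left untouched
gm = fx.symbolic_trace(model)          # traced fx.GraphModule representation
quantized = FakeQuant(gm).transform()  # transformed graph, decoupled from `model`

x = torch.randn(2, 4)
ref = torch.relu(torch.round(model.linear(x) * 16) / 16)
print(torch.allclose(quantized(x), ref))  # True
```

Keeping `model` and `quantized` side by side mirrors the decoupling the abstract describes: the user keeps interacting with the original module while the compressed graph evolves behind the scenes.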
Speakers
avatar for Zifei Xu

Zifei Xu

Senior Machine Learning Research Engineer, d-Matrix Corporation
Zifei is a Senior Machine Learning Research Engineer at d-Matrix. Her current work focuses on developing model quantization pipelines and efficient quantization algorithms. She graduated from Stanford University with a Master's degree in Computational & Mathematical Engineering and... Read More →
avatar for Tristan Webb

Tristan Webb

ML Engineer, d-Matrix
Tristan's background is primarily in computer science and mathematics, which led him to a PhD in Complexity Science at the University of Warwick, where he worked with large computational neuroscience models of spiking neural networks using simulators written in C... Read More →
Thursday September 19, 2024 10:50am - 11:00am PDT
Festival Pavilion - Breakout Room A
  Lightning Talks

11:05am PDT

Lightning Talk: LLMs on Edge with AI Accelerators - Chen Lai, Kimish Patel & Cemal Bilgin, Meta
Thursday September 19, 2024 11:05am - 11:15am PDT
LLMs are compute-heavy and consume a large share of a device's resources (nearly all resources on phones), including memory and power. A natural approach is to leverage AI hardware accelerators, for example the Apple Neural Engine (ANE) on Apple devices and the HTP on Qualcomm SoCs, to run them quickly and efficiently. Only by optimizing model latency, memory consumption, and power usage to a certain level will users be willing to install these models on their devices. In this session, we introduce how we leverage these AI accelerators within the PyTorch ecosystem to achieve state-of-the-art on-device performance for llama3, via ExecuTorch and partnerships with Apple and Qualcomm. Hardware companies usually have their own AI accelerators, and these often have different characteristics: one may support a different set of operators than another, and some support only static shapes (like the HTP). Transformer-based optimizations, however, can be generic. We will discuss in more detail how we apply both the generic and the backend-specific optimizations. The techniques described are not limited to LLMs and can be applied to other transformer-based models.
Speakers
avatar for Kimish Patel

Kimish Patel

Software Engineer, Meta Platforms
Kimish has worked on enabling PyTorch on Meta's family of apps, primarily focusing on performance optimizations. His past experiences include hardware/software co-design, CPU architecture, and CPU/GPU performance optimization.
avatar for Chen Lai

Chen Lai

Software Engineer, Meta
Software engineer focusing on bringing up accelerators on devices.
avatar for Cemal Bilgin

Cemal Bilgin

Engineering Manager, Meta
Engineering Manager, PyTorch Edge Acceleration
Thursday September 19, 2024 11:05am - 11:15am PDT
Festival Pavilion - Breakout Room A
  Lightning Talks

11:20am PDT

Lightning Talk: Building and Supporting the Chinese PyTorch Community: Resources, Tutorials, and Engagement - Zong Zesheng, Huawei
Thursday September 19, 2024 11:20am - 11:30am PDT
This session provides a comprehensive introduction to the Chinese PyTorch community, with the goal of inspiring more users to join and contribute, fostering a vibrant and inclusive environment for PyTorch enthusiasts in China. Chinese PyTorch homepage: an introduction to the official Chinese version of the PyTorch website, highlighting its features, with navigation tips and key sections such as documentation, tutorials, and community events, strengthening the connection between users in China and the PyTorch community. Localized tutorials and documentation: the 2.x releases had no translated documentation, making it hard for beginners who are not fluent in English to keep up with the latest PyTorch features; we translated the official documents and tutorials, covering everything from basic PyTorch concepts to advanced applications. Interactive tutorials: there were previously no interactive tutorials (like Google Colab) for Chinese students and beginners, who had to set up an environment before getting started with PyTorch; an online notebook with tutorials is now available so beginners can practice and tune each step.
Speakers
avatar for Zong Zesheng

Zong Zesheng

Software Engineer, Huawei
Currently working to give Chinese users easier access to PyTorch resources and to create a friendly experience for beginners.
Thursday September 19, 2024 11:20am - 11:30am PDT
Gateway Pavilion - Cowell Theater
  Lightning Talks

11:20am PDT

Sponsored Session: Torchchat: A Showcase of PyTorch LLM Ubiquity - Jack Khuu & Jesse White, Meta
Thursday September 19, 2024 11:20am - 11:45am PDT
This talk explores the journey of enabling LLMs in the PyTorch ecosystem, as well as how the teams behind AOT Inductor, ExecuTorch, and torchao collaborated to create torchchat, a showcase of PyTorch’s ability to run LLM inference everywhere.

Torchchat demonstrates the ubiquity, simplicity, and quality of PyTorch's LLM support through performant, reproducible implementations not only for Python environments but also for desktop, server, and on-device deployment.

All of our work is open source and available on GitHub.
Speakers
avatar for Jack Khuu

Jack Khuu

Software Engineer, Meta
Software Engineer @ Meta working on the PyTorch Edge team. TL for torchchat, which is PyTorch's showcase of LLM inference ubiquity (Python, Desktops, Mobile, etc.). More broadly, I focus on the "Experience" of PyTorch Edge, encompassing User, Developer, and Community Experience.Ex-Lecturer... Read More →
avatar for Jesse White

Jesse White

Software Engineering Manager, Meta
Jesse is an engineering manager at PyTorch @ Meta, where he supports the Edge Experience team in improving the experience for on-device inference and training, including mobile, laptops, and embedded devices. With nearly 20 years of experience in startups, Jesse is passionate about... Read More →
Thursday September 19, 2024 11:20am - 11:45am PDT
Festival Pavilion - Breakout Room A
  Breakout Sessions

12:00pm PDT

Lightning Talk: Optimized PyTorch Inference on aarch64 Linux CPUs - Sunita Nadampalli, Amazon (AWS)
Thursday September 19, 2024 12:00pm - 12:10pm PDT
Over the last two years we have optimized the performance of PyTorch on Arm processors. The optimizations include changes to ATen, C10, MKLDNN operators, the GEMM backend, and TorchInductor. In many cases, instead of writing our own kernels, we integrated the Arm Compute Library, used fastmath kernels with formats like bf16, implemented operator caching, and selected the optimal backend based on the input context. Through these optimizations we improved performance by over 2x. In this presentation we will first describe how we went through this process, what the optimizations are, performance numbers on AWS Graviton3 processors for around 75 models, and CI/CD workflow details. Next, we will walk through a sample PyTorch application showing basic usage, how to tune the runtime, and the resulting speedup. Attendees will leave knowing what PyTorch performance optimizations exist for Arm processors, how to use them, and where they can collaborate to further improve PyTorch for aarch64 CPUs.
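A hedged illustration of the runtime-tuning step mentioned above. The oneDNN fastmath environment variable is real but must be set before the first oneDNN-backed operator runs, and which knobs are effective depends on your PyTorch build and CPU; treat this as a sketch based on published Graviton guidance, not an authoritative recipe.

```python
import os

# Ask oneDNN to use bf16 fastmath kernels for eligible fp32 GEMMs on Arm
# CPUs that support them. Must be set before PyTorch runs its first
# oneDNN-backed operator, hence before the import below.
os.environ["DNNL_DEFAULT_FPMATH_MODE"] = "BF16"

import torch

# Use all cores for intra-op parallelism (a reasonable default, not a rule).
torch.set_num_threads(os.cpu_count() or 4)

with torch.inference_mode():
    a = torch.randn(256, 256)
    b = torch.randn(256, 256)
    out = a @ b  # dispatched to the optimized GEMM backend where available
print(out.shape)
```

The fastmath mode trades a small amount of fp32 precision for bf16 throughput, so it is worth validating model accuracy after enabling it.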
Speakers
avatar for Sunita Nadampalli

Sunita Nadampalli

Software Development Manager, Amazon/AWS
Sunita Nadampalli is a Software Development Manager at AWS. She leads Graviton software performance optimizations for AI/ML and HPC workloads. She is passionate about open source software development and delivering high-performance and sustainable software solutions with Arm SoCs... Read More →
Thursday September 19, 2024 12:00pm - 12:10pm PDT
Festival Pavilion - Breakout Room B
  Lightning Talks
  • Audience Any
  • Slides Attached Yes

12:10pm PDT

Lightning Talk: Implementing and Using Iterable Datasets: What Could Go Wrong? - Nicolas Hug, Meta
Thursday September 19, 2024 12:10pm - 12:20pm PDT
PyTorch supports two kinds of datasets: Iterable datasets and indexable "map-style" datasets. Iterable datasets can be more flexible and potentially faster than their indexable cousins. They are also much harder to use correctly, and can easily lead to silently wrong results. This talk is a quick and fun intro to some of the traps that Iterable datasets lay out for you, with some tips to help you avoid them.
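A minimal sketch of the classic multi-worker trap this kind of talk covers (the datasets and numbers are invented for illustration): a naive IterableDataset replays its full stream in every DataLoader worker, so `num_workers=2` silently doubles the data, while sharding with `get_worker_info` keeps the workers disjoint.

```python
import torch
from torch.utils.data import DataLoader, IterableDataset, get_worker_info


class NaiveRange(IterableDataset):
    """Trap: every worker replays the whole stream -> duplicated samples."""
    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return iter(range(self.n))


class ShardedRange(IterableDataset):
    """Fix: each worker yields a disjoint slice of the stream."""
    def __init__(self, n):
        self.n = n

    def __iter__(self):
        info = get_worker_info()
        if info is None:            # single-process data loading
            start, step = 0, 1
        else:                        # worker i takes every num_workers-th item
            start, step = info.id, info.num_workers
        return iter(range(start, self.n, step))


# With num_workers=2, NaiveRange(4) would yield 8 items (each one twice),
# while ShardedRange(4) would yield exactly 4. Single-process, both agree:
loader = DataLoader(ShardedRange(4), batch_size=None, num_workers=0)
print(sorted(int(x) for x in loader))  # [0, 1, 2, 3]
```

The failure is silent because nothing errors: the loader just yields more (duplicated) samples per epoch, which quietly skews training statistics.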
Speakers
avatar for Nicolas Hug

Nicolas Hug

Research Engineer, Meta
Nicolas is a software engineer in the PyTorch team at Meta, where he mainly contributes to the torchvision library. Prior to that, Nicolas was a research scientist at Columbia University, where he became part of the scikit-learn core development team. Nicolas holds a PhD in machine... Read More →
Thursday September 19, 2024 12:10pm - 12:20pm PDT
Gateway Pavilion - Cowell Theater
  Lightning Talks

4:35pm PDT

Unlocking the Enigma: Crafting Unbiased, Transparent, and Explainable Large Language Models - Rashmi Nagpal, Patchstack
Thursday September 19, 2024 4:35pm - 5:00pm PDT
In an era where artificial intelligence reigns supreme, the statistics are both perplexing and thought-provoking: a mere 13% of large language models manage to transcend the realm of research and enter the practical world of production. Who bears the responsibility when these models err, producing biased or discriminatory outputs? It's time to demystify the complex landscape of machine learning ethics and carve a path towards a brighter, more accountable future! In this talk, we will first navigate the profound impacts of large language models across diverse domains, from lifesaving advances in medicine to safeguarding nations through enhanced security protocols. Secondly, as we marvel at the data-driven decisions made by these models, we will confront the darker shadows they cast: the looming spectre of bias in the data. Finally, we will delve into the art of building interpretable models and navigating the maze of ethical considerations. Through a live demonstration in PyTorch, we will see how to craft unbiased, transparent, and explainable models.
Speakers
avatar for Rashmi Nagpal

Rashmi Nagpal

Machine Learning Engineer, Patchstack
Rashmi, a passionate researcher at the MIT CSAIL and machine learning engineer at Patchstack, is dedicated to crafting beautiful AI applications. With nearly 5 years of industrial experience, she has brought ideas to life at pre-seed startups and contributed to impactful redesigns... Read More →
Thursday September 19, 2024 4:35pm - 5:00pm PDT
Festival Pavilion - Breakout Room A
  Breakout Sessions
 