September 18-19, 2024
San Francisco, California
View More Details & Registration
Note: The schedule is subject to change.

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for PyTorch Conference 2024 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

This schedule is automatically displayed in Pacific Daylight Time (UTC-7). To see the schedule in your preferred timezone, please select from the drop-down located at the bottom of the menu to the right.

IMPORTANT NOTE: Timing of sessions and room locations are subject to change.

View All Dates
Thursday, September 19
 

8:30am PDT

Registration & Badge Pick-Up
Thursday September 19, 2024 8:30am - 6:00pm PDT
Gateway Pavilion - Foyer

9:00am PDT

Keynote: Welcome Back & Opening Remarks
Thursday September 19, 2024 9:00am - 9:05am PDT
Festival Pavilion - Keynote Room

9:07am PDT

Keynote: Why You Should Think Twice Before Paying for an Evaluation Tool - Chip Huyen, VP of AI & OSS, Voltron Data
Thursday September 19, 2024 9:07am - 9:22am PDT
Open-ended evaluation is hard, and the number of evaluation tools has exploded in response to this challenge. However, if tools could solve evaluation, evaluation would have been solved by now. While the right tools can make your life easier, this talk discusses why you should think twice before outsourcing your evaluation to an external tool.
Speakers
avatar for Chip Huyen

Chip Huyen

VP of AI & OSS, Voltron Data
Chip Huyen works to accelerate data analytics on GPUs at Voltron Data. She also advises companies on building AI platforms. Previously, she was with Snorkel AI and NVIDIA, founded an AI infrastructure startup (acquired), and taught Machine Learning Systems Design at Stanford. She’s... Read More →
Thursday September 19, 2024 9:07am - 9:22am PDT
Festival Pavilion - Keynote Room

9:24am PDT

Keynote: Navigating the Architectural Timeline of LLMs - Sebastian Raschka, Staff Research Engineer, Lightning AI
Thursday September 19, 2024 9:24am - 9:39am PDT
The evolution of large language models (LLMs) from the original Generative Pre-trained Transformer (GPT) series to the recent advancements seen in models like Llama 3 has been accompanied by several architectural and methodological innovations. This talk aims to catch attendees up on the latest AI and LLM development trends, highlighting the key changes and motivations that led to the development of recent state-of-the-art LLMs, such as Llama 3.1.

Specifically, this presentation explores key developments in attention mechanisms, such as sliding window attention, grouped-query attention, multi-query attention, and FlashAttention, and explains their key motivations and advantages. In addition to exploring the structural changes, this presentation also reviews the recent "tricks of the trade" that have improved the training processes and performance of the latest LLMs. This includes the recent two-step pretraining approach in Llama 3.1 and the application of knowledge distillation techniques, using real datasets (as in Gemma 2) and synthetic data (as in Llama 3.1).

Moreover, we will also examine the integration of system-level optimizations, such as the Mixture-of-Experts method and the hybrid model Samba, which combines Mamba techniques with attention mechanisms and illustrates a broader trend toward more specialized and efficient architectures.

This talk will provide attendees with an understanding of the most notable transformations that have defined the architectural timeline of LLMs.
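For readers unfamiliar with grouped-query attention, one of the mechanisms mentioned above, the following minimal sketch shows the core idea: fewer key/value heads than query heads, with each key/value head shared across a group of query heads. It is an illustrative example, not material from the talk.

```python
# Minimal grouped-query attention (GQA) sketch: n_kv_heads < n_heads, and each
# group of query heads shares one K/V head. Shapes and names are illustrative.
import torch
import torch.nn.functional as F

def grouped_query_attention(q, k, v):
    # q: (batch, n_heads, seq, head_dim); k, v: (batch, n_kv_heads, seq, head_dim)
    n_heads, n_kv_heads = q.shape[1], k.shape[1]
    group = n_heads // n_kv_heads
    # Repeat each K/V head so it is shared by `group` query heads.
    k = k.repeat_interleave(group, dim=1)
    v = v.repeat_interleave(group, dim=1)
    return F.scaled_dot_product_attention(q, k, v, is_causal=True)

q = torch.randn(1, 32, 16, 64)   # 32 query heads
k = torch.randn(1, 8, 16, 64)    # 8 shared K/V heads
v = torch.randn(1, 8, 16, 64)
out = grouped_query_attention(q, k, v)  # (1, 32, 16, 64)
```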
Speakers
avatar for Sebastian Raschka, PhD

Sebastian Raschka, PhD

Staff Research Engineer, Lightning AI
Sebastian Raschka, PhD, has been working in machine learning and AI for more than a decade. In addition to being a researcher, Sebastian has a strong passion for education. He is known for his bestselling books on machine learning with Python and his contributions to open source. Sebastian... Read More →
Thursday September 19, 2024 9:24am - 9:39am PDT
Festival Pavilion - Keynote Room

9:41am PDT

Keynote: Building an Advanced Knowledge Assistant - Jerry Liu, Co-Founder & CEO, LlamaIndex
Thursday September 19, 2024 9:41am - 9:56am PDT
A huge promise for LLMs is being able to answer questions and solve tasks of arbitrary complexity over an arbitrary number of data sources. The world has started to shift from simple RAG stacks, which are mostly good for answering pointed questions, to agents that can more autonomously reason over a diverse set of inputs, and interleave retrieval and tool use to produce sophisticated outputs.

Building a reliable multi-agent system is challenging. There's a core question of developer ergonomics and production deployment - what makes sense outside a notebook setting. In this talk we outline some core building blocks for building advanced research assistants, including advanced RAG modules, event-driven workflow orchestration, and more.
Speakers
avatar for Jerry Liu

Jerry Liu

CEO, LlamaIndex
Jerry is the co-founder/CEO of LlamaIndex, the data framework for building LLM applications. Before this, he spent his career at the intersection of ML, research, and startups. He led the ML monitoring team at Robust Intelligence, did self-driving AI research at Uber ATG and worked... Read More →
Thursday September 19, 2024 9:41am - 9:56am PDT
Festival Pavilion - Keynote Room

9:58am PDT

Keynote: Ray: A Distributed Framework for Heterogeneous Computing - Ion Stoica, Professor, UC Berkeley
Thursday September 19, 2024 9:58am - 10:13am PDT
Ray has recently become the framework of choice for scaling machine learning workloads—from data preprocessing, to training, fine-tuning, and serving. This talk will highlight Ray’s key features responsible for its flexibility and generality, as well as its recent support for GPUs.
Speakers
avatar for Ion Stoica

Ion Stoica

Professor, UC Berkeley
Ion Stoica is a Professor in the EECS Department at the University of California at Berkeley, and the Director of Sky Computing Lab (https://sky.cs.berkeley.edu/). He is currently doing research on cloud computing and AI systems. Past work includes Ray, Apache Spark, Apache Mesos, Tachyon, Chord DHT, and Dynamic Packet State (DPS). He is an Honorary Member of the Romanian Academy, an ACM Fellow and has received numerous awards, including the Mark Weiser Award (2019... Read More →
Thursday September 19, 2024 9:58am - 10:13am PDT
Festival Pavilion - Keynote Room

10:15am PDT

Keynote: Contributor Awards
Thursday September 19, 2024 10:15am - 10:25am PDT
Festival Pavilion - Keynote Room

10:25am PDT

Coffee Break
Thursday September 19, 2024 10:25am - 10:50am PDT
Feeling hungry? Explore the food and beverage options available throughout the PyTorch Conference, complete with a map to guide you!
Thursday September 19, 2024 10:25am - 10:50am PDT
Gateway Pavilion - Sponsor Showcase

10:25am PDT

Sponsor Showcase
Thursday September 19, 2024 10:25am - 8:00pm PDT
Gateway Pavilion - Sponsor Showcase

10:50am PDT

Lightning Talk: d-Matrix LLM Compression Flow Based on Torch.Fx: Simplifying PTQ/QAT - Zifei Xu & Tristan Webb, d-Matrix Corporation
Thursday September 19, 2024 10:50am - 11:00am PDT
We introduce dmx-compressor, d-Matrix's open-source LLM compression toolkit that is modular, robust, efficient, and user-friendly. It utilizes symbolic tracing and fx.Transformer for network compression while keeping the model a first-class citizen in PyTorch for the user, despite prevalent graph dynamism in LLMs. It achieves this by maintaining both the original nn.Module and a just-in-time (JIT) traced and transformed fx.GraphModule representation behind the scenes, in conjunction with an abstraction that cleanly decouples network compression from the original model graph definition. This design allows the FXIR to dynamically adapt to diverse forward call signatures and flow-control arguments throughout quantization-aware training and post-training quantization written in plain PyTorch, yielding a compressed FXIR fully compatible with application-level APIs like the Hugging Face pipeline. We also provide a graph visualizer based on fx.Interpreter for ease of debugging. We believe this project shall empower the community to build efficient LLMs for deployment on custom hardware accelerators and contribute to the PyTorch ecosystem.
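The abstract above leans on torch.fx symbolic tracing and fx.Transformer. As a generic illustration (not dmx-compressor itself, whose internals are not shown here), the sketch below traces a small module and applies a Transformer pass that rewrites one op — the same mechanism a compression flow can use to swap in quantized implementations.

```python
# Generic torch.fx sketch (not dmx-compressor): trace a module and apply a
# Transformer pass that swaps torch.relu calls for torch.sigmoid, standing in
# for a quantization/compression rewrite.
import torch
import torch.fx as fx

class Tiny(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.lin = torch.nn.Linear(4, 4)
    def forward(self, x):
        return torch.relu(self.lin(x))

class SwapRelu(fx.Transformer):
    def call_function(self, target, args, kwargs):
        if target is torch.relu:
            return super().call_function(torch.sigmoid, args, kwargs)
        return super().call_function(target, args, kwargs)

traced = fx.symbolic_trace(Tiny())          # fx.GraphModule
rewritten = SwapRelu(traced).transform()    # new GraphModule with the rewrite applied
print(rewritten.code)
```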
Speakers
avatar for Zifei Xu

Zifei Xu

Senior Machine Learning Research Engineer, d-Matrix Corporation
Zifei is a Senior Machine Learning Research Engineer at d-Matrix. Her current work focuses on developing model quantization pipelines and efficient quantization algorithms. She graduated from Stanford University with a Master's degree in Computational & Mathematical Engineering and... Read More →
avatar for Tristan Webb

Tristan Webb

ML Engineer, d-Matrix
Tristan's background is primarily in computer science and mathematics, which led him to a PhD in Complexity Science at the University of Warwick, where he worked with large computational neuroscience models of spiking neural networks using simulators written in C... Read More →
Thursday September 19, 2024 10:50am - 11:00am PDT
Festival Pavilion - Breakout Room A
  Lightning Talks

10:50am PDT

Sponsored Session: Democratizing AI: Powering the Future with Arm’s Global Compute Ecosystem - Gian Marco Iodice, Arm
Thursday September 19, 2024 10:50am - 11:15am PDT
Arm is excited to be at the center of the world's largest compute ecosystem at the dawn of the AI era. A key tenet of our mission is to democratize AI capabilities, empowering millions of developers to put advanced AI features into the hands of billions of users.

In this presentation, we'll explore how Arm is enabling the world's leading open-source AI frameworks to leverage power-efficient Arm-based computing platforms and Arm architecture features as a tool for enabling fast and secure AI workloads. The session focuses on how our strategic partnership with the PyTorch and ExecuTorch community is enabling a seamless and transparent developer experience to run workloads everywhere from cloud to edge. This session will highlight some of our optimized libraries, upstreamed contributions, and a wealth of AI-related developer material to build the future of AI on Arm.
Speakers
avatar for Gian-Marco Iodice

Gian-Marco Iodice

GenAI Engineering Lead, Arm
Gian Marco Iodice is an experienced edge and mobile computing specialist at Arm for machine learning (ML) and leads engineering development for on-device GenAI. He received the MSc with honors in electronic engineering from the University of Pisa (Italy), where he specialized in HW/SW... Read More →
Thursday September 19, 2024 10:50am - 11:15am PDT
Gateway Pavilion - Cowell Theater

10:50am PDT

The Rise of `Transformers` in the Growing PyTorch Ecosystem - Arthur Zucker, Hugging Face
Thursday September 19, 2024 10:50am - 11:15am PDT
Explore how the `transformers` library grows and adapts to the fast-paced and ever-changing AI field to bring the best to the AI community.
Speakers
avatar for Arthur Zucker

Arthur Zucker

Core Maintainer, Hugging Face
Arthur is a Core maintainer at Hugging Face, maintaining several critical libraries such as transformers and tokenizers. He is the owner of the text and LLM parts of Hugging Face's open-source toolkits, resulting in the implementations of LLaMa, Mistral, MoEs, etc and torch.compile... Read More →
Thursday September 19, 2024 10:50am - 11:15am PDT
Festival Pavilion - Breakout Room B

11:05am PDT

Lightning Talk: LLMs on Edge with AI Accelerators - Chen Lai, Kimish Patel & Cemal Bilgin, Meta
Thursday September 19, 2024 11:05am - 11:15am PDT
LLMs are known to be compute heavy and to consume lots of resources (almost all resources on phones), including memory and power. A natural thought is to leverage AI hardware accelerators, for example the Apple Neural Engine (ANE) on Apple devices and the HTP on Qualcomm SoCs, to make them run fast and efficiently. Only by optimizing model latency, memory consumption and power usage to a certain level will users be interested in installing the models on their devices. In this session, we'd like to introduce how we leverage these AI accelerators within the PyTorch ecosystem to achieve state-of-the-art performance for Llama 3 on device, via ExecuTorch and our partnerships with Apple and Qualcomm. Hardware companies usually have their own AI accelerators, and these likely have different characteristics: one may support a different set of operators than another, and one may only support static shapes (like the HTP). However, transformer-based optimizations can be generic. We'll discuss in more detail how we apply both the generic optimizations and the backend-specific optimizations. The techniques we applied here are not just for LLMs; they can be applied to other transformer-based models.
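For context on the ExecuTorch path mentioned above, here is a heavily hedged sketch of the documented export flow: capture with torch.export, lower to the edge dialect, and serialize a .pte program. The executorch.exir names are assumptions based on the public docs and may differ across releases; backend-specific lowering for ANE/HTP is omitted.

```python
# Hedged sketch of an ExecuTorch export path. Assumes the executorch package is
# installed; API names follow its documented flow but may vary by version.
import torch
from executorch.exir import to_edge  # assumed entry point

class Small(torch.nn.Module):
    def forward(self, x):
        return torch.nn.functional.silu(x @ x.T)

example = (torch.randn(4, 4),)
exported = torch.export.export(Small(), example)   # ahead-of-time graph capture
edge = to_edge(exported)                           # lower to the edge dialect
et_program = edge.to_executorch()                  # final ExecuTorch program
with open("model.pte", "wb") as f:
    f.write(et_program.buffer)
```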
Speakers
avatar for Kimish Patel

Kimish Patel

Software Engineer, Meta Platforms
Kimish has worked on enabling PyTorch on Meta's family of apps, primarily focusing on performance optimizations. His past experiences include hardware/software co-design, CPU architecture, and CPU/GPU performance optimization.
avatar for Chen Lai

Chen Lai

Software Engineer, Meta
Software engineer focusing on bringing up accelerators on devices.
avatar for CEMAL Bilgin

CEMAL Bilgin

Engineering Manager, Meta
Engineering Manager PyTorch Edge Acceleration
Thursday September 19, 2024 11:05am - 11:15am PDT
Festival Pavilion - Breakout Room A
  Lightning Talks

11:20am PDT

Lightning Talk: Building and Supporting the Chinese PyTorch Community: Resources, Tutorials, and Engagement - Zong Zesheng, Huawei
Thursday September 19, 2024 11:20am - 11:30am PDT
This session provides a comprehensive introduction to the Chinese PyTorch community, with the hope of inspiring more users to join and contribute, fostering a vibrant and inclusive environment for PyTorch enthusiasts in China. Chinese PyTorch homepage: an introduction to the official Chinese version of the PyTorch website, highlighting its features, with navigation tips and key sections such as documentation, tutorials, and community events, to better connect users in China with the PyTorch community. Localized tutorials and documentation: the 2.x releases had no translated version, making it hard for beginners who are not comfortable with English to keep up with the latest PyTorch features, so we translated the official documents and tutorials, covering everything from basic PyTorch concepts to advanced applications. Interactive tutorials: there were previously no interactive tutorials (like Google Colab) for Chinese students and beginners, who had to set up an environment before getting started with PyTorch; an online notebook and tutorials are now available for beginners to practice and tune their first steps.
Speakers
avatar for zong zesheng

zong zesheng

Software Engineer, Huawei
Currently working to give Chinese users easier access to PyTorch resources and to create a friendly experience for beginners.
Thursday September 19, 2024 11:20am - 11:30am PDT
Gateway Pavilion - Cowell Theater
  Lightning Talks

11:20am PDT

Sponsored Session: Torchchat: A Showcase of PyTorch LLM Ubiquity - Jack Khuu & Jesse White, Meta
Thursday September 19, 2024 11:20am - 11:45am PDT
This talk explores the journey of enabling LLMs in the PyTorch ecosystem, as well as how the teams behind AOT Inductor, ExecuTorch, and torchao collaborated to create torchchat, a showcase of PyTorch’s ability to run LLM inference everywhere.

Torchchat demonstrates the ubiquity, simplicity, and quality of PyTorch's LLM support through performant, reproducible implementations not only for Python environments, but also on desktop, server, and on-device.

All of our work is open source and available on GitHub.
Speakers
avatar for Jack Khuu

Jack Khuu

Software Engineer, Meta
Software Engineer @ Meta working on the PyTorch Edge team. TL for torchchat, which is PyTorch's showcase of LLM inference ubiquity (Python, Desktops, Mobile, etc.). More broadly, I focus on the "Experience" of PyTorch Edge, encompassing User, Developer, and Community Experience.Ex-Lecturer... Read More →
avatar for Jesse White

Jesse White

Software Engineering Manager, Meta
Jesse is an engineering manager at PyTorch @ Meta, where he supports the Edge Experience team in improving the experience for on-device inference and training, including mobile, laptops, and embedded devices. With nearly 20 years of experience in startups, Jesse is passionate about... Read More →
Thursday September 19, 2024 11:20am - 11:45am PDT
Festival Pavilion - Breakout Room A
  Breakout Sessions

11:20am PDT

Training MoEs at Scale with PyTorch - Mihir Patel & Brian Chu, Databricks
Thursday September 19, 2024 11:20am - 11:45am PDT
Mixture-of-Experts (MoE) models are becoming an increasingly popular architecture choice for large language models (LLMs). In this talk, we describe how to train MoE models with PyTorch. After discussing various performance tradeoffs, we use PyTorch distributed tools like DTensor to build custom parallelism approaches, including expert parallelism via MegaBlocks. We then show how to get near linear scaling to thousands of GPUs, combining PyTorch FSDP and HSDP with our parallelism strategies. We discuss many of the challenges of training at scale, including communication bottlenecks, hardware failures, and networking challenges. We further improve training at scale setups using tools like PyTorch Distributed Checkpointing for rapid saving and loading. We then highlight further optimizations to minimize challenges only present at scale, such as object store failures for large checkpoints.
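As a rough sketch of the HSDP-style setup described above (not the speakers' code; the mesh sizes and model are placeholders), FSDP can be combined with a 2D DeviceMesh that shards within one dimension and replicates across the other:

```python
# Minimal HSDP sketch: shard within a node, replicate across nodes, via a 2D
# DeviceMesh. Assumes launch with torchrun and NCCL; sizes are placeholders.
import torch
import torch.distributed as dist
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP, ShardingStrategy

dist.init_process_group("nccl")
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

# e.g. 4 nodes x 8 GPUs: replicate across the first dim, shard across the second
mesh = init_device_mesh("cuda", (4, 8), mesh_dim_names=("replicate", "shard"))

model = torch.nn.Transformer(d_model=512, num_encoder_layers=4).cuda()
model = FSDP(model, device_mesh=mesh, sharding_strategy=ShardingStrategy.HYBRID_SHARD)
```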
Speakers
avatar for Mihir Patel

Mihir Patel

Research Engineer, Databricks
Mihir Patel is a Research Engineer at MosaicML / Databricks, where he works on distributed training at scale and serves as the tech lead for Composer, an open-source deep learning training library. His primary focus is on large model training, and he has helped build several open... Read More →
avatar for Brian Chu

Brian Chu

Research Engineer, Databricks
Brian is a Research Engineer at MosaicML / Databricks, where he contributes to Composer and Foundry, open-source libraries for training LLMs. He has been involved in the DBRX project and products like the Databricks finetuning and pretraining API. Prior to joining Databricks, Brian... Read More →
Thursday September 19, 2024 11:20am - 11:45am PDT
Festival Pavilion - Breakout Room B

11:35am PDT

Lightning Talk: Distributing a Million Open Models in the Wild: Lessons Learned from the Hugging Face Hub - Omar Sanseviero, Hugging Face
Thursday September 19, 2024 11:35am - 11:45am PDT
The Hugging Face Hub has over 300,000 PyTorch models. Distributing that many models poses challenges. In this talk, Omar will share how the community has tackled these challenges, including techniques to ensure torch model security and tooling for researchers to share their models. He'll also take attendees on a journey through the evolution of torch models distributed by the community, highlighting new trends and directions. This talk will give attendees practical insights into the latest developments in model distribution and ecosystem trends.
Speakers
avatar for Omar Sanseviero

Omar Sanseviero

Chief Llama Officer - Head of Platform and Community, Hugging Face
Omar Sanseviero is the Chief Llama Officer and Head of Platform and Community at Hugging Face, where he works at the intersection of open source, community, and product. Omar leads multiple ML teams that work on topics such as Mobile ML, ML for art, and ML Partnerships. Previously... Read More →
Thursday September 19, 2024 11:35am - 11:45am PDT
Gateway Pavilion - Cowell Theater

11:50am PDT

Lightning Talk: Empowering Developers: Tools and Resources for Running Generative AI on Arm CPUs - Pareena Verma, Arm
Thursday September 19, 2024 11:50am - 12:00pm PDT
As the demand for accessible and scalable AI solutions grows, leveraging CPUs for generative AI offers significant advantages in cost, energy efficiency and widespread availability. This session aims to equip developers with the ecosystem of tools, resources and technical content needed to effectively run generative AI use cases on Arm CPUs. We have launched a range of easily digestible tutorials for developers, part of our Learning Paths on https://learn.arm.com/, which demonstrate how you can easily and efficiently run small and large language models on Arm-based devices. Learn about end-to-end workflows to accelerate PyTorch based sentiment analysis models from Hugging Face on Arm servers with optimizations in Arm Compute Library kernels for fp32 and bfloat16. Use the new KleidiAI library to accelerate LLMs with AI frameworks and build an Android chat app on your Arm mobile device with ExecuTorch and XNNPACK. Find out about our roadmap for learning content demonstrating the feasibility and successful deployment of generative AI on Arm-based devices. Help us shape the support that we offer developers.
Speakers
avatar for Pareena Verma

Pareena Verma

Principal Solutions Architect, Arm
Pareena is a Principal Solutions Architect at Arm. She has extensive experience working with software developers and SoC architects on numerous Arm based projects involving usage of modeling, ML frameworks, compilers, debuggers and virtual prototyping simulation tools. Pareena holds... Read More →
Thursday September 19, 2024 11:50am - 12:00pm PDT
Festival Pavilion - Breakout Room B

11:50am PDT

Lightning Talk: New Activation Checkpointing APIs in PyTorch - Jeffrey Wan & Horace He, Meta
Thursday September 19, 2024 11:50am - 12:00pm PDT
Activation checkpointing is a commonly used technique to reduce memory usage during model training by reducing the number of activations saved for backward. Instead of keeping tensors needed for backward alive until they are used in gradient computation during backward, those tensors are recomputed during the backward pass. This talk will introduce new activation checkpoint APIs that can help achieve a better trade off between memory savings and compute overhead that recomputing introduces.
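For reference, the existing checkpointing API that this talk builds on looks like the sketch below; the new APIs themselves are not shown, since their surface isn't described in the abstract.

```python
# Baseline activation checkpointing with torch.utils.checkpoint: the block's
# activations are discarded after forward and recomputed during backward,
# trading compute for memory.
import torch
from torch.utils.checkpoint import checkpoint

block = torch.nn.Sequential(
    torch.nn.Linear(1024, 4096), torch.nn.GELU(), torch.nn.Linear(4096, 1024)
)
x = torch.randn(8, 1024, requires_grad=True)

y = checkpoint(block, x, use_reentrant=False)  # saves memory, pays recompute in backward
y.sum().backward()
```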
Speakers
avatar for Horace He

Horace He

Software Engineer, Meta
To be filled
avatar for Jeffrey Wan

Jeffrey Wan

Software Engineer, Meta
Software Engineer working on PyTorch
Thursday September 19, 2024 11:50am - 12:00pm PDT
Festival Pavilion - Breakout Room A

11:50am PDT

Lightning Talk: Understanding and Optimizing PyTorch Models with Thunder - Luca Antiga, Lightning AI
Thursday September 19, 2024 11:50am - 12:00pm PDT
A hallmark feature of PyTorch is the natural expression of computation. This enables practitioners to implement AI models with ease. However, it raises the question of how to optimize the workload for a given hardware setup, because those optimizations clutter our code and are tricky to combine. Lightning Thunder provides a Python-to-Python compiler to scale and optimize PyTorch programs that focuses on usability, understandability, and extensibility. A key tool in delivering on these goals is the composability of transformations: without changing the user code, we can stack quantization, distributing the computation across multiple GPUs, dispatching to optimized kernels, offloading, and other pluggable optimizations. Lightning Thunder flourishes in the PyTorch ecosystem: with PyTorch eager and with executors like torch.compile and nvFuser. It also dispatches to libraries like cuDNN, TransformerEngine, Apex, and OpenAI Triton. The ability to apply multiple optimizations just-in-time leads to significant compounded speed-ups over unoptimized code out of the box. Luca will discuss the design of Thunder and demonstrate applications on training and inference for large language and multimodal models.
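A minimal usage sketch, assuming lightning-thunder is installed and exposes the thunder.jit entry point described in its docs (treat the exact API as an assumption for your installed version):

```python
# Hedged Lightning Thunder sketch: thunder.jit wraps an ordinary nn.Module and
# returns a compiled callable on which further transforms can be stacked.
import torch
import thunder

model = torch.nn.Sequential(torch.nn.Linear(64, 64), torch.nn.GELU(), torch.nn.Linear(64, 8))
jitted = thunder.jit(model)          # Python-to-Python compilation
out = jitted(torch.randn(4, 64))     # runs the optimized trace
```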
Speakers
avatar for Luca Antiga

Luca Antiga

CTO, Lightning AI
CTO @ Lightning AI, Founder (Orobix, Tensorwerk), early PyTorch core contributor, Manning Author (Deep Learning with PyTorch). PhD in Bioengineering.
Thursday September 19, 2024 11:50am - 12:00pm PDT
Gateway Pavilion - Cowell Theater

12:00pm PDT

Lightning Talk: Fast, Scalable Distributed Training with StreamingDataset - Saaketh Narayan, Databricks
Thursday September 19, 2024 12:00pm - 12:10pm PDT
StreamingDataset makes training on large datasets from cloud storage as fast, cheap, and scalable as possible. It’s specially designed for multi-node, distributed training for large models — maximizing correctness guarantees, performance, and ease of use. Key features include elastically deterministic training, instant mid-epoch resumption, effective shuffling, high training throughput, and flexible data mixing, among other features. When training with StreamingDataset, the data shards are written to cloud storage in MDS, our file format that allows for low-latency random access to samples. By being as efficient as possible with shard downloads and shuffling, StreamingDataset minimizes egress costs while ensuring that dataloading never bottlenecks model training. StreamingDataset powers training for LLMs with over 100 billion parameters like DBRX, to advanced diffusion models, to two-tower recommendation models, and more, scaling to training jobs on thousands of GPUs with ease. Join us to learn how StreamingDataset can elevate your distributed model training experience.
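A minimal usage sketch of the library described above, assuming the mosaicml-streaming package; the remote/local paths and batch size are placeholders, not the speaker's configuration.

```python
# Hedged StreamingDataset sketch: stream MDS shards from object storage into a
# standard PyTorch DataLoader. Paths are placeholders.
from streaming import StreamingDataset
from torch.utils.data import DataLoader

dataset = StreamingDataset(
    remote="s3://my-bucket/mds-shards",  # MDS shards written ahead of time
    local="/tmp/streaming-cache",        # local cache for downloaded shards
    shuffle=True,
    batch_size=32,
)
loader = DataLoader(dataset, batch_size=32, num_workers=8)
for batch in loader:
    ...  # training step
```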
Speakers
avatar for Saaketh Narayan

Saaketh Narayan

Machine Learning Engineer, Databricks
Saaketh Narayan is a machine learning engineer at Databricks. As part of the Mosaic AI Runtime team, he works on the GenAI training stack, including dataloading, training frameworks, and performance across the Mosaic Streaming, Composer, and LLM Foundry libraries.
Thursday September 19, 2024 12:00pm - 12:10pm PDT
Gateway Pavilion - Cowell Theater

12:00pm PDT

Lightning Talk: FlexAttention - The Flexibility of PyTorch + The Performance of FlashAttention - Yanbo Liang & Horace He, Meta
Thursday September 19, 2024 12:00pm - 12:10pm PDT
Introducing a novel abstraction leveraging the PyTorch compiler stack to enable custom, user-defined attention mechanisms. This new API supports dynamic modifications to attention scores within SDPA, providing both runtime and memory efficiency through kernel fusion with the FlashAttention algorithm.
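A hedged sketch of the kind of user-defined attention the abstract describes, using the flex_attention API available in recent PyTorch releases (module path and signature per current docs; treat as an assumption for older versions):

```python
# Hedged FlexAttention sketch: a user-defined score_mod adding a simple
# relative-position penalty, compiled through the PyTorch compiler stack.
import torch
from torch.nn.attention.flex_attention import flex_attention

def rel_bias(score, b, h, q_idx, kv_idx):
    # Penalize attention to distant positions.
    return score - 0.1 * (q_idx - kv_idx).abs()

q, k, v = (torch.randn(1, 8, 128, 64, device="cuda") for _ in range(3))
flex = torch.compile(flex_attention)
out = flex(q, k, v, score_mod=rel_bias)
```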
Speakers
avatar for Yanbo Liang

Yanbo Liang

software engineer, Meta
I'm a software engineer on the PyTorch team working on torch.compile and LLMs.
avatar for Horace He

Horace He

Software Engineer, Meta
To be filled
Thursday September 19, 2024 12:00pm - 12:10pm PDT
Festival Pavilion - Breakout Room A

12:00pm PDT

Lightning Talk: Optimized PyTorch Inference on aarch64 Linux CPUs - Sunita Nadampalli, Amazon (AWS)
Thursday September 19, 2024 12:00pm - 12:10pm PDT
Over the last two years we've optimized the performance of PyTorch on Arm processors. The optimizations have included changes to ATen, C10, MKLDNN operators, the GEMM backend, and TorchInductor. In many cases, instead of writing our own kernel we integrated the Arm Compute Library, used fastmath kernels with formats like bf16, implemented operator caching, and selected the optimal backend based on the input context. Through these optimizations we improved performance by over 2x. In this presentation we will first talk about how we went through this process, what those optimizations are, performance numbers for AWS Graviton3 processors for around 75 models, and CI/CD workflow details. Next, we will walk through a sample PyTorch application showing basic usage, how to tune the runtime, and the resulting speedup. At the end of the presentation attendees will learn about PyTorch performance optimizations on Arm processors, how to use them, and the areas where they can collaborate to further improve PyTorch for aarch64 CPUs.
Speakers
avatar for Sunita Nadampalli

Sunita Nadampalli

Software Development Manager, Amazon/AWS
Sunita Nadampalli is a Software Development Manager at AWS. She leads Graviton software performance optimizations for AI/ML and HPC workloads. She is passionate about open source software development and delivering high-performance and sustainable software solutions with Arm SoCs... Read More →
Thursday September 19, 2024 12:00pm - 12:10pm PDT
Festival Pavilion - Breakout Room B
  Lightning Talks
  • Audience Any
  • Slides Attached Yes

12:10pm PDT

Lightning Talk: AOTriton: Ahead of Time Triton Kernel Libraries on ROCm - Jeff Daily, AMD
Thursday September 19, 2024 12:10pm - 12:20pm PDT
Scaled dot product attention provides significant acceleration of the transformer layer through fusion of the multihead attention layer. There are several different algorithms to achieve this, but tiled attention via Flash Attention is a very popular approach. In PyTorch on the ROCm platform this is currently achieved through ahead-of-time compiled (AOT) Triton kernels in a linkable archive. AMD's work to enable and package these kernels is done through AOTriton, which aims to use Triton's compiler and GPU kernels for faster development. AOTriton maintains an optimized set of tiling sizes and other parameters to provide optimized, pre-compiled Triton kernels. The differences between JIT and AOT are few but very important. Despite this, prototyping kernels in Triton is much faster than with template-based C++ libraries. In this presentation we will go into detail on the interaction layer between PyTorch and AOTriton, the structure of AOTriton, and how to add new Triton kernels to AOTriton.
Speakers
avatar for Jeff Daily

Jeff Daily

Principal Member of Technical Staff, Advanced Micro Devices
Jeff Daily is the chief architect of the Machine Learning Software Engineering group supporting ML frameworks such as PyTorch and onnxruntime on AMD GPUs.  He enjoys delivering open source software to answer the challenges of the rapidly-changing ML landscape.  For over five years... Read More →
Thursday September 19, 2024 12:10pm - 12:20pm PDT
Festival Pavilion - Breakout Room B

12:10pm PDT

Lightning Talk: Implementing and Using Iterable Datasets: What Could Go Wrong? - Nicolas Hug, Meta
Thursday September 19, 2024 12:10pm - 12:20pm PDT
PyTorch supports two kinds of datasets: Iterable datasets and indexable "map-style" datasets. Iterable datasets can be more flexible and potentially faster than their indexable cousins. They are also much harder to use correctly, and can easily lead to silently wrong results. This talk is a quick and fun intro to some of the traps that Iterable datasets lay out for you, with some tips to help you avoid them.
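One of the classic traps hinted at above: with num_workers > 0 every worker replays the full iterator, silently duplicating data. A minimal sketch of the usual fix, sharding per worker:

```python
# IterableDataset worker-sharding sketch: without the get_worker_info() logic,
# each of the two workers would yield all n samples and every sample would be
# seen twice per epoch.
import torch
from torch.utils.data import IterableDataset, DataLoader, get_worker_info

class Counting(IterableDataset):
    def __init__(self, n):
        self.n = n
    def __iter__(self):
        info = get_worker_info()
        start = info.id if info else 0
        step = info.num_workers if info else 1
        return iter(range(start, self.n, step))

loader = DataLoader(Counting(10), num_workers=2, batch_size=None)
print(sorted(int(x) for x in loader))  # each sample appears exactly once
```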
Speakers
avatar for Nicolas Hug

Nicolas Hug

Research Engineer, Meta
Nicolas is a software engineer in the PyTorch team at Meta, where he mainly contributes to the torchvision library. Prior to that, Nicolas was a research scientist at Columbia University, where he became part of the scikit-learn core development team. Nicolas holds a PhD in machine... Read More →
Thursday September 19, 2024 12:10pm - 12:20pm PDT
Gateway Pavilion - Cowell Theater
  Lightning Talks

12:10pm PDT

Lightning Talk: Making the Most of Heterogeneous Memory Capacity Using PyTorch - Syed Ahmed, NVIDIA Corporation
Thursday September 19, 2024 12:10pm - 12:20pm PDT
Memory intensive deep learning workloads require efficient use of all kinds of memories that are available in a system. In this session, we will discuss how we can utilize such heterogeneous memory through memory pools in PyTorch. We will show how to mix-and-match different CUDA system allocators in the same PyTorch program using memory pools. Consequently, this API unlocks new use cases such as Extended GPU Memory (EGM) based all-gathers, Unified Virtual Memory (UVM), and NVLink Sharp (NVLS) reductions. New NVIDIA architectures accelerate such use cases with high-bandwidth and low-latency interconnects in the hardware, driven by extended functionality of CUDA system allocators in the software. Learn how to use these techniques on memory-intensive deep learning models like LLMs, and discover new CUDA features powered by PyTorch.
Speakers
avatar for Syed Ahmed

Syed Ahmed

Senior Software Engineer, NVIDIA
Syed Ahmed is a Senior Software Engineer on the PyTorch Core team at NVIDIA, focused on keeping PyTorch fast and numerically stable on current NVIDIA platforms, and making PyTorch more expressive on future NVIDIA platforms. He holds a Master’s degree in Electrical Engineering from... Read More →
Thursday September 19, 2024 12:10pm - 12:20pm PDT
Festival Pavilion - Breakout Room A

12:25pm PDT

Lunch (Provided Onsite for All Attendees)
Thursday September 19, 2024 12:25pm - 1:25pm PDT
Feeling hungry? Explore the food and beverage options available throughout the PyTorch Conference, complete with a map to guide you!
Thursday September 19, 2024 12:25pm - 1:25pm PDT
Gateway Pavilion - Sponsor Showcase

1:25pm PDT

Sponsored Keynote: Accelerating AI: How AMD and PyTorch Drive Innovation with Seamless Day-0 Support and High Performance - Anush Elangovan, CVP Software Development, AMD
Thursday September 19, 2024 1:25pm - 1:30pm PDT
In this keynote presentation, we explore the robust collaboration between AMD and PyTorch that is propelling advancements in artificial intelligence and machine learning. Discover how AMD's commitment to Day-0 PyTorch support ensures that PyTorch users benefit from cutting-edge performance enhancements and out-of-the-box compatibility. We delve into the technical synergies that make AMD hardware an ideal choice for PyTorch frameworks, showcasing real-world examples of accelerated workflows and breakthrough AI applications. Join us to learn how this dynamic partnership is enabling researchers, developers, and data scientists to push the boundaries of innovation and achieve unprecedented results in their AI projects.
Speakers
avatar for Anush Elangovan

Anush Elangovan

Vice President - AI Software, AMD
Thursday September 19, 2024 1:25pm - 1:30pm PDT
Festival Pavilion - Keynote Room

1:32pm PDT

Sponsored Keynote: Optimizing AI Inference for Large Language Models - Mudhakar Srivatsa, Distinguished Engineer, IBM
Thursday September 19, 2024 1:32pm - 1:37pm PDT
This talk will cover two new ways IBM has optimized generative AI inferencing with PyTorch: speculative decoding and Triton kernel development. Speculative decoding leverages predictive modeling to reduce latency by anticipating potential outputs, streamlining the inference process without sacrificing accuracy. IBM Research's team developed new speculative architectures and open-sourced speculators for Llama 3 models. The talk will also discuss various Triton kernels to accelerate inference, one of which was contributed to vLLM for accelerating MoE models. Finally, it will share a glimpse of IBM's AI hardware work, including how the IBM Artificial Intelligence Unit (AIU) could integrate into the PyTorch stack.
Speakers
avatar for Mudhakar Srivatsa

Mudhakar Srivatsa

Distinguished Engineer, IBM Research
Mudhakar Srivatsa is a distinguished research staff member at the Distributed Cloud department in IBM T. J. Watson Research Center. His work is focussed on heterogeneous spatiotemporal data with applications to edge computing, AIOps and Hybrid AI Scaling. He is an IBM master inv... Read More →
Thursday September 19, 2024 1:32pm - 1:37pm PDT
Festival Pavilion - Keynote Room

1:40pm PDT

Keynote Panel Discussion: Scaling & Benchmarking - Anastasios Nikolas Angelopoulos, UC Berkeley/LMSYS; Lisa Dunlap, UC Berkeley; James Bradbury, Anthropic; Tri Dao, together.ai; Aparna Ramani & Soumith Chintala, Meta
Thursday September 19, 2024 1:40pm - 2:10pm PDT
Moderators
avatar for Soumith Chintala

Soumith Chintala

VP/Fellow of Meta & Co-Creator of PyTorch
I am an Artificial Intelligence researcher, engineer and community builder. I am currently at Meta, jumping between Engineering, Research and Leadership as I find convenient. I also visit NYU as a part-time researcher. My career interests have been defined by two sets of work: AI Platforms/Ecosystems... Read More →
Speakers
avatar for Anastasios Nikolas Angelopoulos

Anastasios Nikolas Angelopoulos

Researcher, UC Berkeley/LMSYS
avatar for James Bradbury

James Bradbury

Software Engineer, Anthropic
James is Head of Compute at Anthropic, where he is focused on ensuring that the company has the accelerator resources it needs to pursue its mission, and that the resources can be used effectively and efficiently across the organization. He joined in 2023 from Google DeepMind, where... Read More →
avatar for Lisa Dunlap

Lisa Dunlap

Student, UC Berkeley
PhD student at UC Berkeley working on (1) interpreting and evaluating generative models and (2) automating data science on unstructured data using large multimodal models. Also an underwhelming nail enthusiast and reader of old psychiatry books.
avatar for Tri Dao

Tri Dao

Assistant Professor at Princeton University, Chief Scientist of Together AI
Tri Dao is an Assistant Professor at Princeton University and chief scientist of Together AI. He completed his PhD in Computer Science at Stanford, co-advised by Christopher Ré and Stefano Ermon. He works at the intersection of machine learning and systems, and his research highlights... Read More →
avatar for Aparna Ramani

Aparna Ramani

VP Engineering, Meta
Aparna is VP Engineering at Meta, responsible for AI Infrastructure, Data Infrastructure and Developer Infrastructure. Over the last eight years at Meta, Aparna has built a world-class team that is responsible for some of the largest scale systems on the planet - to process exabyte-scale... Read More →
Thursday September 19, 2024 1:40pm - 2:10pm PDT
Festival Pavilion - Keynote Room

2:15pm PDT

Building PyTorch Computer Vision Algorithms for 100 Skin Shades - Emmanuel Acheampong, roboMUA
Thursday September 19, 2024 2:15pm - 2:40pm PDT
At roboMUA we're leading the charge in building predictive AI models for diverse skin shades with the use of Convolutional Neural Networks (CNNs), and harnessing the power of Generative Adversarial Networks (GANs) specifically for generating realistic images of black hairstyles. Our session showcases PyTorch's versatility in both predictive and generative tasks, offering a comprehensive approach to inclusive AI. For predictive AI models, we leverage PyTorch's flexible framework to develop CNNs. Through innovative techniques in feature engineering and model architecture design, we demonstrate how PyTorch enables accurate prediction across 100 skin shades. Simultaneously, we showcase the transformative potential of GANs in the realm of black hairstyles. By training GANs on a curated dataset of diverse hair textures and styles, we illustrate how PyTorch facilitates the generation of lifelike images that celebrate the beauty and diversity of black hair. Attendees will gain insights into the data preprocessing, model training, and evaluation processes, and learn how PyTorch empowers developers to build inclusive solutions.
Speakers
avatar for Emmanuel Acheampong

Emmanuel Acheampong

CEO / Head of AI, yShade.ai (formerly roboMUA)
Emmanuel Acheampong is a co-founder and CEO of roboMUA - an innovative AI solutions company with a visionary focus on catering to all skin shades and types. He graduated from Notre Dame’s ESTEEM program with a Masters thesis on the intersection of Artificial Intelligence and directed... Read More →
Thursday September 19, 2024 2:15pm - 2:40pm PDT
Gateway Pavilion - Cowell Theater

2:15pm PDT

Data-Dependent Shapes in PT2 - Edward Yang, Meta
Thursday September 19, 2024 2:15pm - 2:40pm PDT
Data-dependent shapes are ubiquitous whenever you want to take advantage of sparsity in your data representation, whether it is in recommendation systems, mixture of experts or other use cases. We have made a lot of improvements to torch.compile's support for capturing and compiling data dependent shapes, but they also require some user knowledge to work with effectively. This talk will give an overview of PT2's facilities for data dependent compute and how to use them effectively.
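A minimal example of the data-dependent shapes discussed above: boolean masking produces an output whose size is only known at runtime. The config flag below is how recent releases opt into capturing such ops (flag name per current sources; verify on your version), and torch._check is the related mechanism for asserting runtime facts to the shape reasoning.

```python
# Data-dependent output shapes under torch.compile: masking yields a tensor
# whose length depends on the data. The flag opts into capturing such ops
# rather than falling back to eager at that point.
import torch
import torch._dynamo

torch._dynamo.config.capture_dynamic_output_shape_ops = True

@torch.compile
def keep_positive(x):
    return x[x > 0] * 2   # output length depends on the data

print(keep_positive(torch.tensor([-1.0, 2.0, 3.0, -4.0])))
```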
Speakers
avatar for Edward Z. Yang

Edward Z. Yang

Research Engineer, Meta
Edward Yang has worked on PyTorch at Meta since nearly the very beginning. Currently, he works on all aspects of PT2, but with a particular focus on dynamic shapes support across the stack.
Thursday September 19, 2024 2:15pm - 2:40pm PDT
Festival Pavilion - Breakout Room A

2:15pm PDT

vLLM: Easy, Fast, and Cheap LLM Serving for Everyone - Woosuk Kwon & Xiaoxuan Liu, UC Berkeley
Thursday September 19, 2024 2:15pm - 2:40pm PDT
We will present vLLM, an open-source high-performance LLM inference engine built on top of PyTorch. Starting as a research project at UC Berkeley, vLLM has been one of the fastest and most popular LLM inference solutions in industry, reaching 20K+ stars and 350+ contributors. In this talk, we will cover how vLLM adopts various LLM inference optimizations and how it supports various AI accelerators such as AMD GPUs, Google TPUs, and AWS Inferentia. Also, we will discuss how vLLM benefits from PyTorch 2 and its ecosystem.
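A minimal offline-inference sketch with the engine described above (the model name is a placeholder; the calls follow vLLM's documented entry points):

```python
# Hedged vLLM usage sketch: offline batch inference with a placeholder model.
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Meta-Llama-3-8B-Instruct")
params = SamplingParams(temperature=0.8, max_tokens=128)
outputs = llm.generate(["Explain PagedAttention in one paragraph."], params)
print(outputs[0].outputs[0].text)
```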
Speakers
avatar for Lily Liu

Lily Liu

Student, UCB
Lily (Xiaoxuan) Liu is a PhD student at UC Berkeley, working with Professors Ion Stoica and Alvin Cheung. Her research focuses on machine learning systems, particularly optimizing latency for LLM inference and addressing memory bottlenecks in LLM systems. Her recent work explores... Read More →
avatar for Woosuk Kwon

Woosuk Kwon

PhD Student, UC Berkeley
Woosuk Kwon is a Ph.D. student at UC Berkeley, advised by Prof. Ion Stoica. He is interested in building practical, flexible, and high-performance software systems for emerging applications such as large language models. Recently, he has been developing vLLM, a high-performance open-source... Read More →
Thursday September 19, 2024 2:15pm - 2:40pm PDT
Festival Pavilion - Breakout Room B

2:45pm PDT

Lightning Talk: What's New for PyTorch Developer Infrastructure - Sahan Paliskara & Catherine Lee, Meta
Thursday September 19, 2024 2:45pm - 2:55pm PDT
A chat about all of the work being done to continue supporting PyTorch's developer infrastructure needs, including updates around Target Determination, releases, and OSS tooling.
Speakers
avatar for Catherine Lee

Catherine Lee

Software Engineer, META
Software engineer on the PyTorch Dev Infra team primarily working on reducing time to signal, testing infrastructure, and CI related developer tooling.
avatar for Sahan Paliskara

Sahan Paliskara

Software Engineer, Meta
After spending a lot of time using PyTorch to train computer vision models, Sahan joined the PyTorch team three years ago. He started off working on inference and packaging, and now he's part of the dev infra team. These days, he's involved in everything from managing releases to... Read More →
Thursday September 19, 2024 2:45pm - 2:55pm PDT
Festival Pavilion - Breakout Room A

2:45pm PDT

Blobs to Clips: Efficient End-to-End Video Data Loading - Andrew Ho & Ahmad Sharif, Meta
Thursday September 19, 2024 2:45pm - 3:10pm PDT
The PyTorch team has improved training speed by an order of magnitude for teams at Meta working on Small-to-Large-Scale MultiModal Video models. In this talk we’ll share our learnings on reducing GPU starvation by overcoming data loading challenges such as dealing with large distributed datasets, worker imbalance, compute-bottlenecks due to parallel video decoding and sampling, checkpointing, and debuggability. As part of our commitment to open-source, we are releasing a new decoding library and updating existing PyTorch libraries on GitHub, and invite feedback and contributions from the community.
Speakers
avatar for Ahmad Sharif

Ahmad Sharif

Software Engineer, Meta
SWE in PyTorch Content Domains. Previously: SWE at Google in Search, Privacy, and ChromeOS.
avatar for Andrew Ho

Andrew Ho

Machine Learning Engineer, Meta Platforms
We are ML Engineers at Meta on PyTorch working on multi-modal LLM dataloading
Thursday September 19, 2024 2:45pm - 3:10pm PDT
Gateway Pavilion - Cowell Theater

2:45pm PDT

Torchtitan: Large-Scale LLM Training Using Native PyTorch 3D Parallelism - Wanchao Liang, Meta & Linsong Chu, IBM Research
Thursday September 19, 2024 2:45pm - 3:10pm PDT
torchtitan is a proof of concept for large-scale LLM training using native PyTorch. It is a repo that showcases PyTorch's latest distributed training features in a clean, minimal codebase. We showcase end-to-end enablement of large-scale training features: 1. 3D/4D parallelism; 2. efficient distributed checkpoint save/load/resharding; 3. many efficient training techniques, including Float8, torch.compile, activation checkpointing, etc.
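In the spirit of the parallelism described above (a hedged sketch, not torchtitan's actual code; mesh sizes and the toy model are placeholders), DTensor-based tensor parallelism can be combined with a data-parallel dimension on a single 2D DeviceMesh:

```python
# Hedged 2D-parallelism sketch: tensor parallelism on one mesh dimension,
# leaving the other dimension for FSDP/data parallelism. Assumes torchrun + NCCL.
import torch
import torch.distributed as dist
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor.parallel import parallelize_module, ColwiseParallel, RowwiseParallel

dist.init_process_group("nccl")
mesh = init_device_mesh("cuda", (2, 4), mesh_dim_names=("dp", "tp"))

model = torch.nn.Sequential(
    torch.nn.Linear(1024, 4096), torch.nn.GELU(), torch.nn.Linear(4096, 1024)
).cuda()

# Shard the first linear column-wise and the second row-wise across the tp dim.
parallelize_module(model, mesh["tp"], {"0": ColwiseParallel(), "2": RowwiseParallel()})
# The dp sub-mesh (mesh["dp"]) would then be used for FSDP / data parallelism.
```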
Speakers
avatar for Wanchao Liang

Wanchao Liang

Software Engineer, Meta Platforms, Inc.
Software Engineer at Meta, PyTorch team Tech Lead in PyTorch Distributed training. Author of torchtitan, Tensor Parallel and DTensor, a fundamental distributed abstraction to perform distributed computation. Previously worked on the TorchScript compiler, ONNX.
avatar for LINSONG CHU

LINSONG CHU

Senior Technical Staff Member, IBM Research
Linsong is a STSM at IBM Research, focusing on FSDP, torch compile and FP8 in the area of pre-training.
Thursday September 19, 2024 2:45pm - 3:10pm PDT
Festival Pavilion - Breakout Room B

3:00pm PDT

Lightning Talk: PyTorch Release Process - Andrey Talman, Meta
Thursday September 19, 2024 3:00pm - 3:10pm PDT
I would like to present and quickly discuss the PyTorch release process: how it happens, what the milestones are, what our cherry-picking criteria are, and how we validate the release.
Speakers
avatar for Andrey Talman

Andrey Talman

Software Engineer, Meta Inc.
Software Engineer - Meta Inc. 2021-Present Part of PyTorch Dev Infra team. Working on PyTorch OSS Releases. Lead Software Engineer - Dow Jones & Company 2019-2021 Part of the team developing software and the API Services used by Dow Jones Factiva website and WSJ. Software Engineer... Read More →
Thursday September 19, 2024 3:00pm - 3:10pm PDT
Festival Pavilion - Breakout Room A

3:15pm PDT

Slaying OOMs - Mark Saroufim & Jane Xu, Meta
Thursday September 19, 2024 3:15pm - 3:40pm PDT
Have you ever hit an OOM (and wished you had more VRAM)? Who hasn't! Hop on the bus with us and feel the road become smoother as we talk about stacking together techniques like FSDP2 + QLoRA + CPU offloading + fused Adam (thanks Intel) + more, all in native PyTorch. We will give an overview of these techniques as well as the hard edges we solved in their composition. Curious for more? Or...still OOMing? We also plan on discussing our more researchy work on offloading, pagedness, and low-precision optimizers.
Speakers
avatar for Jane Xu

Jane Xu

SWE, Meta
I'm Jane and I work on the PyTorch core library! Tell me your favorite optimizer, complain to me about your latest OOM, teach me about what you’re excited about.
avatar for Mark Saroufim

Mark Saroufim

Software Engineer, Meta
Mark Saroufim is a PyTorch Engineer at Meta working on inference, compilers and community.
Thursday September 19, 2024 3:15pm - 3:40pm PDT
Festival Pavilion - Breakout Room B

3:15pm PDT

Sponsored Session: PyTorch Support by Google Enabling Performance from Cloud to Edge - Mark Sherwood & Shauheen Zahirazami, Google
Thursday September 19, 2024 3:15pm - 3:40pm PDT
In this session we will cover various ways teams at Google are working to help the PyTorch community achieve performance and scale from cloud to edge. We will cover how Google Cloud customers can use PyTorch and OpenXLA to get competitive performance for their ML workloads. We'll also cover how Google AI Edge Torch works with PyTorch to help developers integrate LLMs, vision models, and more to easily create new edge applications that can run on a wide set of devices.
Speakers
avatar for Mark Sherwood

Mark Sherwood

Senior Product Manager, Google AI Edge, Google
Mark is a Senior Product Manager on the Google AI Edge team, responsible for LiteRT (formerly known as TensorFlow Lite) and MediaPipe. He specializes in shipping ML powered features on Android, iOS, and Web using the very smallest to the very largest on-device models.
avatar for Shauheen Zahirazami

Shauheen Zahirazami

Senior Staff Engineering Manager, Cloud Machine Learning Compute Services, Google
Shauheen has a PhD in control engineering with a BSc in applied mathematics. He is currently leading Cloud TPU Machine Learning teams at Google who are responsible for ML Frameworks and 3P ecosystem including the PyTorch teams that develop PyTorch/XLA.
Thursday September 19, 2024 3:15pm - 3:40pm PDT
Gateway Pavilion - Cowell Theater

3:15pm PDT

Torch.Compile for Autograd, DDP and FSDP - Will Feng , Chien-Chin Huang & Simon Fan, Meta
Thursday September 19, 2024 3:15pm - 3:40pm PDT
In this talk, we will present the latest advancements in torch.compile for distributed training via DDP and FSDP. We will first introduce Compiled Autograd, a torch.compile mode to fully capture the backpropagation step, including the communication collective operators used in distributed. We will then cover the improvements this new approach brought to Compiled DDP/FSDP, notably by removing DDP/FSDP graph breaks which brings the potential of improving compute/communication overlap.
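For orientation, the baseline setup the talk's Compiled Autograd / Compiled DDP work targets looks like the sketch below: torch.compile applied to a DDP-wrapped model. Recent releases also expose a compiled-autograd toggle (e.g. torch._dynamo.config.compiled_autograd); treat the exact flag as version-dependent.

```python
# Minimal torch.compile + DDP sketch. Assumes launch via torchrun with NCCL.
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

dist.init_process_group("nccl")
local_rank = dist.get_rank() % torch.cuda.device_count()
torch.cuda.set_device(local_rank)

model = torch.nn.Linear(512, 512).cuda()
model = DDP(model, device_ids=[local_rank])
model = torch.compile(model)          # forward is captured by the compiler

out = model(torch.randn(8, 512, device="cuda"))
out.sum().backward()                  # gradient buckets trigger all-reduce
```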
Speakers
CH

Chien-Chin Huang

Software Engineer, Meta
Software Engineer, PyTorch Distributed, Meta
avatar for Simon Fan

Simon Fan

Software Engineer, Meta
I'm a software engineer on the PyTorch Compiler team, I focus on torch.compile for distributed training frameworks.
avatar for Will Feng

Will Feng

Software Engineer, Meta Platforms, Inc.
Will Feng is a Software Engineer in PyTorch Compiler team at Meta. He has been working in PyTorch core and ecosystem for the past 7 years. He is now working on and most excited about torch.compile for distributed training performance.
Thursday September 19, 2024 3:15pm - 3:40pm PDT
Festival Pavilion - Breakout Room A

3:40pm PDT

Coffee Break
Thursday September 19, 2024 3:40pm - 4:05pm PDT
Feeling hungry? Explore the food and beverage options available throughout the PyTorch Conference, complete with a map to guide you!
Thursday September 19, 2024 3:40pm - 4:05pm PDT
Gateway Pavilion - Sponsor Showcase

3:45pm PDT

Sponsor Scavenger Hunt Raffle Drawing
Thursday September 19, 2024 3:45pm - 4:00pm PDT
Grab your scavenger hunt card at registration, visit all our awesome sponsors, and you'll be in the running to win some fantastic prizes!
Thursday September 19, 2024 3:45pm - 4:00pm PDT
Gateway Pavilion - Sponsor Showcase

4:05pm PDT

Lightning Talk: Debiasing the Data Lifecycle - Shailvi Wakhlu, Shailvi Ventures LLC
Thursday September 19, 2024 4:05pm - 4:15pm PDT
Biased data results in biased decision-making. Making sure that at every step of the data lifecycle we make conscious attempts to debias the data is an important responsibility for all data scientists. In this talk, I highlight the typical data lifecycle and how to prevent biases at every step. The key takeaways from my talk include: 1) understanding the data lifecycle; 2) the typical ways biases creep in; 3) how we can proactively prevent and fix biases in data.
Speakers
avatar for Shailvi Wakhlu

Shailvi Wakhlu

Founder, Shailvi Ventures LLC
Shailvi is a seasoned Data Leader and Self-Advocacy Expert with over sixteen years of experience building technology products. She has spoken at nearly 100 global conferences and Fortune 500 events, coached close to 500 individuals, and authored the best-selling book "Self-Advocacy... Read More →
Thursday September 19, 2024 4:05pm - 4:15pm PDT
Festival Pavilion - Breakout Room A

4:05pm PDT

Understanding the LLM Inference Workload - Mark Moyou, NVIDIA
Thursday September 19, 2024 4:05pm - 4:30pm PDT
Understanding how to effectively size a production-grade LLM deployment requires understanding of the model(s), the compute hardware, quantization and parallelization methods, KV Cache budgets, input and output token length predictions, model adapter management and much more.
- Why LLM inference is different to standard deep learning inference
- Current and future NVIDIA GPU overview - which GPU(s) for which models and why
- Understanding the importance of building inference engines
- Deep recap on the attention mechanism along with different types of popular attention mechanisms used in production
- Deep dive on KV Cache and managing KV Cache budgets
- Parallelism (reducing latency) - mainly tensor parallelism, but data, sequence, pipeline, and expert parallelism will be highlighted
- Quantization methods on weights, activations, and KV Cache to reduce engine sizes for more effective GPU utilization
- Increasing throughput with inflight batching and other techniques
- Detailed performance analysis of LLM deployments looking at time to first token, inter-token latencies, LLM deployment characterizations, and more that can help reduce deployment costs
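As a back-of-the-envelope illustration of the KV Cache budgeting theme above (the numbers assume a Llama-3-8B-like configuration in fp16 and are illustrative, not the speaker's figures):

```python
# KV cache size = 2 (K and V) * layers * kv_heads * head_dim * seq_len * batch * bytes/elem
layers, kv_heads, head_dim = 32, 8, 128        # assumed Llama-3-8B-like config
seq_len, batch, bytes_per_elem = 8192, 1, 2    # fp16/bf16
kv_bytes = 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_elem
print(kv_bytes / 2**30, "GiB per 8k-token sequence")  # ~1.0 GiB
```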
Speakers
avatar for Mark Moyou

Mark Moyou

Sr. Data Scientist, NVIDIA
Dr. Mark Moyou is a Senior Data Scientist at NVIDIA, working with enterprise clients on AI strategy and deploying machine learning applications to production. He is the host of the Caribbean Tech Pioneers Podcast and The AI Portfolio Podcast, and is the Director of the Optimized AI Confere... Read More →
Thursday September 19, 2024 4:05pm - 4:30pm PDT
Festival Pavilion - Breakout Room B

4:05pm PDT

Startup Showcase
Thursday September 19, 2024 4:05pm - 5:30pm PDT
The PyTorch Conference Startup Showcase is giving emerging companies the chance to pitch to a panel of VCs looking to support AI/ML startups with high growth potential, and to meet some of the best AI-focused engineers in the industry. This is an exciting and unique opportunity for early-stage founders to showcase their ideas and breakthroughs, connect with leading VCs, and increase visibility in the generative AI and machine learning industry.

The winning startup will be announced at the Flare Party taking place after the Startup Showcase.

Finalists
Moderators
avatar for Chappy Asel

Chappy Asel

Co-founder, GenAI Collective
Successful entrepreneur with an expansive technical and operational background built across 10+ years of experience. Co-founder of the GenAI Collective: a community of founders, funders, and thought leaders built around our shared curiosity for generative AI. Ex-Apple AR/VR. Ex-Apple... Read More →
Judges
avatar for Astasia Myers

Astasia Myers

General Partner, Felicis
Astasia Myers is a General Partner at Felicis. Before joining Felicis, she was an enterprise partner at Quiet Capital and an investor at Redpoint Ventures. Astasia focuses on early-stage investing across AI, data, open source, developer tools, and security. She has invested in LaunchDarkly... Read More →
avatar for Kevin Crosby

Kevin Crosby

Sr. Director, Open Source Funding, GitHub
Kevin Crosby is Senior Director leading Open Source Funding at Microsoft's M12 GitHub fund. Prior to GitHub, Kevin led business development for VC and Accelerators at Carta and spent 8 years at Amazon in corporate venture, leading product, engineering, and business teams. He is... Read More →
avatar for Rajko Radovanovic

Rajko Radovanovic

Investor, Andreessen Horowitz
Rajko Radovanovic is an investing partner on the infrastructure team at Andreessen Horowitz.
avatar for Simon Tiu

Simon Tiu

VC Investor, Vertex Ventures
Simon Tiu joined Vertex Ventures US in 2024, focusing on enterprise software and cybersecurity investments. Prior to joining Vertex Ventures, Simon worked at Qatalyst Partners, where he was a core member of the Enterprise Software team. During his tenure, he provided strategic and... Read More →
avatar for Vig Sachidananda

Vig Sachidananda

Investor, Gradient Ventures
Vig is an investor at Gradient Ventures. Vig received his M.S. and Ph.D. in Electrical Engineering from Stanford University and his B.S. in Mechanical Engineering from the University of Maryland, College Park. During his Ph.D., he worked as a seed stage software engineer at Clockwork.io... Read More →
avatar for Vijay Reddy

Vijay Reddy

Partner, Mayfield Fund
Vijay Reddy brings over a decade of inception and early-stage investing experience in AI and Enterprise infrastructure. He had a front-row seat to the rise of AI and has invested across the AI stack from silicon, infrastructure, data, middleware and AI-first applications. Vijay is... Read More →
Thursday September 19, 2024 4:05pm - 5:30pm PDT
Gateway Pavilion - Cowell Theater

4:20pm PDT

CANCELED: Lightning Talk: PyTorch-Wildlife: A Collaborative Deep Learning Framework for Conservation - Zhongqi Miao, Microsoft
Thursday September 19, 2024 4:20pm - 4:30pm PDT
The alarming decline in global biodiversity, driven by various factors, underscores the urgent need for large-scale wildlife monitoring. To address these challenges, we introduce PyTorch-Wildlife, an open-source deep learning platform built on PyTorch. It is designed for creating, modifying, and sharing powerful AI models, and it emphasizes usability and accessibility, making it approachable for individuals with limited or no technical background. It also offers a modular codebase to simplify feature expansion and further development. PyTorch-Wildlife provides an intuitive, user-friendly interface, accessible through local installation or Hugging Face, for animal detection and classification in images and videos. As two real-world applications, PyTorch-Wildlife has been used to train animal classification models for species recognition in the Amazon Rainforest and for invasive opossum recognition in the Galapagos Islands. The opossum model achieves 98% accuracy, and the Amazon model has 92% recognition accuracy for 36 animals in 90% of the data. As PyTorch-Wildlife evolves, we aim to integrate more conservation tasks, addressing various environmental challenges.
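PyTorch-Wildlife's own API is not reproduced here. Purely as an illustration of the kind of PyTorch image-classification inference such a platform wraps, the sketch below runs a generic pretrained torchvision classifier on a single image; the model, the ImageNet labels, and the file name are stand-ins, not the Amazon or Galapagos models mentioned in the abstract.

```python
import torch
from torchvision.models import resnet50, ResNet50_Weights
from PIL import Image

# Generic ImageNet classifier standing in for a wildlife-specific model.
weights = ResNet50_Weights.DEFAULT
model = resnet50(weights=weights).eval()
preprocess = weights.transforms()

img = Image.open("camera_trap_frame.jpg")   # hypothetical camera-trap image
batch = preprocess(img).unsqueeze(0)        # preprocess and add a batch dimension

with torch.no_grad():
    probs = model(batch).softmax(dim=1)

top_prob, top_class = probs.max(dim=1)
print(weights.meta["categories"][top_class.item()], float(top_prob))
```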
Speakers
avatar for Zhongqi Miao

Zhongqi Miao

Research Scientist, Microsoft
My research focuses on AI (especially modern computer vision) applications in environmental science and ecology. I am currently in the AI for Good Lab, working on large-scale wildlife recognition through ground-based cameras (i.e., camera traps), bioacoustics, and overhead imagery... Read More →
Thursday September 19, 2024 4:20pm - 4:30pm PDT
Festival Pavilion - Breakout Room A

4:35pm PDT

Intel GPU in Upstream PyTorch: Expanding GPU Choices and Enhancing Backend Flexibility - Eikan Wang & Min Jean Cho, Intel
Thursday September 19, 2024 4:35pm - 5:00pm PDT
The integration of Intel GPU support into upstream PyTorch marks a pivotal enhancement to the PyTorch device and runtime layers. We generalized the device and runtime abstractions to accommodate streaming devices; this generalization not only facilitates deploying PyTorch on ubiquitous hardware but also makes it easier to integrate other hardware backends. In addition, PyTorch with Intel GPU supports a range of Intel GPUs, from the data center to the client, enriching and democratizing the PyTorch hardware ecosystem. Particularly in AI PC scenarios, where Intel's integrated and discrete GPUs are prevalent, PyTorch with Intel GPU can deliver promising performance and an improved out-of-box experience, significantly extending PyTorch's applicability.
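For context, upstream PyTorch exposes Intel GPUs through the "xpu" device type (available since PyTorch 2.4). The snippet below is a minimal sketch, not code from the talk, showing how a model might be moved to an Intel GPU and compiled; it assumes a PyTorch build with XPU support and falls back to CPU otherwise.

```python
import torch
import torch.nn as nn

# Select the Intel GPU ("xpu") device if this PyTorch build exposes it,
# otherwise fall back to CPU.
device = "xpu" if hasattr(torch, "xpu") and torch.xpu.is_available() else "cpu"

model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10)).to(device)
# torch.compile routes the captured graph through Inductor for the selected device.
model = torch.compile(model)

x = torch.randn(32, 128, device=device)
with torch.no_grad():
    out = model(x)
print(out.shape, out.device)
```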
Speakers
avatar for Eikan Wang

Eikan Wang

AI Frameworks Engineer, Intel
Eikan is a staff engineer at Intel and a DL framework tech lead with full-stack experience in DL, from various AI applications to frameworks, libraries, and DL compilers. He is actively optimizing the torch.compile stack for Intel platforms, including optimizing Inductor C++/OpenMP... Read More →
MJ

Min Jean Cho

Deep Learning Software Engineer, Intel Corporation
Thursday September 19, 2024 4:35pm - 5:00pm PDT
Festival Pavilion - Breakout Room B

4:35pm PDT

Unlocking the Enigma: Crafting Unbiased, Transparent, and Explainable Large Language Models - Rashmi Nagpal, Patchstack
Thursday September 19, 2024 4:35pm - 5:00pm PDT
In an era where artificial intelligence reigns supreme, the statistics are both perplexing and thought-provoking: a mere 13% of large language models manage to transcend the realms of research and enter the practical world of production. Who bears the responsibility when these models err, spewing out biased or discriminatory outputs? It's time to demystify the complex landscape of machine learning ethics and carve a path towards a brighter, more accountable future! In this talk, we will first navigate the profound impacts of large language models across diverse domains, from lifesaving advances in medicine to safeguarding our nations through enhanced security protocols. Secondly, as we marvel at the data-driven decisions these models make, we will confront the darker shadows they cast: the looming spectre of bias in the data. Finally, we will delve deep into the art of building interpretable models and navigating the maze of ethical considerations. Through a live demonstration in PyTorch, we will see how to craft unbiased, transparent, and explainable models.
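The talk's live demo is not reproduced here. As one rough illustration of a PyTorch-based explainability workflow of the kind the abstract refers to, the sketch below uses Captum's IntegratedGradients to attribute a toy classifier's prediction back to its input features; the model and data are made up for the example, and Captum is simply one possible tool, not necessarily the speaker's.

```python
import torch
import torch.nn as nn
from captum.attr import IntegratedGradients

# A toy classifier standing in for a real model under audit.
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 3))
model.eval()

inputs = torch.randn(8, 20, requires_grad=True)   # a batch of feature vectors
baseline = torch.zeros_like(inputs)               # reference point for attribution

ig = IntegratedGradients(model)
# Attribute the score of class 1 back to each input feature.
attributions, delta = ig.attribute(
    inputs, baselines=baseline, target=1, return_convergence_delta=True
)
print(attributions.shape)  # per-feature contributions, same shape as inputs
print(delta)               # convergence error; should be close to zero
```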
Speakers
avatar for Rashmi Nagpal

Rashmi Nagpal

Machine Learning Engineer, Patchstack
Rashmi, a passionate researcher at MIT CSAIL and a machine learning engineer at Patchstack, is dedicated to crafting beautiful AI applications. With nearly 5 years of industry experience, she has brought ideas to life at pre-seed startups and contributed to impactful redesigns... Read More →
Thursday September 19, 2024 4:35pm - 5:00pm PDT
Festival Pavilion - Breakout Room A
  Breakout Sessions

5:05pm PDT

Implementing a Custom Torch.Compile Backend - A Case Study - Maanav Dalal & Yulong Wang, Microsoft
Thursday September 19, 2024 5:05pm - 5:30pm PDT
This presentation will dive into the development of the ONNX Runtime (ORT) backend for torch.compile. We'll cover the implementation process, starting with a PyTorch 2.0-generated FX graph, and highlight the unique challenges encountered when serving ORT-specific scenarios and how we solved them. Attendees will gain insights into optimizing performance, overcoming integration hurdles, and achieving efficient execution. Whether you're a developer looking to extend PyTorch's capabilities for your own use cases, keen to learn about ONNX Runtime, or interested in backend performance optimization and the many steps we've taken to get to where we are now, this session promises valuable takeaways and practical knowledge.
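As background, PyTorch already ships a registered "onnxrt" backend string for torch.compile, which hands the TorchDynamo-captured FX graph to ONNX Runtime (it requires the onnxruntime package to be installed). The snippet below is a minimal sketch of invoking it, not code from the talk.

```python
import torch
import torch.nn as nn

class TinyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))

    def forward(self, x):
        return self.net(x)

model = TinyModel().eval()

# torch.compile captures an FX graph via TorchDynamo and hands it to the
# registered backend; "onnxrt" lowers that graph to ONNX Runtime
# (onnxruntime must be installed for this backend to be available).
compiled = torch.compile(model, backend="onnxrt")

x = torch.randn(2, 16)
with torch.no_grad():
    print(compiled(x).shape)
```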
Speakers
YW

Yulong Wang

Software Engineer, Microsoft
avatar for Maanav Dalal

Maanav Dalal

Program Manager, Microsoft
PM @Microsoft, working on the ONNX Exporter team. I adore learning about consumer tech and experimenting with bleeding-edge software. I'm passionate about creating delightful user experiences.
Thursday September 19, 2024 5:05pm - 5:30pm PDT
Festival Pavilion - Breakout Room B

5:05pm PDT

The Ethical Implications of AI and the Environment: A Focus on Water - Amber Hasan, Ethical Tech AI & Senegal Tuklor Williams, Broken Pencil Pictures llc
Thursday September 19, 2024 5:05pm - 5:30pm PDT
Artificial Intelligence (AI) has the potential to revolutionize various sectors, including environmental conservation and water management. However, the deployment of AI technologies raises ethical questions about environmental impact, particularly on water resources. This presentation will discuss the ethical implications of AI concerning water, while also exploring how AI can both positively and negatively affect water resources and the broader ecosystem. My goal is to facilitate a critical conversation around how to balance technological advancement with environmental stewardship. Objectives: (1) Understand the ethical implications: provide an in-depth overview of how AI impacts water resources, focusing on ethical concerns related to AI's water footprint, including but not limited to energy consumption and water usage in data centers. (2) Explore positive applications: discuss successful implementations of AI in water conservation, pollution monitoring, and efficient resource management, as well as potential future applications where AI could contribute to sustainable water management and connect stakeholders to address ethical concerns and solutions.
Speakers
avatar for Amber Hasan

Amber Hasan

Owner, Ethical Tech AI
Amber Hasan is an interdisciplinary artist and community organizer focused on using Creative Practice as a tool for change. Amber is Co-Founder of The Sister Tour collective; she has worked with photographer LaToya Ruby Frazier on the Flint Water Crisis, and she is a Board Member... Read More →
avatar for Senegal Tuklor Williams

Senegal Tuklor Williams

C.O.O., Ethical Tech AI
From the standpoint of Broken Pencil Pictures, we are a dynamic and multi-disciplinary creative company. Our achievements are a testament to our dedication to social change and the betterment of our community. "The Sister Tour" stands out as an initiative through which we distributed... Read More →
Thursday September 19, 2024 5:05pm - 5:30pm PDT
Festival Pavilion - Breakout Room A

5:30pm PDT

The PyTorch Flare Party Sponsored by Hugging Face
Thursday September 19, 2024 5:30pm - 8:00pm PDT
Join us as we ignite the evening with something blazing hot.

THE NIGHT HEATS UP AT 7:45 PM! - You won't want to miss the sizzling finale of the PyTorch Flare Party!

It'll be LIT in more ways than one!

Feeling hungry? Explore the food and beverage options available throughout the PyTorch Conference, complete with a map to guide you!

Thursday September 19, 2024 5:30pm - 8:00pm PDT
Gateway Pavilion - Sponsor Showcase
 