September 18-19, 2024
San Francisco, California
Note: The schedule is subject to change.

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for PyTorch Conference 2024 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

This schedule is displayed in Pacific Daylight Time (UTC-7).

IMPORTANT NOTE: Timing of sessions and room locations are subject to change.

Breakout Sessions
Thursday, September 19
 

10:50am PDT

Sponsored Session: Democratizing AI: Powering the Future with Arm’s Global Compute Ecosystem - Gian Marco Iodice, Arm
Thursday September 19, 2024 10:50am - 11:15am PDT
Arm is excited to be at the center of the world's largest compute ecosystem at the dawn of the AI era. A key tenet of our mission is to democratize AI capabilities, empowering millions of developers to put advanced AI features into the hands of billions of users.

In this presentation, we'll explore how Arm is enabling the world’s leading open-source AI frameworks to leverage power-efficient Arm-based computing platforms and Arm architecture features as a tool for enabling fast and secure AI workloads. The session focuses on how our strategic partnership with the PyTorch and ExecuTorch community is enabling a seamless and transparent developer experience for running workloads everywhere from cloud to edge. This session will highlight some of our optimized libraries, upstreamed contributions and a wealth of AI-related developer material to build the future of AI on Arm.
Speakers

Gian-Marco Iodice

GenAI Engineering Lead, Arm
Gian Marco Iodice is an experienced edge and mobile computing specialist at Arm for machine learning (ML) and leads engineering development for on-device GenAI. He received the MSc with honors in electronic engineering from the University of Pisa (Italy), where he specialized in HW/SW...
Gateway Pavilion - Cowell Theater

10:50am PDT

The Rise of `Transformers` in the Growing PyTorch Ecosystem - Arthur Zucker, Hugging Face
Thursday September 19, 2024 10:50am - 11:15am PDT
Explore how the `transformers` library grows and adapts to the fast-paced and ever-changing AI field to bring the best to the AI community.
Speakers

Arthur Zucker

Core Maintainer, Hugging Face
Arthur is a core maintainer at Hugging Face, maintaining several critical libraries such as transformers and tokenizers. He is the owner of the text and LLM parts of Hugging Face's open-source toolkits, resulting in the implementations of LLaMa, Mistral, MoEs, etc. and torch.compile...
Festival Pavilion - Breakout Room B

11:20am PDT

Sponsored Session: Torchchat: A Showcase of PyTorch LLM Ubiquity - Jack Khuu & Jesse White, Meta
Thursday September 19, 2024 11:20am - 11:45am PDT
This talk explores the journey of enabling LLMs in the PyTorch ecosystem, as well as how the teams behind AOT Inductor, ExecuTorch, and torchao collaborated to create torchchat, a showcase of PyTorch’s ability to run LLM inference everywhere.

Torchchat demonstrates the ubiquity, simplicity, and quality of PyTorch’s LLM support through performant, reproducible implementations not only for Python environments but also on desktop, server, and on-device.

All of our work is open source and available on GitHub.
Speakers

Jack Khuu

Software Engineer, Meta
Software Engineer @ Meta working on the PyTorch Edge team. TL for torchchat, which is PyTorch's showcase of LLM inference ubiquity (Python, Desktops, Mobile, etc.). More broadly, I focus on the "Experience" of PyTorch Edge, encompassing User, Developer, and Community Experience. Ex-Lecturer...

Jesse White

Software Engineering Manager, Meta
Jesse is an engineering manager at PyTorch @ Meta, where he supports the Edge Experience team in improving the experience for on-device inference and training, including mobile, laptops, and embedded devices. With nearly 20 years of experience in startups, Jesse is passionate about...
Festival Pavilion - Breakout Room A

11:20am PDT

Training MoEs at Scale with PyTorch - Mihir Patel & Brian Chu, Databricks
Thursday September 19, 2024 11:20am - 11:45am PDT
Mixture-of-Experts (MoE) models are becoming an increasingly popular architecture choice for large language models (LLMs). In this talk, we describe how to train MoE models with PyTorch. After discussing various performance tradeoffs, we use PyTorch distributed tools like DTensor to build custom parallelism approaches, including expert parallelism via MegaBlocks. We then show how to get near-linear scaling to thousands of GPUs, combining PyTorch FSDP and HSDP with our parallelism strategies. We discuss many of the challenges of training at scale, including communication bottlenecks, hardware failures, and networking challenges. We further improve training-at-scale setups using tools like PyTorch Distributed Checkpointing for rapid saving and loading. We then highlight further optimizations to minimize challenges only present at scale, such as object store failures for large checkpoints.
Speakers

Mihir Patel

Research Engineer, Databricks
Mihir Patel is a Research Engineer at MosaicML / Databricks, where he works on distributed training at scale and serves as the tech lead for Composer, an open-source deep learning training library. His primary focus is on large model training, and he has helped build several open...

Brian Chu

Research Engineer, Databricks
Brian is a Research Engineer at MosaicML / Databricks, where he contributes to Composer and Foundry, open-source libraries for training LLMs. He has been involved in the DBRX project and products like the Databricks finetuning and pretraining API. Prior to joining Databricks, Brian...
Festival Pavilion - Breakout Room B

2:15pm PDT

Building PyTorch Computer Vision Algorithms for 100 Skin Shades - Emmanuel Acheampong, roboMUA
Thursday September 19, 2024 2:15pm - 2:40pm PDT
At roboMUA we're leading the charge in building predictive AI models for diverse skin shades with the use of Convolutional Neural Networks (CNNs), and harnessing the power of Generative Adversarial Networks (GANs) specifically for generating realistic images of black hairstyles. Our session showcases PyTorch's versatility in both predictive and generative tasks, offering a comprehensive approach to inclusive AI. For predictive AI models, we leverage PyTorch's flexible framework to develop CNNs. Through innovative techniques in feature engineering and model architecture design, we demonstrate how PyTorch enables accurate prediction across 100 skin shades. Simultaneously, we showcase the transformative potential of GANs in the realm of black hairstyles. By training GANs on a curated dataset of diverse hair textures and styles, we illustrate how PyTorch facilitates the generation of lifelike images that celebrate the beauty and diversity of black hair. Attendees will gain insights into the data preprocessing, model training, and evaluation processes and learn how PyTorch empowers developers to build inclusive solutions.
Speakers

Emmanuel Acheampong

CEO / Head of AI, yShade.ai (formerly roboMUA)
Emmanuel Acheampong is a co-founder and CEO of roboMUA, an innovative AI solutions company with a visionary focus on catering to all skin shades and types. He graduated from Notre Dame’s ESTEEM program with a Master's thesis on the intersection of Artificial Intelligence and directed...
Gateway Pavilion - Cowell Theater

2:15pm PDT

Data-Dependent Shapes in PT2 - Edward Yang, Meta
Thursday September 19, 2024 2:15pm - 2:40pm PDT
Data-dependent shapes are ubiquitous whenever you want to take advantage of sparsity in your data representation, whether it is in recommendation systems, mixture of experts or other use cases. We have made a lot of improvements to torch.compile's support for capturing and compiling data-dependent shapes, but they also require some user knowledge to work with effectively. This talk will give an overview of PT2's facilities for data-dependent compute and how to use them effectively.
Speakers

Edward Z. Yang

Research Engineer, Meta
Edward Yang has worked on PyTorch at Meta since nearly the very beginning. Currently, he works on all aspects of PT2, but with a particular focus on dynamic shapes support across the stack.
Festival Pavilion - Breakout Room A

2:15pm PDT

vLLM: Easy, Fast, and Cheap LLM Serving for Everyone - Woosuk Kwon & Xiaoxuan Liu, UC Berkeley
Thursday September 19, 2024 2:15pm - 2:40pm PDT
We will present vLLM, an open-source high-performance LLM inference engine built on top of PyTorch. Starting as a research project at UC Berkeley, vLLM has been one of the fastest and most popular LLM inference solutions in industry, reaching 20K+ stars and 350+ contributors. In this talk, we will cover how vLLM adopts various LLM inference optimizations and how it supports various AI accelerators such as AMD GPUs, Google TPUs, and AWS Inferentia. Also, we will discuss how vLLM benefits from PyTorch 2 and its ecosystem.
Speakers

Lily Liu

Student, UCB
Lily (Xiaoxuan) Liu is a PhD student at UC Berkeley, working with Professors Ion Stoica and Alvin Cheung. Her research focuses on machine learning systems, particularly optimizing latency for LLM inference and addressing memory bottlenecks in LLM systems. Her recent work explores...

Woosuk Kwon

PhD Student, UC Berkeley
Woosuk Kwon is a Ph.D. student at UC Berkeley, advised by Prof. Ion Stoica. He is interested in building practical, flexible, and high-performance software systems for emerging applications such as large language models. Recently, he has been developing vLLM, a high-performance open-source...
Festival Pavilion - Breakout Room B

2:45pm PDT

Blobs to Clips: Efficient End-to-End Video Data Loading - Andrew Ho & Ahmad Sharif, Meta
Thursday September 19, 2024 2:45pm - 3:10pm PDT
The PyTorch team has improved training speed by an order of magnitude for teams at Meta working on Small-to-Large-Scale MultiModal Video models. In this talk we’ll share our learnings on reducing GPU starvation by overcoming data loading challenges such as dealing with large distributed datasets, worker imbalance, compute bottlenecks due to parallel video decoding and sampling, checkpointing, and debuggability. As part of our commitment to open source, we are releasing a new decoding library and updating existing PyTorch libraries on GitHub, and we invite feedback and contributions from the community.
Speakers

Ahmad Sharif

Software Engineer, Meta
SWE in PyTorch Content Domains. Past: SWE at Google in Search, Privacy, ChromeOS.

Andrew Ho

Machine Learning Engineer, Meta Platforms
We are ML Engineers at Meta on PyTorch working on multi-modal LLM dataloading
Gateway Pavilion - Cowell Theater

2:45pm PDT

Torchtitan: Large-Scale LLM Training Using Native PyTorch 3D Parallelism - Wanchao Liang, Meta & Linsong Chu, IBM Research
Thursday September 19, 2024 2:45pm - 3:10pm PDT
torchtitan is a proof of concept for large-scale LLM training using native PyTorch. It is a repo that showcases PyTorch's latest distributed training features in a clean, minimal codebase. We showcase end-to-end enablement of large-scale training features:
1. 3D/4D parallelism
2. Efficient distributed checkpoint save/load/resharding
3. Many efficient training techniques, including Float8, torch.compile, activation checkpointing, etc.
Speakers

Wanchao Liang

Software Engineer, Meta Platforms, Inc.
Software Engineer at Meta on the PyTorch team and Tech Lead in PyTorch Distributed training. Author of torchtitan, Tensor Parallel and DTensor, a fundamental distributed abstraction to perform distributed computation. Previously worked on the TorchScript compiler and ONNX.

Linsong Chu

Senior Technical Staff Member, IBM Research
Linsong is an STSM at IBM Research, focusing on FSDP, torch.compile and FP8 in the area of pre-training.
Festival Pavilion - Breakout Room B

3:15pm PDT

Slaying OOMs - Mark Saroufim & Jane Xu, Meta
Thursday September 19, 2024 3:15pm - 3:40pm PDT
Have you ever hit an OOM (and wished you had more VRAM)? Who hasn't! Hop on the bus with us and feel the road become smoother as we talk about stacking together techniques like FSDP2 + QLoRA + CPU Offloading + Fused Adam (thanks Intel) + more in PyTorch native. We will give an overview of these techniques as well as the hard edges we solved in their composition. Curious for more? Or...still OOMing? We also plan on discussing our more researchy work on offloading, pagedness, and low precision optimizers.
Speakers

Jane Xu

SWE, Meta
I'm Jane and I work on the PyTorch core library! Tell me your favorite optimizer, complain to me about your latest OOM, teach me about what you’re excited about.

Mark Saroufim

Software Engineer, Meta
Mark Saroufim is a PyTorch Engineer at Meta working on inference, compilers and community.
Festival Pavilion - Breakout Room B

3:15pm PDT

Sponsored Session: PyTorch Support by Google Enabling Performance from Cloud to Edge - Mark Sherwood & Shauheen Zahirazami, Google
Thursday September 19, 2024 3:15pm - 3:40pm PDT
In this session we will cover various ways teams at Google are working to help the PyTorch community achieve performance and scale from cloud to edge. We will cover how Google Cloud customers can use PyTorch and OpenXLA to get competitive performance for their ML workloads. We’ll also cover how Google AI Edge Torch works with PyTorch to help developers integrate LLMs, vision models and more to easily create new edge applications that can run on a wide set of devices.
Speakers

Mark Sherwood

Senior Product Manager, Google AI Edge, Google
Mark is a Senior Product Manager on the Google AI Edge team, responsible for LiteRT (formerly known as TensorFlow Lite) and MediaPipe. He specializes in shipping ML powered features on Android, iOS, and Web using the very smallest to the very largest on-device models.

Shauheen Zahirazami

Senior Staff Engineering Manager, Cloud Machine Learning Compute Services, Google
Shauheen has a PhD in control engineering with a BSc in applied mathematics. He is currently leading Cloud TPU Machine Learning teams at Google who are responsible for ML Frameworks and 3P ecosystem including the PyTorch teams that develop PyTorch/XLA.
Gateway Pavilion - Cowell Theater

3:15pm PDT

Torch.Compile for Autograd, DDP and FSDP - Will Feng, Chien-Chin Huang & Simon Fan, Meta
Thursday September 19, 2024 3:15pm - 3:40pm PDT
In this talk, we will present the latest advancements in torch.compile for distributed training via DDP and FSDP. We will first introduce Compiled Autograd, a torch.compile mode to fully capture the backpropagation step, including the communication collective operators used in distributed training. We will then cover the improvements this new approach brought to Compiled DDP/FSDP, notably the removal of DDP/FSDP graph breaks, which has the potential to improve compute/communication overlap.
Speakers

Chien-Chin Huang

Software Engineer, Meta
Software Engineer, PyTorch Distributed, Meta

Simon Fan

Software Engineer, Meta
I'm a software engineer on the PyTorch Compiler team, where I focus on torch.compile for distributed training frameworks.

Will Feng

Software Engineer, Meta Platforms, Inc.
Will Feng is a Software Engineer on the PyTorch Compiler team at Meta. He has been working in PyTorch core and ecosystem for the past 7 years. He is now working on and most excited about torch.compile for distributed training performance.
Festival Pavilion - Breakout Room A

4:05pm PDT

Understanding the LLM Inference Workload - Mark Moyou, NVIDIA
Thursday September 19, 2024 4:05pm - 4:30pm PDT
Understanding how to effectively size a production-grade LLM deployment requires understanding of the model(s), the compute hardware, quantization and parallelization methods, KV Cache budgets, input and output token length predictions, model adapter management and much more.
- Why LLM inference is different to standard deep learning inference
- Current and future NVIDIA GPU overview: which GPU(s) for which models and why
- Understanding the importance of building inference engines
- Deep recap on the attention mechanism along with different types of popular attention mechanisms used in production
- Deep dive on KV Cache and managing KV Cache budgets
- Parallelism (reducing latency): mainly tensor parallelism, but data, sequence, pipeline, and expert parallelism will be highlighted
- Quantization methods on weights, activations, and KV Cache to reduce engine sizes for more effective GPU utilization
- Increasing throughput with inflight batching and other techniques
- Detailed performance analysis of LLM deployments looking at time to first token, inter-token latencies, LLM deployment characterizations, and more that can help reduce deployment costs
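The KV Cache budgeting topic above can be illustrated with a rough back-of-the-envelope sketch. This is not taken from the talk: it assumes the commonly used estimate that a decoder-only transformer caches one key and one value vector per layer, per KV head, per token. The function names and the example model dimensions (32 layers, 8 KV heads, head dimension 128, 2-byte FP16 elements) are hypothetical placeholders.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim,
                   seq_len, batch_size, bytes_per_elem=2):
    """Estimate KV cache size in bytes.

    The leading factor of 2 accounts for storing both keys and values.
    bytes_per_elem=2 assumes FP16/BF16 cache entries (an assumption).
    """
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_elem)


def max_batch_size(budget_bytes, num_layers, num_kv_heads, head_dim,
                   seq_len, bytes_per_elem=2):
    """Largest number of concurrent sequences that fit in a cache budget."""
    per_sequence = kv_cache_bytes(num_layers, num_kv_heads, head_dim,
                                  seq_len, 1, bytes_per_elem)
    return budget_bytes // per_sequence


# Hypothetical model: 32 layers, 8 KV heads, head_dim 128, FP16 cache.
per_token = kv_cache_bytes(32, 8, 128, seq_len=1, batch_size=1)
per_sequence = kv_cache_bytes(32, 8, 128, seq_len=4096, batch_size=1)
fits = max_batch_size(16 * 1024**3, 32, 8, 128, seq_len=4096)

print(per_token)      # bytes of cache per generated token
print(per_sequence)   # bytes of cache for one 4096-token sequence
print(fits)           # sequences that fit in a 16 GiB cache budget
```

Under these assumed dimensions, each token costs 128 KiB of cache, so sequence length and batch size trade off directly against whatever GPU memory remains after the weights are loaded. Quantizing the cache (a smaller `bytes_per_elem`) raises the number of sequences that fit.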
Speakers

Mark Moyou

Sr. Data Scientist, NVIDIA
Dr. Mark Moyou is a Senior Data Scientist at NVIDIA, working with enterprise clients on AI strategy and deploying machine learning applications to production. He is the host of the Caribbean Tech Pioneers Podcast, The AI Portfolio Podcast and is the Director of the Optimized AI Confere...
Festival Pavilion - Breakout Room B

4:35pm PDT

Intel GPU in Upstream PyTorch: Expanding GPU Choices and Enhancing Backend Flexibility - Eikan Wang & Min Jean Cho, Intel
Thursday September 19, 2024 4:35pm - 5:00pm PDT
The integration of Intel GPU support into PyTorch marks a pivotal enhancement for the PyTorch device and runtime. We generalized the PyTorch device and runtime to accommodate streaming devices. The generalization not only facilitates the deployment of PyTorch on ubiquitous hardware but also makes the integration of different HW backends easier. In addition, PyTorch with Intel GPU supports various Intel GPUs from the data center to the client. It enriches and democratizes the PyTorch HW ecosystem. Particularly in AI PC scenarios, where Intel's integrated and discrete GPUs are prevalent, PyTorch with Intel GPU can deliver promising performance and an improved out-of-box (OOB) experience, which can significantly extend PyTorch's applicability.
Speakers

Eikan Wang

AI Frameworks Engineer, Intel
Eikan is a staff engineer at Intel and a DL framework tech lead with full-stack experience in DL, from various AI applications to framework, library, and DL compiler. He is actively optimizing the torch.compile stack for Intel platforms, including optimizing Inductor C++/OpenMP...

Min Jean Cho

Deep Learning Software Engineer, Intel Corporation
Festival Pavilion - Breakout Room B

4:35pm PDT

Unlocking the Enigma: Crafting Unbiased, Transparent, and Explainable Large Language Models - Rashmi Nagpal, Patchstack
Thursday September 19, 2024 4:35pm - 5:00pm PDT
In an era where artificial intelligence reigns supreme, the statistics are both perplexing and thought-provoking: a mere 13% of large language models manage to transcend the realms of research and enter the practical world of production. Who bears the responsibility when these models err, spewing out biased or discriminatory outputs? It's time to demystify the complex landscape of machine learning ethics and carve a path towards a brighter, more accountable future! In this talk, firstly, we will navigate the profound impacts of large language models across diverse domains, from the lifesaving advances in medicine to safeguarding our nations through enhanced security protocols. Secondly, as we marvel at the data-driven decisions made by these models, we will confront the darker shadow cast by the looming spectre of bias in the data. Finally, we will delve deep into the art of building interpretable models and navigating the maze of ethical considerations. Through a live demonstration in PyTorch, we will witness how to craft unbiased, transparent, and explainable models.
Speakers

Rashmi Nagpal

Machine Learning Engineer, Patchstack
Rashmi, a passionate researcher at the MIT CSAIL and machine learning engineer at Patchstack, is dedicated to crafting beautiful AI applications. With nearly 5 years of industrial experience, she has brought ideas to life at pre-seed startups and contributed to impactful redesigns...
Festival Pavilion - Breakout Room A

5:05pm PDT

Implementing a Custom Torch.Compile Backend - A Case Study - Maanav Dalal & Yulong Wang, Microsoft
Thursday September 19, 2024 5:05pm - 5:30pm PDT
This presentation will dive into the development of the ONNX Runtime (ORT) backend for torch.compile. We'll cover the implementation process, starting with a PyTorch 2.0-generated FX graph, highlighting the unique challenges encountered when serving ORT-specific scenarios and how we solved them. Attendees will gain insights into optimizing performance, overcoming integration hurdles, and achieving efficient execution. Whether you're a developer looking to extend PyTorch's capabilities for your own use cases, keen to learn about ONNX Runtime, or interested in backend performance optimization and the many steps we've taken to get to where we are now, this session promises valuable takeaways and practical knowledge.
Speakers

Yulong Wang

Software Engineer, Microsoft

Maanav Dalal

Program Manager, Microsoft
PM @Microsoft, working on the ONNX Exporter team. I adore learning about consumer tech and experimenting with bleeding edge software. I'm passionate about creating delightful user experiences.
Festival Pavilion - Breakout Room B

5:05pm PDT

The Ethical Implications of AI and the Environment: A Focus on Water - Amber Hasan, Ethical Tech AI & Senegal Tuklor Williams, Broken Pencil Pictures llc
Thursday September 19, 2024 5:05pm - 5:30pm PDT
Artificial Intelligence (AI) has the potential to revolutionize various sectors, including environmental conservation and water management. However, the deployment of AI technologies raises ethical questions about the environmental impact, particularly on water resources. This presentation will discuss the ethical implications of AI concerning water while also exploring how AI can both positively and negatively affect water resources along with the broader ecosystem. My goal is to facilitate a critical conversation around how to balance technological advancements with environmental stewardship.
Objectives:
Understanding Ethical Implications: Provide an in-depth overview of how AI impacts water resources. Focus on ethical concerns related to AI's water footprint, including, but not limited to, energy consumption and water usage in data centers.
Explore Positive Applications: Talk about the possible successful implementations of AI in water conservation, pollution monitoring, and efficient resource management. Discuss potential future applications where AI could contribute to sustainable water management and connect stakeholders to address ethical concerns and solutions.
Speakers

Amber Hasan

Owner, Ethical Tech AI
Amber Hasan is an interdisciplinary artist and community organizer focused on using Creative Practice as a tool for change. Amber is Co-Founder of The Sister Tour collective; she has worked with photographer LaToya Ruby Frazier regarding the Flint Water Crisis; she is a Board Member...

Senegal Tuklor Williams

C.O.O., ETHICAL TECH AI
From the standpoint of Broken Pencil Pictures, we are a dynamic and multi-disciplinary creative company. Our achievements are a testament to our dedication to social change and the betterment of our community. "The Sister Tour" stands out as an initiative through which we distributed...
Festival Pavilion - Breakout Room A
 