September 18-19, 2024
San Francisco, California
Note: The schedule is subject to change.

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for PyTorch Conference 2024 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

This schedule is automatically displayed in Pacific Daylight Time (UTC-7). To see the schedule in your preferred timezone, please select from the drop-down located at the bottom of the menu to the right.

IMPORTANT NOTE: Timing of sessions and room locations are subject to change.

Thursday, September 19
 

10:50am PDT

Lightning Talk: d-Matrix LLM Compression Flow Based on Torch.Fx: Simplifying PTQ/QAT - Zifei Xu & Tristan Webb, d-Matrix Corporation
Thursday September 19, 2024 10:50am - 11:00am PDT
We introduce dmx-compressor, d-Matrix's open-source LLM compression toolkit that is modular, robust, efficient, and user-friendly. It utilizes symbolic tracing and fx.Transformer for network compression while keeping the model a first-class citizen in PyTorch for the user, despite prevalent graph dynamism in LLMs. It achieves this by maintaining both the original nn.Module and a just-in-time (JIT) traced and transformed fx.GraphModule representation behind the scenes, in conjunction with an abstraction that cleanly decouples network compression from the original model graph definition. This design allows the FXIR to dynamically adapt to diverse forward call signatures and flow-control arguments throughout quantization-aware training and post-training quantization written in plain PyTorch, yielding a compressed FXIR fully compatible with application-level APIs like the Hugging Face pipeline. We also provide a graph visualizer based on fx.Interpreter for ease of debugging. We believe this project shall empower the community to build efficient LLMs for deployment on custom hardware accelerators and contribute to the PyTorch ecosystem.
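The mechanism the abstract describes (symbolic tracing plus fx.Transformer rewriting a traced graph) can be illustrated with a minimal sketch in stock torch.fx. This is not dmx-compressor's API: TinyMLP, fake_quant, and QuantizeLinearOutputs are hypothetical names, and the fixed-step rounding only stands in for whatever quantizers the toolkit actually applies.

```python
import torch
from torch import fx, nn


class TinyMLP(nn.Module):
    """Hypothetical stand-in for a real model."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 8)
        self.fc2 = nn.Linear(8, 2)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))


def fake_quant(x, step=0.125):
    # Assumed toy quantizer: snap every value to a uniform grid.
    return torch.round(x / step) * step


class QuantizeLinearOutputs(fx.Transformer):
    """Re-emit the traced graph, quantizing the output of each nn.Linear."""

    def call_module(self, target, args, kwargs):
        out = super().call_module(target, args, kwargs)
        if isinstance(self.fetch_attr(target), nn.Linear):
            # `out` is an fx.Proxy, so these ops are recorded as new graph nodes.
            return fake_quant(out)
        return out


model = TinyMLP().eval()
traced = fx.symbolic_trace(model)                       # fx.GraphModule
compressed = QuantizeLinearOutputs(traced).transform()  # new fx.GraphModule

x = torch.randn(3, 4)
with torch.no_grad():
    # Same computation written by hand against the original nn.Module.
    reference = fake_quant(model.fc2(torch.relu(fake_quant(model.fc1(x)))))
    output = compressed(x)

print(compressed.graph)
```

The original model object is left untouched, so the plain nn.Module and the transformed GraphModule can coexist, which is in the spirit of the design described above.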
Speakers
Zifei Xu

Senior Machine Learning Research Engineer, d-Matrix Corporation
Zifei is a Senior Machine Learning Research Engineer at d-Matrix. Her current work focuses on developing model quantization pipelines and efficient quantization algorithms. She graduated from Stanford University with a Master's degree in Computational & Mathematical Engineering and...
Tristan Webb

ML Engineer, d-Matrix
Tristan's background is primarily in computer science and mathematics, which led him to graduate with a PhD in Complexity Science at the University of Warwick, where he worked with large computational neuroscience models of spiking neural networks using simulators written in C...
Festival Pavilion - Breakout Room A
  Lightning Talks

11:05am PDT

Lightning Talk: LLMs on Edge with AI Accelerators - Chen Lai, Kimish Patel & Cemal Bilgin, Meta
Thursday September 19, 2024 11:05am - 11:15am PDT
LLMs are known to be compute-heavy and to consume substantial resources, including memory and power (on phones, almost all that is available). A natural idea is to leverage AI hardware accelerators, for example the Apple Neural Engine (ANE) on Apple devices and HTP on Qualcomm SoCs, to make them run fast and efficiently. Users will be interested in installing models on their devices only once model latency, memory consumption, and power usage have been optimized to a certain level. In this session, we’d like to introduce how we leverage these AI accelerators within the PyTorch ecosystem to achieve state-of-the-art performance for llama3 on device, via ExecuTorch and our partnership with Apple and Qualcomm. Hardware companies usually have their own AI accelerators, and these are likely to have different characteristics: one may support a different set of operators than another, and one may only support static shapes (like HTP). However, transformer-based optimizations can be generic. We’ll discuss in more detail how we apply the generic optimizations as well as the backend-specific ones. The techniques we applied here are not just for LLMs; they can also be applied to other transformer-based models.
Speakers
Kimish Patel

Software Engineer, Meta Platforms
Kimish has worked on enabling PyTorch on Meta's family of apps, primarily focusing on performance optimizations. His past experiences include hardware/software co-design, CPU architecture, and CPU/GPU performance optimization.
Chen Lai

Software Engineer, Meta
Software engineer focusing on bringing up accelerators on devices
Cemal Bilgin

Engineering Manager, Meta
Engineering Manager, PyTorch Edge Acceleration
Festival Pavilion - Breakout Room A
  Lightning Talks

11:20am PDT

Lightning Talk: Building and Supporting the Chinese PyTorch Community: Resources, Tutorials, and Engagement - Zong Zesheng, Huawei
Thursday September 19, 2024 11:20am - 11:30am PDT
This proposal aims to provide a comprehensive introduction to the Chinese PyTorch community. We hope to inspire more users to join and contribute, fostering a vibrant and inclusive environment for PyTorch enthusiasts in China.

Chinese PyTorch Homepage: An introduction to the official Chinese version of the PyTorch website, highlighting its features, with navigation tips and key sections such as documentation, tutorials, and community events. The goal is to improve the connection between users in China and the PyTorch community.

Localized Tutorials and Documentation: The 2.x releases had no translated version, which made it hard for beginners who are not comfortable with English to keep up with the latest PyTorch features. We translated the official documents and tutorials, covering everything from basic PyTorch concepts to advanced applications.

Interactive Tutorials: Previously there were no interactive tutorials (like Google Colab) for Chinese students or beginners, so they had to set up an environment before getting started with PyTorch, which can be hard for beginners. Now an online notebook and tutorials are available for beginners to practice with or to tune individual steps.
Speakers
Zong Zesheng

Software Engineer, Huawei
Currently working to give Chinese users easier access to PyTorch resources and to create a friendly experience for beginners.
Gateway Pavilion - Cowell Theater
  Lightning Talks

11:20am PDT

Sponsored Session: Torchchat: A Showcase of PyTorch LLM Ubiquity - Jack Khuu & Jesse White, Meta
Thursday September 19, 2024 11:20am - 11:45am PDT
This talk explores the journey of enabling LLMs in the PyTorch ecosystem, as well as how the teams behind AOT Inductor, ExecuTorch, and torchao collaborated to create torchchat, a showcase of PyTorch’s ability to run LLM inference everywhere.

Torchchat demonstrates the ubiquity, simplicity, and quality of PyTorch’s LLM support through performant, reproducible implementations not only for Python environments, but also on desktop, server, and on-device.

All of our work is open source and available on GitHub.
Speakers
Jack Khuu

Software Engineer, Meta
Software Engineer @ Meta working on the PyTorch Edge team. TL for torchchat, which is PyTorch's showcase of LLM inference ubiquity (Python, Desktops, Mobile, etc.). More broadly, I focus on the "Experience" of PyTorch Edge, encompassing User, Developer, and Community Experience. Ex-Lecturer...
Jesse White

Software Engineering Manager, Meta
Jesse is an engineering manager at PyTorch @ Meta, where he supports the Edge Experience team in improving the experience for on-device inference and training, including mobile, laptops, and embedded devices. With nearly 20 years of experience in startups, Jesse is passionate about...
Festival Pavilion - Breakout Room A
  Breakout Sessions

12:00pm PDT

Lightning Talk: Optimized PyTorch Inference on aarch64 Linux CPUs - Sunita Nadampalli, Amazon (AWS)
Thursday September 19, 2024 12:00pm - 12:10pm PDT
Over the last two years we've optimized the performance of PyTorch on Arm processors. The optimizations have included changes to ATen, C10, MKLDNN operators, the GEMM backend, and TorchInductor. In many cases, instead of writing our own kernels, we integrated the Arm Compute Library, used fastmath kernels with format types like bf16, implemented operator caching, selected the optimal backend based on the input context, and so on. Through these optimizations we improved performance by over 2x. In this presentation, we will first talk about how we approached this process, what those optimizations are, performance numbers on AWS Graviton3 processors for around 75 models, and CI/CD workflow details. Next, we will walk through a sample PyTorch application showing basic usage, how to tune the runtime, and the resulting speedup. By the end of the presentation, attendees will have learned about PyTorch performance optimizations on Arm processors, how to use them, and the areas where they can collaborate to further improve PyTorch for aarch64 CPUs.
Speakers
Sunita Nadampalli

Software Development Manager, Amazon/AWS
Sunita Nadampalli is a Software Development Manager at AWS. She leads Graviton software performance optimizations for AI/ML and HPC workloads. She is passionate about open source software development and delivering high-performance and sustainable software solutions with Arm SoCs...
Festival Pavilion - Breakout Room B
  Lightning Talks
  • Audience: Any
  • Slides Attached: Yes

12:10pm PDT

Lightning Talk: Implementing and Using Iterable Datasets: What Could Go Wrong? - Nicolas Hug, Meta
Thursday September 19, 2024 12:10pm - 12:20pm PDT
PyTorch supports two kinds of datasets: Iterable datasets and indexable "map-style" datasets. Iterable datasets can be more flexible and potentially faster than their indexable cousins. They are also much harder to use correctly, and can easily lead to silently wrong results. This talk is a quick and fun intro to some of the traps that Iterable datasets lay out for you, with some tips to help you avoid them.
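One well-known example of such a trap, taken from the PyTorch IterableDataset documentation rather than from the talk itself, is that with multiple loader workers each worker receives its own copy of the dataset, so an unsharded iterable yields every sample once per worker. The sketch below is a torch-free simulation of that behaviour; naive_stream, sharded_stream, and collect are hypothetical helpers. In real PyTorch code the worker id and worker count would come from torch.utils.data.get_worker_info() inside the dataset's __iter__.

```python
from itertools import chain


def naive_stream(samples, worker_id, num_workers):
    # Trap: every worker walks the full stream and ignores its own id.
    yield from samples


def sharded_stream(samples, worker_id, num_workers):
    # Fix: each worker keeps every num_workers-th sample, offset by its id.
    for index, sample in enumerate(samples):
        if index % num_workers == worker_id:
            yield sample


def collect(stream_fn, samples, num_workers):
    # Stand-in for a multi-worker loader: gather what each worker yields.
    return list(chain.from_iterable(
        stream_fn(samples, worker_id, num_workers)
        for worker_id in range(num_workers)
    ))


samples = list(range(6))
naive = collect(naive_stream, samples, num_workers=2)
sharded = collect(sharded_stream, samples, num_workers=2)

print(len(naive))       # 12: every sample appears twice
print(sorted(sharded))  # [0, 1, 2, 3, 4, 5]
```

Nothing raises an error in the naive case; the epoch is simply twice as long as intended, which is what makes this kind of mistake silent.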
Speakers
Nicolas Hug

Research Engineer, Meta
Nicolas is a software engineer in the PyTorch team at Meta, where he mainly contributes to the torchvision library. Prior to that, Nicolas was a research scientist at Columbia University, where he became part of the scikit-learn core development team. Nicolas holds a PhD in machine...
Gateway Pavilion - Cowell Theater
  Lightning Talks

4:35pm PDT

Unlocking the Enigma: Crafting Unbiased, Transparent, and Explainable Large Language Models - Rashmi Nagpal, Patchstack
Thursday September 19, 2024 4:35pm - 5:00pm PDT
In an era where artificial intelligence reigns supreme, the statistics are both perplexing and thought-provoking: a mere 13% of large language models manage to transcend the realms of research and enter the practical world of production. Who bears the responsibility when these models err, spewing out biased or discriminatory outputs? It's time to demystify the complex landscape of machine learning ethics and carve a path towards a brighter, more accountable future! In this talk, we will first navigate the profound impacts of large language models across diverse domains, from lifesaving advances in medicine to safeguarding our nations through enhanced security protocols. Second, as we marvel at the data-driven decisions made by these models, we will confront the darker shadow they cast: the looming spectre of bias in the data. Finally, we will delve deep into the art of building interpretable models and navigating the maze of ethical considerations. Through a live demonstration in PyTorch, we will see how to craft unbiased, transparent, and explainable models.
Speakers
Rashmi Nagpal

Machine Learning Engineer, Patchstack
Rashmi, a passionate researcher at MIT CSAIL and machine learning engineer at Patchstack, is dedicated to crafting beautiful AI applications. With nearly 5 years of industrial experience, she has brought ideas to life at pre-seed startups and contributed to impactful redesigns...
Festival Pavilion - Breakout Room A
  Breakout Sessions
 