September 18-19, 2024
San Francisco, California
Note: The schedule is subject to change.

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for PyTorch Conference 2024 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

This schedule is displayed in Pacific Daylight Time (UTC-7).


Wednesday September 18, 2024 2:55pm - 3:20pm PDT
This presentation provides an update on the latest advancements in the TorchInductor CPU backend since the last conference, bringing best-in-class CPU performance to broad DL workloads. We will discuss new features and performance enhancements, including:

• Max-autotune support with codegen for GEMMs, boosting performance for GEMM-related operations
• Enhanced vectorized codegen support, now covering all data types beyond floating point, with flexible vector factors and optimized loop scheduling
• Comprehensive quantization support, including weight-only quantization (WoQ) and optimizations for dynamic quantization and quantization-aware training
• Improved attention support, featuring attention masks and optimized softmax via flash attention v2
• AOTInductor support, enabling high-performance inference with frozen weights
• Native Windows support, with improved vectorization capabilities

These advancements, combined with ongoing optimizations, have resulted in significant performance improvements since PyTorch 2.1, demonstrated through extensive benchmarks and large language models (LLMs).
Speakers

Leslie Fang

Software Engineer, Intel
My name is Leslie Fang. I am a software engineer at Intel who has worked on PyTorch performance optimization for x86 servers for the past four years. Currently, I am mainly focusing on quantization, autocast, and the Inductor CPP/OpenMP backend in stock PyTorch.

Jiong Gong

Principal Engineer, Intel
Jiong is a software architect from Intel who works on PyTorch framework optimizations. He is the PyTorch module maintainer for CPU and compiler.
Room B

