Loading…
Attending this event?
September 18-19, 2024
San Francisco, California
View More Details & Registration
Note: The schedule is subject to change.

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for PyTorch Conference 2024 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

This schedule is automatically displayed in Pacific Daylight Time (UTC-7). To see the schedule in your preferred timezone, please select from the drop-down located at the bottom of the menu to the right.

IMPORTANT NOTE: Timing of sessions and room locations are subject to change.

Thursday September 19, 2024 4:05pm - 4:15pm PDT
Scaled dot product attention provides significant acceleration of the transformer layer through fusion of the multihead attention layer. There are several different algorithms to achieve this but tiled attention through scaled dot product attention via Flash Attention is a very popular approach. In PyTorch on the ROCm platform this is currently achieved through ahead of time compiled (AOT) Triton kernels in a linkable archive. AMD’s work to enable and package these kernels is done through AOTriton, which aims to use Triton’s compiler and GPU kernels for faster development. AOTriton maintains an optimized set of tiling sizes and other parameters to provide optimized, pre-compiled Triton kernels. The differences between JIT and AOT are few but are very important. Despite this, prototyping kernels in Triton is much faster than template-based C++ libraries. In this presentation we will go into detail on the interaction layer between PyTorch and AOTriton, the structure of AOTriton and how to add new triton kernels to AOTriton.
Speakers
avatar for Joseph Groenenboom

Joseph Groenenboom

Senior Member of Technical Staff, AMD
Joseph Groenenboom is an engineer with a wide variety of experience across the industry. He started his career at IBM working on enterprise hypervisor technology. Later he moved onto a stint in embedded engineering for both platform, embedded Linux and software engineering at Garmin... Read More →
Thursday September 19, 2024 4:05pm - 4:15pm PDT
Room A

Sign up or log in to save this to your schedule, view media, leave feedback and see who's attending!

Share Modal

Share this link via

Or copy link