September 18-19, 2024
San Francisco, California
Note: The schedule is subject to change.

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for PyTorch Conference 2024 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

Thursday September 19, 2024 1:32pm - 1:37pm PDT
This talk covers two new ways IBM has optimized generative AI inference with PyTorch: speculative decoding and Triton kernel development. Speculative decoding reduces latency by having a lightweight speculator predict several upcoming tokens, which the base model then verifies, so the speedup does not come at the cost of accuracy. The IBM Research team developed new speculative architectures and open-sourced speculators for Llama 3 models. The talk also discusses several Triton kernels that accelerate inference, one of which was contributed to vLLM to speed up mixture-of-experts (MoE) models. Finally, it offers a glimpse of IBM's AI hardware work, including how the IBM Artificial Intelligence Unit (AIU) could integrate into the PyTorch stack.
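For readers unfamiliar with speculative decoding, the following is a minimal sketch of the general idea in plain Python. It is not IBM's speculator architecture or any library's API. The names `speculative_decode`, `greedy_decode`, `target_next`, and `draft_next` are hypothetical, the "models" are toy functions over integer tokens, and the accept rule is the simple greedy variant (accept a drafted token only if the target model would have picked the same one). In a real system the verification step is a single batched forward pass of the large model; here it is simulated with per-position calls and counted as one pass.

```python
from typing import Callable, List, Tuple

Token = int
NextTokenFn = Callable[[List[Token]], Token]


def greedy_decode(target_next: NextTokenFn, prompt: List[Token],
                  max_new_tokens: int) -> List[Token]:
    """Baseline: one target-model call per generated token."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tokens.append(target_next(tokens))
    return tokens


def speculative_decode(target_next: NextTokenFn, draft_next: NextTokenFn,
                       prompt: List[Token], max_new_tokens: int,
                       k: int = 4) -> Tuple[List[Token], int]:
    """Greedy speculative decoding; returns (tokens, simulated target passes)."""
    tokens = list(prompt)
    goal = len(prompt) + max_new_tokens
    target_passes = 0
    while len(tokens) < goal:
        # 1. The cheap draft model proposes up to k tokens autoregressively.
        draft: List[Token] = []
        for _ in range(min(k, goal - len(tokens))):
            draft.append(draft_next(tokens + draft))

        # 2. The target model checks every drafted position.
        #    (Assumption: counted as one pass, as if batched.)
        target_passes += 1
        accepted: List[Token] = []
        for i, proposed in enumerate(draft):
            expected = target_next(tokens + draft[:i])
            if proposed == expected:
                accepted.append(proposed)
            else:
                # First mismatch: keep the target's token, discard the rest.
                accepted.append(expected)
                break
        else:
            # Every draft token matched; the same pass yields one extra token.
            if len(tokens) + len(accepted) < goal:
                accepted.append(target_next(tokens + accepted))
        tokens.extend(accepted)
    return tokens, target_passes


# Toy models (hypothetical): the target counts upward modulo 10;
# the draft agrees except that it guesses wrong after the token 4.
def target_next(ctx: List[Token]) -> Token:
    return (ctx[-1] + 1) % 10


def draft_next(ctx: List[Token]) -> Token:
    return 0 if ctx[-1] == 4 else (ctx[-1] + 1) % 10


if __name__ == "__main__":
    out, passes = speculative_decode(target_next, draft_next, [2], 12, k=4)
    print(out == greedy_decode(target_next, [2], 12), passes)
```

Because a drafted token is kept only when the target agrees, the output is identical to plain greedy decoding with the target model; the saving comes from needing fewer target passes whenever the draft guesses well. Sampling-based variants use a probabilistic acceptance rule instead of exact matching.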
Speakers
Mudhakar Srivatsa

Distinguished Engineer, IBM Research
Mudhakar Srivatsa is a distinguished research staff member in the Distributed Cloud department at the IBM T. J. Watson Research Center. His work focuses on heterogeneous spatiotemporal data with applications to edge computing, AIOps, and Hybrid AI Scaling. He is an IBM master inv...
Festival Pavilion - Keynote Room