September 18-19, 2024
San Francisco, California
Note: The schedule is subject to change.

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for PyTorch Conference 2024 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

Wednesday September 18, 2024 1:55pm - 2:20pm PDT
Open-source generative models such as OpenGPT2 and BLOOM have been pivotal in advancing AI technology. These models leverage extensive text data to achieve advanced linguistic capabilities. However, the trend toward proprietary tools and closed large language models is growing, posing unique challenges for open-source AI development. This talk will explore the intricacies of training such models, the hurdles in dataset management, and the regulation of open-source contributions. We'll cover how to iterate effectively on collected data, prepare for extensive training sessions, and coordinate research across large open-source organizations. We will discuss the challenges of generative models in three modalities: text, image, and genomics. The talk draws on the speaker’s personal experience working on OpenWebText, OpenGPT2, BLOOM, CommonCanvas, Caduceus, and other generative models. We will also cover the changing AI environment and how the future of open source is threatened by onerous regulation, ever-increasing compute costs, and the commoditization of previously open data.
Speakers
Aaron Gokaslan

PhD Student, Cornell University
Aaron Gokaslan has worked on many popular generative models and datasets such as OpenWebText, CommonCanvas, BLOOM, DBRX, and Caduceus, collectively downloaded millions of times. His work on open source has earned him a Community Contributor Award at PyTorch Con and recognition from...
Room C
