Talk "Deep Reinforcement Learning in the Real World: From Chip Design to LLMs"

Thursday 25 April 2024

This event has finished.

Started 14:00

Finished 15:00

Organized by ODSC Lisbon Data Science

Venue: Online/Virtual

Address: Online event on your device
Portugal

About this event

**To access this session, please register here: [https://hubs.li/Q02rF3Gl0](https://hubs.li/Q02rF3Gl0)**

**Topic:** “Deep Reinforcement Learning in the Real World: From Chip Design to LLMs”

**Speaker:** Anna Goldie, Senior Staff Research Scientist at Google DeepMind

*Anna works on Large Language Model (LLM) research in Gemini & Bard. Previously, she worked on RL for LLMs and retrieval-augmented LLMs at Anthropic and was co-founder/lead of the ML for Systems team in Google Brain. Her RL methods have been used in multiple generations of Google’s flagship AI accelerator (TPU). She graduated from MIT with a Bachelor’s in Computer Science, a Bachelor’s in Linguistics, and a Master of Computer Science, and is a CS PhD Candidate in the Stanford NLP Group. She has published peer-reviewed articles in top scientific venues, including Nature, NeurIPS, ICLR, EMNLP, ISPD, ASPLOS, and MLCAD. She was named one of MIT Technology Review’s 35 Innovators Under 35, and her work has been covered in various media outlets, including CNBC, IBTimes, IEEE Spectrum, MIT Technology Review, WIRED, and ABC News.*

**Abstract:**

Reinforcement learning (RL) is famously powerful but difficult to wield and, until recently, had demonstrated impressive results on games but little real-world impact. I will start the talk with a discussion of RL for Large Language Models (LLMs), including scalable supervision techniques to better align models with human preferences (Constitutional AI / RLAIF). Next, I will discuss RL for chip floorplanning, one of the first examples of RL solving a real-world engineering problem. This learning-based method can generate placements that are superhuman or comparable on modern accelerator chips in a matter of hours, whereas the strongest baselines require human experts in the loop and can take several weeks. This method was published in Nature and used in production to generate superhuman chip layouts for the last four generations of Google’s flagship AI accelerator (TPU).

**Hybrid ODSC East 2024 on 23rd-25th April — [https://hubs.li/Q027_nYw0](https://hubs.li/Q027_nYw0)**

**Use code COMMUNITY-EAST2024 for an extra discount on any pass of your choice.**

**ODSC Links:**

• Get free access to more talks/trainings like this on the Ai+ Training platform: [https://hubs.li/H0Zycsf0](https://hubs.li/H0Zycsf0)

• ODSC blog: [https://opendatascience.com/](https://opendatascience.com/)

• Facebook: [https://www.facebook.com/OPENDATASCI](https://www.facebook.com/OPENDATASCI)

• Twitter: [https://twitter.com/_ODSC](https://twitter.com/_ODSC) & @odsc

• LinkedIn: [https://www.linkedin.com/company/open-data-science](https://www.linkedin.com/company/open-data-science)

• Slack Channel: [https://hubs.li/Q02r9VZM0](https://hubs.li/Q02r9VZM0)

• Code of conduct: [https://odsc.com/code-of-conduct/](https://odsc.com/code-of-conduct/)

This page last updated Saturday 20 April 2024 at 21:31.
