Lightning Talk: TorchRL - RLHF Support - Vincent Moens, Meta

PyTorch

605 views

Published: 10 Feb 2025
RLHF is notoriously hard to implement, requiring technical knowledge across RL and other domains. For this reason, people often resort to packaged solutions with single entry points and complex configurations that leave little room for custom development. We present new RLHF support in TorchRL that addresses this problem by giving developers full control over the training pipeline at a reduced development cost on the RL side. This new set of primitives allows users to quickly prototype and train generative models across domains (language, computer vision, and others). With the TorchRL-HF tooling, RL-specific classes and recipes blend easily into one's code base, and multiple solutions (preprocessing techniques or RL algorithms) can be implemented without an in-depth understanding of the RL machinery. We demonstrate how this works in practice with examples from diverse domains, including LLMs and drug design.
