Lightning Talk: TorchRL - RLHF Support - Vincent Moens, Meta
- Published: 10 Feb 2025
RLHF is notoriously hard to implement, requiring technical knowledge across RL and other domains. For this reason, people often resort to packaged solutions with single entry points and complex configurations that leave little room for custom development. We present new RLHF support in TorchRL that solves this problem by giving developers full control over the training pipeline at a reduced development cost on the RL side. This new set of primitives allows users to quickly prototype and train generative models across domains (language, computer vision and others). With the TorchRL-HF tooling, RL-specific classes and recipes blend easily into one's code base, and multiple solutions (preprocessing techniques or RL algorithms) can be implemented seamlessly without an in-depth understanding of the RL machinery. We demonstrate how this works in practice with examples from diverse domains, including LLMs and drug design.