Fighting Uncertainty with Gradients: Offline Reinforcement Learning via Diffusion Score Matching
- Published: Sep 13, 2024
- We propose Score-Guided Planning (SGP), an offline model-based reinforcement learning (MBRL) algorithm designed for gradient-based planning under learned dynamics, which successfully penalizes uncertainty in high dimensions.
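
To make the idea of penalizing uncertainty through gradients concrete, here is a minimal toy sketch. It is not the SGP implementation: it assumes a one-dimensional decision variable, a hypothetical quadratic reward, and an analytic Gaussian score that stands in for a learned diffusion score model of the offline data. The planner ascends the gradient of the reward plus `beta` times the data log-likelihood, and it only ever needs the score (the gradient of the log-density), never the density itself. All function names and constants are illustrative assumptions.

```python
def gaussian_score(u, mean=0.0, std=1.0):
    """Score (gradient of log-density) of a 1-D Gaussian.

    Assumption: stands in for a learned score model of the offline data.
    """
    return -(u - mean) / std**2


def reward_grad(u, goal=3.0):
    """Gradient of the hypothetical toy reward r(u) = -(u - goal)^2."""
    return -2.0 * (u - goal)


def plan(beta, steps=200, lr=0.1, u0=0.0):
    """Gradient ascent on r(u) + beta * log p(u), using only the score of p."""
    u = u0
    for _ in range(steps):
        u += lr * (reward_grad(u) + beta * gaussian_score(u))
    return u


# beta = 0 ignores the data distribution and chases the reward optimum.
u_unpenalized = plan(beta=0.0)
# beta > 0 pulls the plan back toward the region covered by the data.
u_penalized = plan(beta=2.0)
print(u_unpenalized, u_penalized)
```

In this toy problem the stationary point is `2 * goal / (2 + beta)`, so a larger `beta` keeps the plan closer to the data mean, where a learned model would be more trustworthy.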