Fighting Uncertainty with Gradients: Offline Reinforcement Learning via Diffusion Score Matching

  • Published: Sep 13, 2024
  • We propose Score-Guided Planning (SGP), an offline model-based reinforcement learning (MBRL) algorithm designed for gradient-based planning under learned dynamics, which successfully penalizes uncertainty in high dimensions.
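
To make the idea of penalizing uncertainty with gradients concrete, here is a minimal toy sketch, not the authors' SGP implementation. It assumes a 1-D, one-step problem, a hypothetical learned model `learned_dynamics`, and an analytic Gaussian score `gaussian_score` standing in for a diffusion score model trained on the offline data. All names and parameter values are illustrative assumptions.

```python
# Toy sketch only: a 1-D, one-step planner. All names are hypothetical.

def gaussian_score(u, sigma=1.0):
    """Score (gradient of the log-density) of N(0, sigma^2) at u.

    Assumption: stands in for a learned diffusion score model of the data.
    """
    return -u / sigma**2


def learned_dynamics(x, u):
    """Assumed learned model: the next state is x + u."""
    return x + u


def plan_action(x0, goal, beta, sigma=1.0, lr=0.1, steps=200):
    """Gradient descent on (f(x0, u) - goal)^2 - beta * log p(u)."""
    u = 0.0
    for _ in range(steps):
        # Gradient of the task cost w.r.t. u (df/du = 1 for this toy model).
        cost_grad = 2.0 * (learned_dynamics(x0, u) - goal)
        # Gradient of the penalty is -beta * score(u); it pulls u toward the data.
        penalty_grad = -beta * gaussian_score(u, sigma)
        u -= lr * (cost_grad + penalty_grad)
    return u
```

With `beta = 0` the planned action drives the predicted state to the goal; increasing `beta` trades goal progress for staying in the region where the offline data, and hence the learned model, is more trustworthy.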
