Thank you for making this excellent video about our work! Minor note: at the end you mention an 18-minute training time for our instruction-following ReFT, but that number is only for the small 1K subset of UltraFeedback (last row in the table). It takes a couple of hours to train on the whole dataset, but we wanted to show through that number that ReFT is also data-efficient.
Thank you Aryaman for the kind feedback and for the correction 🙏
So interesting! 👀
Thanks