knacker-hues

Reinforcement Learning from Human Feedback

onurkanbkrc – yesterday – 128 points

dang – yesterday
Related. Others?
RLHF Book - https://news.ycombinator.com/item?id=42902936 - Feb 2025 (37 comments)
verdverm – yesterday
Last time I saw Nathan say something about the book, he was actively working on the next version and looking for feedback. Check his socials.
- leggerss – yesterday
  You could say he's also learning from human feedback
klelatti – yesterday
Web version with links, etc:
https://rlhfbook.com/
- dang – yesterday
  Thanks! We've switched to that above from https://arxiv.org/abs/2504.12501, and put the latter in the toptext.