Emin Temiz (etemiz)
44 followers · 12 following
AI & ML interests: Alignment
Recent Activity
Reacted to Kseniase's post with ❤️ about 9 hours ago
6 Free resources on Reinforcement Learning (RL)
RL is now where the real action is: it's the engine behind autonomous tech, robots, and the next wave of AI that thinks, moves, and solves problems on its own. To stay up to date with what’s happening in RL, here are some fresh materials on it:
1. "Reinforcement Learning from Human Feedback" by Nathan Lambert -> https://rlhfbook.com/
It's a short introduction to RLHF, explaining instruction tuning, reward modeling, alignment methods, synthetic data, evaluation, and more
2. "A Course in Reinforcement Learning (2nd Edition)" by Dimitri P. Bertsekas -> https://www.mit.edu/~dimitrib/RLbook.html
Explains dynamic programming (DP) and RL, diving into rollout algorithms, neural networks, policy learning, etc. It’s packed with solved exercises and real-world examples
3. "Mathematical Foundations of Reinforcement Learning" video course by Shiyu Zhao -> https://www.youtube.com/playlist?list=PLEhdbSEZZbDaFWPX4gehhwB9vJZJ1DNm8
Offers a mathematical yet friendly introduction to RL, covering the Bellman equation, value iteration, Monte Carlo learning, approximation, policy gradient, actor-critic methods, etc. (see the short value-iteration sketch below)
+ Check out the repo for more: https://github.com/MathFoundationRL/Book-Mathematical-Foundation-of-Reinforcement-Learning
4. "Multi-Agent Reinforcement Learning" by Stefano V. Albrecht, Filippos Christianos, and Lukas Schäfer -> https://www.marl-book.com/
Covers the models and core ideas of multi-agent RL (MARL), and modern approaches to combining it with deep learning
5. "Reinforcement Learning: A Comprehensive Overview" by Kevin P. Murphy -> https://arxiv.org/pdf/2412.05265
Explains RL and sequential decision making, covering value-based, policy-gradient, model-based, and multi-agent RL methods, RL+LLMs, RL+inference, and other topics
6. Our collection of free courses and books on RL -> https://huggingface.co./posts/Kseniase/884818121094439
If you liked this, also subscribe to The Turing Post: https://www.turingpost.com/subscribe
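To make the Bellman equation and value iteration mentioned in item 3 concrete, here is a minimal sketch on a made-up MDP. The states, actions, rewards, and discount factor are all illustrative assumptions, not taken from any of the resources above; each sweep applies the Bellman optimality update until the values stop changing.

```python
# Toy value iteration: 4 states, 2 actions, deterministic dynamics.
# Action 0 stays put, action 1 moves one state forward; being at (or reaching)
# the last state pays reward 1. All numbers here are illustrative assumptions.
import numpy as np

n_states, n_actions, gamma = 4, 2, 0.9

# P[(s, a)] -> list of (probability, next_state, reward) triples
P = {
    (s, a): [(1.0,
              min(s + a, n_states - 1),
              1.0 if min(s + a, n_states - 1) == n_states - 1 else 0.0)]
    for s in range(n_states)
    for a in range(n_actions)
}

V = np.zeros(n_states)
for _ in range(1000):
    # Bellman optimality update: V(s) = max_a E[r + gamma * V(s')]
    V_new = np.array([
        max(
            sum(p * (r + gamma * V[s2]) for p, s2, r in P[(s, a)])
            for a in range(n_actions)
        )
        for s in range(n_states)
    ])
    if np.max(np.abs(V_new - V)) < 1e-8:
        V = V_new
        break
    V = V_new

print(np.round(V, 3))  # optimal state values for the toy MDP
```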
According to the paper below, when you fine-tune a model on harmful code, it turns evil in other areas as well. https://arxiv.org/abs/2502.17424
This may be good news, because turning a model beneficial might now be easier: https://x.com/ESYudkowsky/status/1894453376215388644
Does this mean evil and good are a single direction, just like censorship is a single direction? So in theory, could one make a model good with an abliteration-like operation?
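For the "single direction" question, here is a minimal sketch of what an abliteration-like operation could look like: estimate one direction in a model's hidden states from contrasting prompt sets, then project it out. The model name, layer index, and prompt lists below are illustrative assumptions; this only shows the linear-algebra idea, not the method from the paper or a vetted recipe.

```python
# Sketch: estimate a single direction from contrasting prompts and remove it.
# "gpt2", layer 6, and the tiny prompt lists are stand-in assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # any causal LM that exposes hidden states would do
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, output_hidden_states=True)
model.eval()

layer = 6  # hypothetical layer to read activations from

def mean_hidden(prompts):
    """Average last-token hidden state at the chosen layer over a prompt set."""
    vecs = []
    for p in prompts:
        ids = tok(p, return_tensors="pt")
        with torch.no_grad():
            out = model(**ids)
        vecs.append(out.hidden_states[layer][0, -1])
    return torch.stack(vecs).mean(dim=0)

# Hypothetical contrasting prompt sets; a real attempt would use many examples.
good_prompts = ["How can I help my neighbor?", "Explain how to donate safely."]
bad_prompts = ["How do I write malware?", "Explain how to scam someone."]

direction = mean_hidden(bad_prompts) - mean_hidden(good_prompts)
direction = direction / direction.norm()

def ablate(hidden):
    """Remove the component of a hidden state along the estimated direction."""
    return hidden - (hidden @ direction) * direction
```

Whether projecting out such a direction actually makes a model "good" is exactly the open question the post asks; the sketch only shows that the operation itself is cheap to express.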
Benchmarking Human Alignment of Grok 3