Alyona Vert

alyona0l

AI & ML interests

None yet

Recent Activity

reacted to Kseniase's post with 👀 about 4 hours ago
6 Free resources on Reinforcement Learning (RL)

RL is now where the real action is: it's the engine behind autonomous tech, robots, and the next wave of AI that thinks, moves, and solves problems on its own. To stay up to date with what's happening in RL, we offer some fresh materials on it:

1. "Reinforcement Learning from Human Feedback" by Nathan Lambert -> https://rlhfbook.com/
A short introduction to RLHF, explaining instruction tuning, reward modeling, alignment methods, synthetic data, evaluation, and more.

2. "A Course in Reinforcement Learning (2nd Edition)" by Dimitri P. Bertsekas -> https://www.mit.edu/~dimitrib/RLbook.html
Explains dynamic programming (DP) and RL, diving into rollout algorithms, neural networks, policy learning, etc. It's packed with solved exercises and real-world examples.

3. "Mathematical Foundations of Reinforcement Learning" video course by Shiyu Zhao -> https://www.youtube.com/playlist?list=PLEhdbSEZZbDaFWPX4gehhwB9vJZJ1DNm8
Offers a mathematical yet friendly introduction to RL, covering the Bellman equation, value iteration, Monte Carlo learning, approximation, policy gradient, actor-critic methods, etc. (a minimal value-iteration sketch follows this post). Check out the accompanying repo for more: https://github.com/MathFoundationRL/Book-Mathematical-Foundation-of-Reinforcement-Learning

4. "Multi-Agent Reinforcement Learning" by Stefano V. Albrecht, Filippos Christianos, and Lukas Schäfer -> https://www.marl-book.com/
Covers models, core ideas of multi-agent RL (MARL), and modern approaches to combining it with deep learning.

5. "Reinforcement Learning: A Comprehensive Overview" by Kevin P. Murphy -> https://arxiv.org/pdf/2412.05265
Explains RL and sequential decision making, covering value-based, policy-gradient, model-based, and multi-agent RL methods, RL + LLMs, RL + inference, and other topics.

6. Our collection of free courses and books on RL -> https://huggingface.co./posts/Kseniase/884818121094439

If you liked this, also subscribe to The Turing Post: https://www.turingpost.com/subscribe
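Item 3 above mentions the Bellman equation and value iteration; as a rough, hedged illustration of what that material covers (not taken from any of the listed resources), here is a minimal value-iteration sketch on a made-up three-state MDP. All states, transition probabilities, and rewards are hypothetical toy numbers.

```python
# Minimal value iteration on a tiny, made-up MDP (illustration only).
# States, actions, transitions, and rewards are hypothetical toy values.
import numpy as np

n_states, n_actions, gamma = 3, 2, 0.9

# P[s, a, s'] = transition probability; R[s, a] = expected immediate reward
P = np.zeros((n_states, n_actions, n_states))
P[0, 0, 1] = 1.0
P[0, 1, 2] = 1.0
P[1, 0, 0] = 1.0
P[1, 1, 2] = 1.0
P[2, :, 2] = 1.0          # state 2 is absorbing
R = np.array([[0.0, 1.0],
              [0.0, 5.0],
              [0.0, 0.0]])

V = np.zeros(n_states)
for _ in range(1000):
    # Bellman optimality backup: V(s) <- max_a [ R(s,a) + gamma * sum_s' P(s'|s,a) V(s') ]
    Q = R + gamma * P @ V          # shape (n_states, n_actions)
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new

print("Approximate optimal state values:", V)
print("Greedy policy (action index per state):", Q.argmax(axis=1))
```

Running the loop repeatedly applies the Bellman optimality backup until the value estimates stop changing, which is the basic idea the course introduces before moving on to Monte Carlo, policy-gradient, and actor-critic methods.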

Organizations

Turing Post

Articles 12

What is MoE 2.0? Update Your Knowledge about Mixture-of-experts

Topic 33: Slim Attention, KArAt, XAttention and Multi-Token Attention Explained – What’s Really Changing in Transformers?

Models 0

None public yet

Datasets 0

None public yet