Reinforcement Learning Resources: From Getting Started to Giving Up

Views: 4209 | Replies: 20 | 2022-1-5 07:21:22
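Update 2018-11-10:


  • Added OpenAI's Spinning Up
  • Added Hung-yi Lee's course from National Taiwan University
  • Added the Multi-Agent Reinforcement Learning Tutorial given at SJTU by Prof. Jun Wang (UCL) and Prof. Weinan Zhang (SJTU)
  • Updated the UCB and CMU DRL courses to fall 2018
  • Updated Sutton's book to the final version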

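While chatting with classmates and other people interested in RL, I found that many of them are keen on RL but do not know how to study it. A large part of the reason is that they do not know where to look for study materials. Here I list some materials and books that I think are good. I will keep updating the list, for example by adding conference slides that I like. If you are interested, remember to star/watch the GitHub repository; the Zhihu post will not be updated very quickly.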

The GitHub repository is purely a collection. Here on Zhihu I will add a comment or two on each resource (at the risk of being presumptuous).

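In fact, all of these materials can be found online if you look for them. I have listed the ones I think are good first (there are still some scattered ones I have not organized). RL is still mainstream mostly abroad, and very few professors in China work on it (unlike CV, NLP, and so on), so anyone genuinely interested in RL is welcome to get in touch and exchange ideas. Keep your horizons broad~~~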

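Whether I actually update depends on my mood, though~~~ I have already started a lot of unfinished projects, such as an introduction to MARL (multi-agent reinforcement learning) and an SC2 tutorial (reinforcement learning for StarCraft II). The holes I dug still need filling~~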



  • Reinforcement Learning: An Introduction
  • Algorithms for Reinforcement Learning
  • OpenAI-spinningup
  • Courses
  • Basic Courses

    • Rich Sutton's Reinforcement Learning Course (Alberta)
    • David Silver's Reinforcement Learning Course (UCL)
    • Stanford Reinforcement Learning Course
    • UCL + SJTU Multi-Agent Reinforcement Learning Tutorial



  • Deep RL Courses

    • National Taiwan University, Hung-yi Lee: (Deep) Reinforcement Learning
    • UCB Deep Reinforcement Learning Course
    • CMU Deep Reinforcement Learning Course




Reinforcement Learning: An Introduction

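Richard Sutton and Andrew Barto, Reinforcement Learning: An Introduction. Updated to the final version of the second edition (click "online draft"): link. Because the official copy is hosted on Google Docs, I downloaded one and put it on GitHub; grab it if you need it: link
Note: the print edition can now be ordered. A classmate and I each bought a copy from overseas (Amazon abroad), and they have not arrived yet. Within China you can consider JD or Amazon China, though it will cost a bit more.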
Algorithms for Reinforcement Learning

Csaba Szepesvari, Algorithms for Reinforcement Learning link
OpenAI-spinningup

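This is something of a mixed-bag book: online docs, matching code, and matching exercises. I strongly recommend reading it alongside the UCL course; I have skimmed through it once and it is quite good. However, it does not mention the UCL and UCB courses below, nor Sutton's book above, so reading them together may work better.
Online documentation: link
Introduction to the basics of reinforcement learning: link
Advice on deep reinforcement learning: link
Code: link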
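Courses

Basic Courses

Rich Sutton's Reinforcement Learning Course (Alberta)

Course homepage: link
This one is fairly old. There is a newer version on Google Drive, which I will organize when I find the time.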

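A brief introduction: Sutton is a leading figure in the RL field (and an author of the book above). I have emailed him and he actually replied (hahahaha, paying my respects to the founding master); he seemed quite kind. Honestly, though, Silver's slides are better suited to beginners. I think they are very well written.

A bit of gossip: David Silver, the first author of DQN, Aja Huang (the person who placed the stones on behalf of AlphaGo), and a large share of the core people in the RL field all have close ties to Alberta, which is why their slides feel quite similar.
David Silver's Reinforcement Learning Course (UCL)

Note: this is the course David Silver taught at UCL in 2015. He now seems to be at the peak of his career at DeepMind, so a new course will probably have to wait until the day he wants to return to a university and train students. Highly recommended for getting started and building the basic RL concepts. Course homepage: link
Slides:
Lecture 1: Introduction to Reinforcement Learning link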
Lecture 2: Markov Decision Processes link
Lecture 3: Planning by Dynamic Programming link
Lecture 4: Model-Free Prediction link
Lecture 5: Model-Free Control link
Lecture 6: Value Function Approximation link
Lecture 7: Policy Gradient Methods link
Lecture 8: Integrating Learning and Planning link
Lecture 9: Exploration and Exploitation link
Lecture 10: Case Study: RL in Classic Games link
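Stanford Reinforcement Learning Course

Also suitable for beginners. I do not know much about the instructor, who is probably also a big name in RL; my own focus is on DRL and MAS. It works well as a supplement to the UCL course.

Note: this is the spring 2018 course. Course homepage: link
Slides:
Introduction to Reinforcement Learning link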
How to act given know how the world works. Tabular setting. Markov processes. Policy search. Policy iteration. Value iteration link
Learning to evaluate a policy when don't know how the world works. link
Model-free learning to make good decisions. Q-learning. SARSA. link
Scaling up: value function approximation. Deep Q Learning. link
Deep reinforcement learning continued. link
Imitation Learning. link
Policy search. link
Policy search. link
Midterm review. link
Fast reinforcement learning (Exploration/Exploitation) Part I. link
Fast reinforcement learning (Exploration/Exploitation) Part II. link
Batch Reinforcement Learning. link
Monte Carlo Tree Search. link
Human in the loop RL with a focus on transfer learning. link
Multi-Agent Reinforcement Learning Tutorial

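Note: while interning on the Alibaba advertising team, I was fortunate to work on a paper with Prof. Wang and Prof. Zhang. In the process I saw how lively and sharp Prof. Wang's thinking is. Prof. Zhang, in my view, is a rising star in CS in China and well worth following!
Course homepage: link
Fundamentals of Reinforcement Learning link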
Fundamentals of Game Theory link
Learning in Repeated Games link
Multi-Agent Reinforcement Learning link link
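Deep RL Courses

National Taiwan University, Hung-yi Lee: (Deep) Reinforcement Learning

Course homepage: [link](http://speech.ee.ntu.edu.tw/~tlkagk/courses/)
The videos are available on Bilibili: link
UCB Deep Reinforcement Learning Course

Strongly recommended; it is packed with top researchers. Because the course is closely connected with OpenAI and Google Brain, its explanations of some algorithms are much easier to follow than the papers. The slides on TRPO and PPO, for example, left me spellbound. Impressive! (I emailed John Schulman, the first author of TRPO and PPO, and it made me deeply appreciate how important the random seed is... sad!)

Course homepage: link
Update: fall 2018
Slides:
Lecture Slides: see the Syllabus for more information.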
Introduction and Course Overview link
Supervised Learning and Imitation link
TensorFlow and Neural Nets Review Session (notebook) link
Reinforcement Learning Introduction link
Policy Gradients Introduction link
Actor-Critic Introduction link
Value Functions and Q-Learning link
Advanced Q-Learning Algorithms link
Advanced Policy Gradients link
Optimal Control and Planning link
Model-Based Reinforcement Learning link
Advanced Model Learning and Images link
Learning Policies by Imitating Other Policies link
Probability and Variational Inference Primer link
Connection between Inference and Control link
Inverse Reinforcement Learning link
Exploration link, link
Transfer Learning and Multi-Task Learning link
Meta-Learning link
Parallelism and RL System Design link
Advanced Imitation Learning and Open Problems link
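CMU Deep Reinforcement Learning Course

Think of it as a supplement to the UCB course; it follows on from the UCL course more smoothly.
Update: fall 2018
Fall 2018 course homepage: link
2017 course homepage: link
Slides:
Introduction link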
Markov decision processes (MDPs), POMDPs link
Solving known MDPs: Dynamic Programming link
Policy iteration, Value iteration, Asynchronous DP link
Monte Carlo Learning, Temporal difference learning, Q learning link
Temporal difference learning (Tom), Planning and learning: Dyna, Monte carlo tree search link
Deep NN Architectures for RL link
Recitation on Monte Carlo Tree Search link
VF approximation, MC, TD with VF approximation, Control with VF approximation link
Deep Q Learning: Double Q learning, replay memory link
Policy Gradients link link
Advanced Policy Gradients link
Evolution Methods, Natural Gradients link
Natural Policy Gradients, TRPO, PPO, ACKTR link
Pathwise Derivatives, DDPG, multigoal RL, HER link
Exploration vs. Exploitation link link
Exploration and RL in Animals link link
Model-based Reinforcement Learning link
Imitation Learning link
Maximum Entropy Inverse RL, Adversarial imitation learning link
Recitation: Trajectory optimization - iterative LQR link
Learning to learn, one shot learning link
Zhengzhou Kewei | 2022-1-5 07:22:16
What do you mean about the importance of the random seed...?
dugk39529 | 2022-1-5 07:22:45
Whether it works or not comes down entirely to luck...
芙蓉花开2017 | 2022-1-5 07:23:11
...Seriously?
凡心落幕 | 2022-1-5 07:23:16
Nice one. I will learn by following the expert, and I am dragging my friends along to get into this too.
猫咪小米轮 | 2022-1-5 07:23:24
That just shows some people have bad luck.
狐媚坑宠教 | 2022-1-5 07:24:23
Could you share the slides from Sutton's course? I cannot download them from the web page.
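Leo962 | 2022-1-5 07:24:41
I strongly suggest adding the new book 《强化学习实战:强化学习在阿里的技术演进和业务创新》 (roughly, "Reinforcement Learning in Practice: The Technical Evolution and Business Innovation of Reinforcement Learning at Alibaba") to the list.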
惊心幻 | 2022-1-5 07:25:40
David's course really leans too heavily toward theory.
欧阳馨丶 | 2022-1-5 07:26:26
Thanks for sharing, this is very useful! By the way, does the UCB course only have slides and no videos?