48. Parallel RL algorithms, open problems and challenges in deep reinforcement - 3
September 23, 2023, 1507 views
UC Berkeley 2017 Deep Reinforcement Learning course
https://www.youtube.com/playlist?list=PLkFD6_40KJIwTmSbCv9OVJB3YaO4sFwkX CS294-112 Deep Reinforcement Learning Sp17. Course homepage: http://rll.berkeley.edu/deeprlcourse/
57 episodes in total
73,000 views
1
Introduction and course overview (Levine, Finn, Schulman) - 1
26:11
2
Introduction and course overview (Levine, Finn, Schulman) - 2
26:14
3
Introduction and course overview (Levine, Finn, Schulman) - 3
26:08
4
Supervised learning and decision making (Levine) - 1
24:06
5
Supervised learning and decision making (Levine) - 2
24:07
6
Supervised learning and decision making (Levine) - 3
24:03
7
Optimal control and planning (Levine) - 1
21:06
8
Optimal control and planning (Levine) - 2
21:13
9
Optimal control and planning (Levine) - 3
21:03
10
Learning dynamical system models from data (Levine) - 1
27:27
11
Learning dynamical system models from data (Levine) - 2
27:35
12
Learning dynamical system models from data (Levine) - 3
27:22
13
Learning policies by imitating optimal controllers (Levine) - 1
23:05
14
Learning policies by imitating optimal controllers (Levine) - 2
23:08
15
Learning policies by imitating optimal controllers (Levine) - 3
22:58
16
RL definitions, value iteration, policy iteration (Schulman) - 1
17:19
17
RL definitions, value iteration, policy iteration (Schulman) - 2
17:22
18
RL definitions, value iteration, policy iteration (Schulman) - 3
17:18
19
Reinforcement learning with policy gradients (Schulman) - 1
21:48
20
Reinforcement learning with policy gradients (Schulman) - 2
21:54
21
Reinforcement learning with policy gradients (Schulman) - 3
21:42
22
Learning Q-functions: Q-learning, SARSA, and others (Schulman) - 1
25:50
23
Learning Q-functions: Q-learning, SARSA, and others (Schulman) - 2
25:53
24
Learning Q-functions: Q-learning, SARSA, and others (Schulman) - 3
25:42
25
Advanced Q-learning: replay buffers, target networks, double Q-learning (Sc - 1
26:47
26
Advanced Q-learning: replay buffers, target networks, double Q-learning (Sc - 2
26:55
27
Advanced Q-learning: replay buffers, target networks, double Q-learning (Sc - 3
26:41
28
Advanced topics in imitation and safety (Finn) - 1
27:53
29
Advanced topics in imitation and safety (Finn) - 2
27:56
30
Advanced topics in imitation and safety (Finn) - 3
27:47
31
Inverse RL: acquiring objectives from demonstration (Finn) - 1
24:47
32
Inverse RL: acquiring objectives from demonstration (Finn) - 2
24:48
33
Inverse RL: acquiring objectives from demonstration (Finn) - 3
24:47
34
Advanced policy gradients: natural gradient and TRPO (Schulman) - 1
28:05
35
Advanced policy gradients: natural gradient and TRPO (Schulman) - 2
28:08
36
Advanced policy gradients: natural gradient and TRPO (Schulman) - 3
28:02
37
Policy gradient variance reduction and actor-critic algorithms (Schulman) - 1
26:55
38
Policy gradient variance reduction and actor-critic algorithms (Schulman) - 2
27:00
39
Policy gradient variance reduction and actor-critic algorithms (Schulman) - 3
26:51
40
Summary of policy gradients and temporal difference methods (Schulman) - 1
24:06
41
Summary of policy gradients and temporal difference methods (Schulman) - 2
24:10
42
Summary of policy gradients and temporal difference methods (Schulman) - 3
23:59
43
The exploration problem (Schulman) - 1
27:18
44
The exploration problem (Schulman) - 2
27:18
45
The exploration problem (Schulman) - 3
27:17
46
Parallel RL algorithms, open problems and challenges in deep reinforcement - 1
26:14
47
Parallel RL algorithms, open problems and challenges in deep reinforcement - 2
26:22
48
Parallel RL algorithms, open problems and challenges in deep reinforcement - 3
26:11
49
Transfer in Reinforcement Learning (Finn) - 1
28:18
50
Transfer in Reinforcement Learning (Finn) - 2
28:18
51
Transfer in Reinforcement Learning (Finn) - 3
28:16
52
Neural Architecture Search with Reinforcement Learning: Quoc Le and Barret Z - 1
25:24
53
Neural Architecture Search with Reinforcement Learning: Quoc Le and Barret Z - 2
25:29
54
Neural Architecture Search with Reinforcement Learning: Quoc Le and Barret Z - 3
25:17
55
Generalization and Safety in Reinforcement Learning and Control: Aviv Tamar - 1
25:39
56
Generalization and Safety in Reinforcement Learning and Control: Aviv Tamar - 2
25:40
57
Generalization and Safety in Reinforcement Learning and Control: Aviv Tamar - 3
25:33