UC Berkeley 2017 Deep Reinforcement Learning Course
Playlist: https://www.youtube.com/playlist?list=PLkFD6_40KJIwTmSbCv9OVJB3YaO4sFwkX
CS294-112 Deep Reinforcement Learning Sp17
Course homepage: http://rll.berkeley.edu/deeprlcourse/
57 episodes, 73,000 views
1. Introduction and course overview (Levine, Finn, Schulman) - 1 (26:11)
2. Introduction and course overview (Levine, Finn, Schulman) - 2 (26:14)
3. Introduction and course overview (Levine, Finn, Schulman) - 3 (26:08)
4. Supervised learning and decision making (Levine) - 1 (24:06)
5. Supervised learning and decision making (Levine) - 2 (24:07)
6. Supervised learning and decision making (Levine) - 3 (24:03)
7. Optimal control and planning (Levine) - 1 (21:06)
8. Optimal control and planning (Levine) - 2 (21:13)
9. Optimal control and planning (Levine) - 3 (21:03)
10. Learning dynamical system models from data (Levine) - 1 (27:27)
11. Learning dynamical system models from data (Levine) - 2 (27:35)
12. Learning dynamical system models from data (Levine) - 3 (27:22)
13. Learning policies by imitating optimal controllers (Levine) - 1 (23:05)
14. Learning policies by imitating optimal controllers (Levine) - 2 (23:08)
15. Learning policies by imitating optimal controllers (Levine) - 3 (22:58)
16. RL definitions, value iteration, policy iteration (Schulman) - 1 (17:19)
17. RL definitions, value iteration, policy iteration (Schulman) - 2 (17:22)
18. RL definitions, value iteration, policy iteration (Schulman) - 3 (17:18)
19. Reinforcement learning with policy gradients (Schulman) - 1 (21:48)
20. Reinforcement learning with policy gradients (Schulman) - 2 (21:54)
21. Reinforcement learning with policy gradients (Schulman) - 3 (21:42)
22. Learning Q-functions: Q-learning, SARSA, and others (Schulman) - 1 (25:50)
23. Learning Q-functions: Q-learning, SARSA, and others (Schulman) - 2 (25:53)
24. Learning Q-functions: Q-learning, SARSA, and others (Schulman) - 3 (25:42)
25. Advanced Q-learning: replay buffers, target networks, double Q-learning (Schulman) - 1 (26:47)
26. Advanced Q-learning: replay buffers, target networks, double Q-learning (Schulman) - 2 (26:55)
27. Advanced Q-learning: replay buffers, target networks, double Q-learning (Schulman) - 3 (26:41)
28. Advanced topics in imitation and safety (Finn) - 1 (27:53)
29. Advanced topics in imitation and safety (Finn) - 2 (27:56)
30. Advanced topics in imitation and safety (Finn) - 3 (27:47)
31. Inverse RL: acquiring objectives from demonstration (Finn) - 1 (24:47)
32. Inverse RL: acquiring objectives from demonstration (Finn) - 2 (24:48)
33. Inverse RL: acquiring objectives from demonstration (Finn) - 3 (24:47)
34. Advanced policy gradients: natural gradient and TRPO (Schulman) - 1 (28:05)
35. Advanced policy gradients: natural gradient and TRPO (Schulman) - 2 (28:08)
36. Advanced policy gradients: natural gradient and TRPO (Schulman) - 3 (28:02)
37. Policy gradient variance reduction and actor-critic algorithms (Schulman) - 1 (26:55)
38. Policy gradient variance reduction and actor-critic algorithms (Schulman) - 2 (27:00)
39. Policy gradient variance reduction and actor-critic algorithms (Schulman) - 3 (26:51)
40. Summary of policy gradients and temporal difference methods (Schulman) - 1 (24:06)
41. Summary of policy gradients and temporal difference methods (Schulman) - 2 (24:10)
42. Summary of policy gradients and temporal difference methods (Schulman) - 3 (23:59)
43. The exploration problem (Schulman) - 1 (27:18)
44. The exploration problem (Schulman) - 2 (27:18)
45. The exploration problem (Schulman) - 3 (27:17)
46. Parallel RL algorithms, open problems and challenges in deep reinforcement learning - 1 (26:14)
47. Parallel RL algorithms, open problems and challenges in deep reinforcement learning - 2 (26:22)
48. Parallel RL algorithms, open problems and challenges in deep reinforcement learning - 3 (26:11)
49. Transfer in Reinforcement Learning (Finn) - 1 (28:18)
50. Transfer in Reinforcement Learning (Finn) - 2 (28:18)
51. Transfer in Reinforcement Learning (Finn) - 3 (28:16)
52. Neural Architecture Search with Reinforcement Learning: Quoc Le and Barret Zoph - 1 (25:24)
53. Neural Architecture Search with Reinforcement Learning: Quoc Le and Barret Zoph - 2 (25:29)
54. Neural Architecture Search with Reinforcement Learning: Quoc Le and Barret Zoph - 3 (25:17)
55. Generalization and Safety in Reinforcement Learning and Control: Aviv Tamar - 1 (25:39)
56. Generalization and Safety in Reinforcement Learning and Control: Aviv Tamar - 2 (25:40)
57. Generalization and Safety in Reinforcement Learning and Control: Aviv Tamar - 3 (25:33)