56. Generalization and Safety in Reinforcement Learning and Control: Aviv Tamar - 2
2 years ago, 544 views
UC Berkeley 2017 Deep Reinforcement Learning course
https://www.youtube.com/playlist?list=PLkFD6_40KJIwTmSbCv9OVJB3YaO4sFwkX CS294-112 Deep Reinforcement Learning Sp17. Course homepage: http://rll.berkeley.edu/deeprlcourse/
57 episodes in total
73,000 viewers
1
Introduction and course overview (Levine, Finn, Schulman) - 1
26:11
2
Introduction and course overview (Levine, Finn, Schulman) - 2
26:14
3
Introduction and course overview (Levine, Finn, Schulman) - 3
26:08
4
Supervised learning and decision making (Levine) - 1
24:06
5
Supervised learning and decision making (Levine) - 2
24:07
6
Supervised learning and decision making (Levine) - 3
24:03
7
Optimal control and planning (Levine) - 1
21:06
8
Optimal control and planning (Levine) - 2
21:13
9
Optimal control and planning (Levine) - 3
21:03
10
Learning dynamical system models from data (Levine) - 1
27:27
11
Learning dynamical system models from data (Levine) - 2
27:35
12
Learning dynamical system models from data (Levine) - 3
27:22
13
Learning policies by imitating optimal controllers (Levine) - 1
23:05
14
Learning policies by imitating optimal controllers (Levine) - 2
23:08
15
Learning policies by imitating optimal controllers (Levine) - 3
22:58
16
RL definitions, value iteration, policy iteration (Schulman) - 1
17:19
17
RL definitions, value iteration, policy iteration (Schulman) - 2
17:22
18
RL definitions, value iteration, policy iteration (Schulman) - 3
17:18
19
Reinforcement learning with policy gradients (Schulman) - 1
21:48
20
Reinforcement learning with policy gradients (Schulman) - 2
21:54
21
Reinforcement learning with policy gradients (Schulman) - 3
21:42
22
Learning Q-functions: Q-learning, SARSA, and others (Schulman) - 1
25:50
23
Learning Q-functions: Q-learning, SARSA, and others (Schulman) - 2
25:53
24
Learning Q-functions: Q-learning, SARSA, and others (Schulman) - 3
25:42
25
Advanced Q-learning: replay buffers, target networks, double Q-learning (Sc - 1
26:47
26
Advanced Q-learning: replay buffers, target networks, double Q-learning (Sc - 2
26:55
27
Advanced Q-learning: replay buffers, target networks, double Q-learning (Sc - 3
26:41
28
Advanced topics in imitation and safety (Finn) - 1
27:53
29
Advanced topics in imitation and safety (Finn) - 2
27:56
30
Advanced topics in imitation and safety (Finn) - 3
27:47
31
Inverse RL: acquiring objectives from demonstration (Finn) - 1
24:47
32
Inverse RL: acquiring objectives from demonstration (Finn) - 2
24:48
33
Inverse RL: acquiring objectives from demonstration (Finn) - 3
24:47
34
Advanced policy gradients: natural gradient and TRPO (Schulman) - 1
28:05
35
Advanced policy gradients: natural gradient and TRPO (Schulman) - 2
28:08
36
Advanced policy gradients: natural gradient and TRPO (Schulman) - 3
28:02
37
Policy gradient variance reduction and actor-critic algorithms (Schulman) - 1
26:55
38
Policy gradient variance reduction and actor-critic algorithms (Schulman) - 2
27:00
39
Policy gradient variance reduction and actor-critic algorithms (Schulman) - 3
26:51
40
Summary of policy gradients and temporal difference methods (Schulman) - 1
24:06
41
Summary of policy gradients and temporal difference methods (Schulman) - 2
24:10
42
Summary of policy gradients and temporal difference methods (Schulman) - 3
23:59
43
The exploration problem (Schulman) - 1
27:18
44
The exploration problem (Schulman) - 2
27:18
45
The exploration problem (Schulman) - 3
27:17
46
Parallel RL algorithms, open problems and challenges in deep reinforcement - 1
26:14
47
Parallel RL algorithms, open problems and challenges in deep reinforcement - 2
26:22
48
Parallel RL algorithms, open problems and challenges in deep reinforcement - 3
26:11
49
Transfer in Reinforcement Learning (Finn) - 1
28:18
50
Transfer in Reinforcement Learning (Finn) - 2
28:18
51
Transfer in Reinforcement Learning (Finn) - 3
28:16
52
Neural Architecture Search with Reinforcement Learning: Quoc Le and Barret Z - 1
25:24
53
Neural Architecture Search with Reinforcement Learning: Quoc Le and Barret Z - 2
25:29
54
Neural Architecture Search with Reinforcement Learning: Quoc Le and Barret Z - 3
25:17
55
Generalization and Safety in Reinforcement Learning and Control: Aviv Tamar - 1
25:39
56
Generalization and Safety in Reinforcement Learning and Control: Aviv Tamar - 2
25:40
57
Generalization and Safety in Reinforcement Learning and Control: Aviv Tamar - 3
25:33