网易首页
29. Advanced topics in imitation and safety (Finn) - 2
2年前 1371观看
加州大学伯克利分校 2017 深度增强学习课程
大学课程 / 社会学
https://www.youtube.com/playlist?list=PLkFD6_40KJIwTmSbCv9OVJB3YaO4sFwkX CS294-112 Deep Reinforcement Learning Sp17 课程主页:http://rll.berkeley.edu/deeprlcourse/
共57集
7.3万人观看
1
Introduction and course overview (Levine, Finn, Schulman) - 1
26:11
2
Introduction and course overview (Levine, Finn, Schulman) - 2
26:14
3
Introduction and course overview (Levine, Finn, Schulman) - 3
26:08
4
Supervised learning and decision making (Levine) - 1
24:06
5
Supervised learning and decision making (Levine) - 2
24:07
6
Supervised learning and decision making (Levine) - 3
24:03
7
Optimal control and planning (Levine) - 1
21:06
8
Optimal control and planning (Levine) - 2
21:13
9
Optimal control and planning (Levine) - 3
21:03
10
Learning dynamical system models from data (Levine) - 1
27:27
11
Learning dynamical system models from data (Levine) - 2
27:35
12
Learning dynamical system models from data (Levine) - 3
27:22
13
Learning policies by imitating optimal controllers (Levine) - 1
23:05
14
Learning policies by imitating optimal controllers (Levine) - 2
23:08
15
Learning policies by imitating optimal controllers (Levine) - 3
22:58
16
RL definitions, value iteration, policy iteration (Schulman) - 1
17:19
17
RL definitions, value iteration, policy iteration (Schulman) - 2
17:22
18
RL definitions, value iteration, policy iteration (Schulman) - 3
17:18
19
Reinforcement learning with policy gradients (Schulman) - 1
21:48
20
Reinforcement learning with policy gradients (Schulman) - 2
21:54
21
Reinforcement learning with policy gradients (Schulman) - 3
21:42
22
Learning Q-functions: Q-learning, SARSA, and others (Schulman) - 1
25:50
23
Learning Q-functions: Q-learning, SARSA, and others (Schulman) - 2
25:53
24
Learning Q-functions: Q-learning, SARSA, and others (Schulman) - 3
25:42
25
Advanced Q-learning: replay buffers, target networks, double Q-learning (Sc - 1
26:47
26
Advanced Q-learning: replay buffers, target networks, double Q-learning (Sc - 2
26:55
27
Advanced Q-learning: replay buffers, target networks, double Q-learning (Sc - 3
26:41
28
Advanced topics in imitation and safety (Finn) - 1
27:53
29
Advanced topics in imitation and safety (Finn) - 2
27:56
30
Advanced topics in imitation and safety (Finn) - 3
27:47
31
Inverse RL: acquiring objectives from demonstration (Finn) - 1
24:47
32
Inverse RL: acquiring objectives from demonstration (Finn) - 2
24:48
33
Inverse RL: acquiring objectives from demonstration (Finn) - 3
24:47
34
Advanced policy gradients: natural gradient and TRPO (Schulman) - 1
28:05
35
Advanced policy gradients: natural gradient and TRPO (Schulman) - 2
28:08
36
Advanced policy gradients: natural gradient and TRPO (Schulman) - 3
28:02
37
Policy gradient variance reduction and actor-critic algorithms (Schulman) - 1
26:55
38
Policy gradient variance reduction and actor-critic algorithms (Schulman) - 2
27:00
39
Policy gradient variance reduction and actor-critic algorithms (Schulman) - 3
26:51
40
Summary of policy gradients and temporal difference methods (Schulman) - 1
24:06
41
Summary of policy gradients and temporal difference methods (Schulman) - 2
24:10
42
Summary of policy gradients and temporal difference methods (Schulman) - 3
23:59
43
The exploration problem (Schulman) - 1
27:18
44
The exploration problem (Schulman) - 2
27:18
45
The exploration problem (Schulman) - 3
27:17
46
Parallel RL algorithms, open problems and challenges in deep reinforcement - 1
26:14
47
Parallel RL algorithms, open problems and challenges in deep reinforcement - 2
26:22
48
Parallel RL algorithms, open problems and challenges in deep reinforcement - 3
26:11
49
Transfer in Reinforcement Learning (Finn) - 1
28:18
50
Transfer in Reinforcement Learning (Finn) - 2
28:18
51
Transfer in Reinforcement Learning (Finn) - 3
28:16
52
Neural Architecture Search with Reinforcement Learning: Quoc Le and Barret Z - 1
25:24
53
Neural Architecture Search with Reinforcement Learning: Quoc Le and Barret Z - 2
25:29
54
Neural Architecture Search with Reinforcement Learning: Quoc Le and Barret Z - 3
25:17
55
Generalization and Safety in Reinforcement Learning and Control: Aviv Tamar - 1
25:39
56
Generalization and Safety in Reinforcement Learning and Control: Aviv Tamar - 2
25:40
57
Generalization and Safety in Reinforcement Learning and Control: Aviv Tamar - 3
25:33
相关视频
00:55
国务院国资委:“十四五”以来,中央企业累计上缴税费超过10万亿元
轻知识
2天前
1083观看
第71/106集 · 14:33
函数的极值与最大值最小值 - 3
大学课程
2022年11月15日
694观看
09:13
揭开公司面纱-3 - 3
轻知识
2022年11月5日
1098观看
第92/147集 · 16:51
布尔表达式的翻译(1) - 2
大学课程
2022年11月2日
741观看
12:43
_06_01.【2020年新教材】工程法规第01次课程陈老师(高清版)_第01节课(法律体系、法人制度、代理制度) - 3
轻知识
2022年11月2日
1453观看
01:30
太有意思了,这花生米我能玩一天!
轻知识
1年前
1295观看
01:43
爱吃辣的人有福了!
轻知识
1月前
541观看
08:12
复制“中国模式”,亚洲穷国如何崛起?中国与孟加拉的前世今生
轻知识
5月前
876观看
06:27
男人无意中救了一只神马,为了报恩,神马幻化成了一只小驴
轻知识
2023年3月12日
778观看
02:49
把红薯放入绞肉机里,出锅就是一道好吃的美食,做法第一次见,香
轻知识
1年前
698观看
01:32
早餐没时间做饭,就做这懒人面食,像煮面一样简单,比面包还松软
轻知识
2023年5月8日
1045观看
04:00
山东的辣椒炒肉为啥那么好吃?原来鲁菜师傅是这样做,难怪这么香
轻知识
11月前
1100观看
00:33
我们的器官喜欢什么食物?看看你知道几种
轻知识
1年前
815观看
05:23
[上册]24.3 正多边形和圆(2) - 3
2022年10月7日
2978观看
09:51
巴西21集:逛矿业和金属博物馆24.03.24 #邱黎 #我去看世界 #巴西
轻知识
11月前
1051观看
第11/96集 · 06:53
【飞行力学】推力与重力 - 1
大学课程
2022年10月14日
3631观看