网易首页
4. Supervised learning and decision making (Levine) - 1
2023年9月23日 1873观看
加州大学伯克利分校 2017 深度增强学习课程
大学课程 / 社会学
https://www.youtube.com/playlist?list=PLkFD6_40KJIwTmSbCv9OVJB3YaO4sFwkX CS294-112 Deep Reinforcement Learning Sp17 课程主页:http://rll.berkeley.edu/deeprlcourse/
共57集
7.3万人观看
1
Introduction and course overview (Levine, Finn, Schulman) - 1
26:11
2
Introduction and course overview (Levine, Finn, Schulman) - 2
26:14
3
Introduction and course overview (Levine, Finn, Schulman) - 3
26:08
4
Supervised learning and decision making (Levine) - 1
24:06
5
Supervised learning and decision making (Levine) - 2
24:07
6
Supervised learning and decision making (Levine) - 3
24:03
7
Optimal control and planning (Levine) - 1
21:06
8
Optimal control and planning (Levine) - 2
21:13
9
Optimal control and planning (Levine) - 3
21:03
10
Learning dynamical system models from data (Levine) - 1
27:27
11
Learning dynamical system models from data (Levine) - 2
27:35
12
Learning dynamical system models from data (Levine) - 3
27:22
13
Learning policies by imitating optimal controllers (Levine) - 1
23:05
14
Learning policies by imitating optimal controllers (Levine) - 2
23:08
15
Learning policies by imitating optimal controllers (Levine) - 3
22:58
16
RL definitions, value iteration, policy iteration (Schulman) - 1
17:19
17
RL definitions, value iteration, policy iteration (Schulman) - 2
17:22
18
RL definitions, value iteration, policy iteration (Schulman) - 3
17:18
19
Reinforcement learning with policy gradients (Schulman) - 1
21:48
20
Reinforcement learning with policy gradients (Schulman) - 2
21:54
21
Reinforcement learning with policy gradients (Schulman) - 3
21:42
22
Learning Q-functions: Q-learning, SARSA, and others (Schulman) - 1
25:50
23
Learning Q-functions: Q-learning, SARSA, and others (Schulman) - 2
25:53
24
Learning Q-functions: Q-learning, SARSA, and others (Schulman) - 3
25:42
25
Advanced Q-learning: replay buffers, target networks, double Q-learning (Sc - 1
26:47
26
Advanced Q-learning: replay buffers, target networks, double Q-learning (Sc - 2
26:55
27
Advanced Q-learning: replay buffers, target networks, double Q-learning (Sc - 3
26:41
28
Advanced topics in imitation and safety (Finn) - 1
27:53
29
Advanced topics in imitation and safety (Finn) - 2
27:56
30
Advanced topics in imitation and safety (Finn) - 3
27:47
31
Inverse RL: acquiring objectives from demonstration (Finn) - 1
24:47
32
Inverse RL: acquiring objectives from demonstration (Finn) - 2
24:48
33
Inverse RL: acquiring objectives from demonstration (Finn) - 3
24:47
34
Advanced policy gradients: natural gradient and TRPO (Schulman) - 1
28:05
35
Advanced policy gradients: natural gradient and TRPO (Schulman) - 2
28:08
36
Advanced policy gradients: natural gradient and TRPO (Schulman) - 3
28:02
37
Policy gradient variance reduction and actor-critic algorithms (Schulman) - 1
26:55
38
Policy gradient variance reduction and actor-critic algorithms (Schulman) - 2
27:00
39
Policy gradient variance reduction and actor-critic algorithms (Schulman) - 3
26:51
40
Summary of policy gradients and temporal difference methods (Schulman) - 1
24:06
41
Summary of policy gradients and temporal difference methods (Schulman) - 2
24:10
42
Summary of policy gradients and temporal difference methods (Schulman) - 3
23:59
43
The exploration problem (Schulman) - 1
27:18
44
The exploration problem (Schulman) - 2
27:18
45
The exploration problem (Schulman) - 3
27:17
46
Parallel RL algorithms, open problems and challenges in deep reinforcement - 1
26:14
47
Parallel RL algorithms, open problems and challenges in deep reinforcement - 2
26:22
48
Parallel RL algorithms, open problems and challenges in deep reinforcement - 3
26:11
49
Transfer in Reinforcement Learning (Finn) - 1
28:18
50
Transfer in Reinforcement Learning (Finn) - 2
28:18
51
Transfer in Reinforcement Learning (Finn) - 3
28:16
52
Neural Architecture Search with Reinforcement Learning: Quoc Le and Barret Z - 1
25:24
53
Neural Architecture Search with Reinforcement Learning: Quoc Le and Barret Z - 2
25:29
54
Neural Architecture Search with Reinforcement Learning: Quoc Le and Barret Z - 3
25:17
55
Generalization and Safety in Reinforcement Learning and Control: Aviv Tamar - 1
25:39
56
Generalization and Safety in Reinforcement Learning and Control: Aviv Tamar - 2
25:40
57
Generalization and Safety in Reinforcement Learning and Control: Aviv Tamar - 3
25:33
相关视频
06:51
如何让观众沉浸在你的演讲 - 3
演讲
2022年9月29日
3990观看
00:37
曾美慧孜回忆童年演讲经历
轻知识
11月前
4515观看
01:58
#董丽娜寄语诵读大会 “希望#中华经典诵读大会
轻知识
2023年7月25日
1538观看
01:11
这段即兴演讲的背后,可能是孩子读过的书,听过的话,经过的事
轻知识
1年前
8335观看
03:09
联合国眼中的中国有多强大?中国在联合国的作用,愈加显著!
轻知识
1年前
2186观看
23:46
悲催岳父四个女婿来自四个不同国家,险些组成“联合国”! #搞笑
轻知识
4月前
1467观看
01:19
世纪只有这三个新国家实现独立,被联合国接纳为新成员
轻知识
7月前
1295观看
07:44
联合国大会看起来高大上,实际上相当混乱,不仅动嘴还动手
轻知识
1年前
3281观看
01:14
走进联合国:世界在这里争论,也在这里前进
轻知识
1月前
1082观看
07:48
如何重返联合国?两岸及70国博弈下的2758号决议
轻知识
8月前
1030观看
01:20
根据联合国的评选,中国最值得一游15个村庄
轻知识
11月前
809观看
21:44
韩寒采访集(韩寒演讲) - 2
2022年10月31日
4943观看
07:52
年真实影像,中国恢复联合国席位,乔的笑震碎联合国大厦
轻知识
1年前
2089观看
05:10
年伍修权联合国发言,美台代表被怼得哑口无言,震撼全场
轻知识
1年前
1178观看
02:21
联合国都管不了!铁了心要干掉对方的印巴,到底有哪些历史恩怨?
轻知识
6月前
1096观看
01:59
联合国称加沙地带仅剩下11%的区域供巴民众生存
轻知识
1年前
1306观看