5. Supervised learning and decision making (Levine) - 2
September 23, 2023 · 1,242 views
UC Berkeley 2017 Deep Reinforcement Learning course
https://www.youtube.com/playlist?list=PLkFD6_40KJIwTmSbCv9OVJB3YaO4sFwkX CS294-112 Deep Reinforcement Learning Sp17
Course homepage: http://rll.berkeley.edu/deeprlcourse/
57 episodes in total
73,000 views
1. Introduction and course overview (Levine, Finn, Schulman) - 1 · 26:11
2. Introduction and course overview (Levine, Finn, Schulman) - 2 · 26:14
3. Introduction and course overview (Levine, Finn, Schulman) - 3 · 26:08
4. Supervised learning and decision making (Levine) - 1 · 24:06
5. Supervised learning and decision making (Levine) - 2 · 24:07
6. Supervised learning and decision making (Levine) - 3 · 24:03
7. Optimal control and planning (Levine) - 1 · 21:06
8. Optimal control and planning (Levine) - 2 · 21:13
9. Optimal control and planning (Levine) - 3 · 21:03
10. Learning dynamical system models from data (Levine) - 1 · 27:27
11. Learning dynamical system models from data (Levine) - 2 · 27:35
12. Learning dynamical system models from data (Levine) - 3 · 27:22
13. Learning policies by imitating optimal controllers (Levine) - 1 · 23:05
14. Learning policies by imitating optimal controllers (Levine) - 2 · 23:08
15. Learning policies by imitating optimal controllers (Levine) - 3 · 22:58
16. RL definitions, value iteration, policy iteration (Schulman) - 1 · 17:19
17. RL definitions, value iteration, policy iteration (Schulman) - 2 · 17:22
18. RL definitions, value iteration, policy iteration (Schulman) - 3 · 17:18
19. Reinforcement learning with policy gradients (Schulman) - 1 · 21:48
20. Reinforcement learning with policy gradients (Schulman) - 2 · 21:54
21. Reinforcement learning with policy gradients (Schulman) - 3 · 21:42
22. Learning Q-functions: Q-learning, SARSA, and others (Schulman) - 1 · 25:50
23. Learning Q-functions: Q-learning, SARSA, and others (Schulman) - 2 · 25:53
24. Learning Q-functions: Q-learning, SARSA, and others (Schulman) - 3 · 25:42
25. Advanced Q-learning: replay buffers, target networks, double Q-learning (Sc - 1 · 26:47
26. Advanced Q-learning: replay buffers, target networks, double Q-learning (Sc - 2 · 26:55
27. Advanced Q-learning: replay buffers, target networks, double Q-learning (Sc - 3 · 26:41
28. Advanced topics in imitation and safety (Finn) - 1 · 27:53
29. Advanced topics in imitation and safety (Finn) - 2 · 27:56
30. Advanced topics in imitation and safety (Finn) - 3 · 27:47
31. Inverse RL: acquiring objectives from demonstration (Finn) - 1 · 24:47
32. Inverse RL: acquiring objectives from demonstration (Finn) - 2 · 24:48
33. Inverse RL: acquiring objectives from demonstration (Finn) - 3 · 24:47
34. Advanced policy gradients: natural gradient and TRPO (Schulman) - 1 · 28:05
35. Advanced policy gradients: natural gradient and TRPO (Schulman) - 2 · 28:08
36. Advanced policy gradients: natural gradient and TRPO (Schulman) - 3 · 28:02
37. Policy gradient variance reduction and actor-critic algorithms (Schulman) - 1 · 26:55
38. Policy gradient variance reduction and actor-critic algorithms (Schulman) - 2 · 27:00
39. Policy gradient variance reduction and actor-critic algorithms (Schulman) - 3 · 26:51
40. Summary of policy gradients and temporal difference methods (Schulman) - 1 · 24:06
41. Summary of policy gradients and temporal difference methods (Schulman) - 2 · 24:10
42. Summary of policy gradients and temporal difference methods (Schulman) - 3 · 23:59
43. The exploration problem (Schulman) - 1 · 27:18
44. The exploration problem (Schulman) - 2 · 27:18
45. The exploration problem (Schulman) - 3 · 27:17
46. Parallel RL algorithms, open problems and challenges in deep reinforcement - 1 · 26:14
47. Parallel RL algorithms, open problems and challenges in deep reinforcement - 2 · 26:22
48. Parallel RL algorithms, open problems and challenges in deep reinforcement - 3 · 26:11
49. Transfer in Reinforcement Learning (Finn) - 1 · 28:18
50. Transfer in Reinforcement Learning (Finn) - 2 · 28:18
51. Transfer in Reinforcement Learning (Finn) - 3 · 28:16
52. Neural Architecture Search with Reinforcement Learning: Quoc Le and Barret Z - 1 · 25:24
53. Neural Architecture Search with Reinforcement Learning: Quoc Le and Barret Z - 2 · 25:29
54. Neural Architecture Search with Reinforcement Learning: Quoc Le and Barret Z - 3 · 25:17
55. Generalization and Safety in Reinforcement Learning and Control: Aviv Tamar - 1 · 25:39
56. Generalization and Safety in Reinforcement Learning and Control: Aviv Tamar - 2 · 25:40
57. Generalization and Safety in Reinforcement Learning and Control: Aviv Tamar - 3 · 25:33