52. Neural Architecture Search with Reinforcement Learning: Quoc Le and Barret Z - 1
September 23, 2023 · 1,011 views
UC Berkeley 2017 Deep Reinforcement Learning course
University Courses / Sociology
https://www.youtube.com/playlist?list=PLkFD6_40KJIwTmSbCv9OVJB3YaO4sFwkX CS294-112 Deep Reinforcement Learning Sp17. Course homepage: http://rll.berkeley.edu/deeprlcourse/
57 episodes in total
73,000 viewers
1. Introduction and course overview (Levine, Finn, Schulman) - 1 · 26:11
2. Introduction and course overview (Levine, Finn, Schulman) - 2 · 26:14
3. Introduction and course overview (Levine, Finn, Schulman) - 3 · 26:08
4. Supervised learning and decision making (Levine) - 1 · 24:06
5. Supervised learning and decision making (Levine) - 2 · 24:07
6. Supervised learning and decision making (Levine) - 3 · 24:03
7. Optimal control and planning (Levine) - 1 · 21:06
8. Optimal control and planning (Levine) - 2 · 21:13
9. Optimal control and planning (Levine) - 3 · 21:03
10. Learning dynamical system models from data (Levine) - 1 · 27:27
11. Learning dynamical system models from data (Levine) - 2 · 27:35
12. Learning dynamical system models from data (Levine) - 3 · 27:22
13. Learning policies by imitating optimal controllers (Levine) - 1 · 23:05
14. Learning policies by imitating optimal controllers (Levine) - 2 · 23:08
15. Learning policies by imitating optimal controllers (Levine) - 3 · 22:58
16. RL definitions, value iteration, policy iteration (Schulman) - 1 · 17:19
17. RL definitions, value iteration, policy iteration (Schulman) - 2 · 17:22
18. RL definitions, value iteration, policy iteration (Schulman) - 3 · 17:18
19. Reinforcement learning with policy gradients (Schulman) - 1 · 21:48
20. Reinforcement learning with policy gradients (Schulman) - 2 · 21:54
21. Reinforcement learning with policy gradients (Schulman) - 3 · 21:42
22. Learning Q-functions: Q-learning, SARSA, and others (Schulman) - 1 · 25:50
23. Learning Q-functions: Q-learning, SARSA, and others (Schulman) - 2 · 25:53
24. Learning Q-functions: Q-learning, SARSA, and others (Schulman) - 3 · 25:42
25. Advanced Q-learning: replay buffers, target networks, double Q-learning (Sc - 1 · 26:47
26. Advanced Q-learning: replay buffers, target networks, double Q-learning (Sc - 2 · 26:55
27. Advanced Q-learning: replay buffers, target networks, double Q-learning (Sc - 3 · 26:41
28. Advanced topics in imitation and safety (Finn) - 1 · 27:53
29. Advanced topics in imitation and safety (Finn) - 2 · 27:56
30. Advanced topics in imitation and safety (Finn) - 3 · 27:47
31. Inverse RL: acquiring objectives from demonstration (Finn) - 1 · 24:47
32. Inverse RL: acquiring objectives from demonstration (Finn) - 2 · 24:48
33. Inverse RL: acquiring objectives from demonstration (Finn) - 3 · 24:47
34. Advanced policy gradients: natural gradient and TRPO (Schulman) - 1 · 28:05
35. Advanced policy gradients: natural gradient and TRPO (Schulman) - 2 · 28:08
36. Advanced policy gradients: natural gradient and TRPO (Schulman) - 3 · 28:02
37. Policy gradient variance reduction and actor-critic algorithms (Schulman) - 1 · 26:55
38. Policy gradient variance reduction and actor-critic algorithms (Schulman) - 2 · 27:00
39. Policy gradient variance reduction and actor-critic algorithms (Schulman) - 3 · 26:51
40. Summary of policy gradients and temporal difference methods (Schulman) - 1 · 24:06
41. Summary of policy gradients and temporal difference methods (Schulman) - 2 · 24:10
42. Summary of policy gradients and temporal difference methods (Schulman) - 3 · 23:59
43. The exploration problem (Schulman) - 1 · 27:18
44. The exploration problem (Schulman) - 2 · 27:18
45. The exploration problem (Schulman) - 3 · 27:17
46. Parallel RL algorithms, open problems and challenges in deep reinforcement - 1 · 26:14
47. Parallel RL algorithms, open problems and challenges in deep reinforcement - 2 · 26:22
48. Parallel RL algorithms, open problems and challenges in deep reinforcement - 3 · 26:11
49. Transfer in Reinforcement Learning (Finn) - 1 · 28:18
50. Transfer in Reinforcement Learning (Finn) - 2 · 28:18
51. Transfer in Reinforcement Learning (Finn) - 3 · 28:16
52. Neural Architecture Search with Reinforcement Learning: Quoc Le and Barret Z - 1 · 25:24
53. Neural Architecture Search with Reinforcement Learning: Quoc Le and Barret Z - 2 · 25:29
54. Neural Architecture Search with Reinforcement Learning: Quoc Le and Barret Z - 3 · 25:17
55. Generalization and Safety in Reinforcement Learning and Control: Aviv Tamar - 1 · 25:39
56. Generalization and Safety in Reinforcement Learning and Control: Aviv Tamar - 2 · 25:40
57. Generalization and Safety in Reinforcement Learning and Control: Aviv Tamar - 3 · 25:33