51. Transfer in Reinforcement Learning (Finn) - 3
September 23, 2023 · 984 views
UC Berkeley 2017 Deep Reinforcement Learning course
https://www.youtube.com/playlist?list=PLkFD6_40KJIwTmSbCv9OVJB3YaO4sFwkX
CS294-112 Deep Reinforcement Learning Sp17
Course homepage: http://rll.berkeley.edu/deeprlcourse/
57 episodes in total
73,000 viewers
1. Introduction and course overview (Levine, Finn, Schulman) - 1 · 26:11
2. Introduction and course overview (Levine, Finn, Schulman) - 2 · 26:14
3. Introduction and course overview (Levine, Finn, Schulman) - 3 · 26:08
4. Supervised learning and decision making (Levine) - 1 · 24:06
5. Supervised learning and decision making (Levine) - 2 · 24:07
6. Supervised learning and decision making (Levine) - 3 · 24:03
7. Optimal control and planning (Levine) - 1 · 21:06
8. Optimal control and planning (Levine) - 2 · 21:13
9. Optimal control and planning (Levine) - 3 · 21:03
10. Learning dynamical system models from data (Levine) - 1 · 27:27
11. Learning dynamical system models from data (Levine) - 2 · 27:35
12. Learning dynamical system models from data (Levine) - 3 · 27:22
13. Learning policies by imitating optimal controllers (Levine) - 1 · 23:05
14. Learning policies by imitating optimal controllers (Levine) - 2 · 23:08
15. Learning policies by imitating optimal controllers (Levine) - 3 · 22:58
16. RL definitions, value iteration, policy iteration (Schulman) - 1 · 17:19
17. RL definitions, value iteration, policy iteration (Schulman) - 2 · 17:22
18. RL definitions, value iteration, policy iteration (Schulman) - 3 · 17:18
19. Reinforcement learning with policy gradients (Schulman) - 1 · 21:48
20. Reinforcement learning with policy gradients (Schulman) - 2 · 21:54
21. Reinforcement learning with policy gradients (Schulman) - 3 · 21:42
22. Learning Q-functions: Q-learning, SARSA, and others (Schulman) - 1 · 25:50
23. Learning Q-functions: Q-learning, SARSA, and others (Schulman) - 2 · 25:53
24. Learning Q-functions: Q-learning, SARSA, and others (Schulman) - 3 · 25:42
25. Advanced Q-learning: replay buffers, target networks, double Q-learning (Sc - 1 · 26:47
26. Advanced Q-learning: replay buffers, target networks, double Q-learning (Sc - 2 · 26:55
27. Advanced Q-learning: replay buffers, target networks, double Q-learning (Sc - 3 · 26:41
28. Advanced topics in imitation and safety (Finn) - 1 · 27:53
29. Advanced topics in imitation and safety (Finn) - 2 · 27:56
30. Advanced topics in imitation and safety (Finn) - 3 · 27:47
31. Inverse RL: acquiring objectives from demonstration (Finn) - 1 · 24:47
32. Inverse RL: acquiring objectives from demonstration (Finn) - 2 · 24:48
33. Inverse RL: acquiring objectives from demonstration (Finn) - 3 · 24:47
34. Advanced policy gradients: natural gradient and TRPO (Schulman) - 1 · 28:05
35. Advanced policy gradients: natural gradient and TRPO (Schulman) - 2 · 28:08
36. Advanced policy gradients: natural gradient and TRPO (Schulman) - 3 · 28:02
37. Policy gradient variance reduction and actor-critic algorithms (Schulman) - 1 · 26:55
38. Policy gradient variance reduction and actor-critic algorithms (Schulman) - 2 · 27:00
39. Policy gradient variance reduction and actor-critic algorithms (Schulman) - 3 · 26:51
40. Summary of policy gradients and temporal difference methods (Schulman) - 1 · 24:06
41. Summary of policy gradients and temporal difference methods (Schulman) - 2 · 24:10
42. Summary of policy gradients and temporal difference methods (Schulman) - 3 · 23:59
43. The exploration problem (Schulman) - 1 · 27:18
44. The exploration problem (Schulman) - 2 · 27:18
45. The exploration problem (Schulman) - 3 · 27:17
46. Parallel RL algorithms, open problems and challenges in deep reinforcement - 1 · 26:14
47. Parallel RL algorithms, open problems and challenges in deep reinforcement - 2 · 26:22
48. Parallel RL algorithms, open problems and challenges in deep reinforcement - 3 · 26:11
49. Transfer in Reinforcement Learning (Finn) - 1 · 28:18
50. Transfer in Reinforcement Learning (Finn) - 2 · 28:18
51. Transfer in Reinforcement Learning (Finn) - 3 · 28:16
52. Neural Architecture Search with Reinforcement Learning: Quoc Le and Barret Z - 1 · 25:24
53. Neural Architecture Search with Reinforcement Learning: Quoc Le and Barret Z - 2 · 25:29
54. Neural Architecture Search with Reinforcement Learning: Quoc Le and Barret Z - 3 · 25:17
55. Generalization and Safety in Reinforcement Learning and Control: Aviv Tamar - 1 · 25:39
56. Generalization and Safety in Reinforcement Learning and Control: Aviv Tamar - 2 · 25:40
57. Generalization and Safety in Reinforcement Learning and Control: Aviv Tamar - 3 · 25:33