Major Nature paper: predictive coding of future reward in the hippocampus



Basic information

Title: Predictive coding of reward in the hippocampus

Publication date: January 14, 2025

Journal: Nature

Impact factor: 48.5


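Research background

In the classic narrative of cognitive neuroscience, the hippocampus is often called the brain's "GPS". Since O'Keefe and Nadel proposed the cognitive map theory, the hippocampus has been thought to serve mainly to build a spatial representation of the environment, helping us find our way through complex mazes.

Survival, however, is not only about "where am I". What matters more is "where is the reward" and "when will it arrive". From an evolutionary perspective, animals must efficiently learn and remember experiences associated with reward. Earlier studies did find that hippocampal neurons are highly sensitive to reward: as an animal approaches or reaches a reward location, the firing rate of specific place cells rises markedly, and reward locations can even become over-represented.

These classic observations, however, mostly came from short-term cross-sectional studies. A key open question remains: is this reward representation static? Once an animal has performed a task day after day and knows the regularities of its environment by heart, does the hippocampal encoding of reward change?

If the hippocampus is not just a passive map but a model that actively predicts the future, then as learning deepens its activity should no longer be confined to passive responses to the current reward and should instead shift toward active prediction of future reward. To test this hypothesis, a team from McGill University and Harvard University used calcium imaging to follow mice longitudinally over several weeks, revealing how the hippocampal predictive code for reward is dynamically reorganized during long-term learning.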

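Summary of key findings

Using a head-mounted miniature microscope (Miniscope) together with an automated touchscreen task, the team performed long-term in vivo calcium imaging in dorsal hippocampal CA1 of mice. By tracking the same population of neurons across several weeks, they found a systematic reorganization of the hippocampal reward representation along the time dimension, lending strong support to the "predictive coding" hypothesis of the hippocampus.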

Fig. 1 | Imaging of CA1 neuronal activity in mice while they perform a reward-based task.

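Reward encoding weakens with experience while encoding of reward-preceding cues strengthens

As the mice mastered the task (that is, as experience accumulated), the strength with which CA1 encoded the reward itself declined significantly at both the population level and the single-cell level. Specifically, the proportion of cells identified as reward cells fell over training days. The hippocampus did not fall idle, however: it strengthened its representation of the features that precede reward. Both for responses to the screen cue (Screen/Cue) and for the run from the choice point to the reward port (Reward approach), the information content of the relevant neurons and the proportion of recruited cells rose significantly with experience. The focus of the hippocampal representation thus moved from the outcome to the cues that predict the outcome.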

Fig. 2 | Dynamics of reward encoding during learning.

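The "backward shift" of neuronal activity

By using cell registration to track the same neurons, the researchers observed a striking dynamic: neurons that initially responded strongly at the moment of reward did not simply stop firing. Instead, their firing gradually moved earlier in time (a backward shift). Specifically, cells that originally fired during reward consumption came, over days of training, to fire during reward approach or even when the cue appeared. This closely resembles the classic temporal transfer of the reward prediction error (RPE) signal in midbrain dopamine neurons.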
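One simple way to put a number on such a backward shift is to find, for each imaging day, the time of a tracked cell's peak activity relative to reward onset, and then fit a line across days. The sketch below does this on synthetic data. It is an illustration only, not the analysis used in the paper: the bin width, the reward bin, the Gaussian-shaped traces, and the helper names `peak_time` and `ls_slope` are all assumptions made for this example.

```python
import math

def peak_time(trace, reward_bin, bin_width_s=0.5):
    """Time of peak activity relative to reward onset (negative = before reward)."""
    peak_bin = max(range(len(trace)), key=lambda i: trace[i])
    return (peak_bin - reward_bin) * bin_width_s

def ls_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Synthetic (made-up) trial-averaged traces for one tracked cell:
# a Gaussian bump whose center moves from the reward bin toward earlier bins.
N_BINS, REWARD_BIN = 12, 10
days = [1, 5, 10, 15, 20]
centers = [10, 8, 6, 4, 2]
traces = [[math.exp(-((i - c) ** 2) / 2.0) for i in range(N_BINS)] for c in centers]

peaks = [peak_time(t, REWARD_BIN) for t in traces]
slope = ls_slope(days, peaks)  # seconds per day; negative means a backward shift
print("peak times relative to reward (s):", peaks)
print("shift per day (s/day):", round(slope, 3))
```

A negative slope indicates that the cell's peak moves earlier relative to reward as training proceeds. Real calcium traces are noisier than these bumps, so a center-of-mass estimate or a smoothed peak may be more robust than the raw maximum used here.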

Fig. 3 | Dynamics of pre-reward encoding across learning.

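Mechanistic account: a computational model based on temporal difference learning

To explain this phenomenon, the team built a temporal difference reinforcement learning (TDRL) model whose spatial features are Gaussian basis functions. Simulations showed that if the hippocampus learns state values by minimizing the TD error, the TD error signal propagates backward from the reward state toward the starting state. This error signal drives the reshaping of place fields, so that a neuron's peak activity moves backward from the reward location to the location of the reward-predicting cue. The model recapitulated the three main patterns seen in the experiments: the backward shift of reward-proximal cells, the dynamic adjustment of approach cells, and the late emergence of cue cells.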
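The backward propagation of the TD error can be seen in a very small simulation. The sketch below is not the authors' model: it is a generic TD(0) learner with linear function approximation over fixed Gaussian basis functions on a one-way track, and every parameter (track length, number and width of fields, learning rate, discount factor, episode count) is an assumption chosen for illustration. It keeps the place fields fixed and only learns value weights, so it shows where and when the TD error occurs, not the field reshaping described in the paper.

```python
import math

def gaussian_features(pos, centers, width):
    """Activity of Gaussian 'place fields' (basis functions) at one track position."""
    return [math.exp(-((pos - c) ** 2) / (2.0 * width ** 2)) for c in centers]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def run_td(n_states=20, n_fields=10, width=1.5, alpha=0.1, gamma=0.95, episodes=1000):
    """TD(0) with linear function approximation on a one-way track.

    The agent steps 0 -> 1 -> ... -> n_states - 1 and receives reward 1.0 on
    entering the last (terminal) state. All parameter values are illustrative.
    """
    centers = [i * (n_states - 1) / (n_fields - 1) for i in range(n_fields)]
    feats = [gaussian_features(s, centers, width) for s in range(n_states)]
    w = [0.0] * n_fields   # value weights, one per basis function
    history = []           # history[episode][state] = TD error on leaving that state
    for _ in range(episodes):
        deltas = []
        for s in range(n_states - 1):
            s_next = s + 1
            terminal = s_next == n_states - 1
            reward = 1.0 if terminal else 0.0
            v = dot(w, feats[s])
            v_next = 0.0 if terminal else dot(w, feats[s_next])
            delta = reward + gamma * v_next - v   # TD error
            for k in range(n_fields):             # semi-gradient weight update
                w[k] += alpha * delta * feats[s][k]
            deltas.append(delta)
        history.append(deltas)
    values = [dot(w, feats[s]) for s in range(n_states - 1)]
    return values, history

def peak_episode(history, state):
    """Episode in which the TD error at `state` was largest."""
    return max(range(len(history)), key=lambda ep: history[ep][state])

values, history = run_td()
last = len(values) - 1   # last step before the reward
print("TD errors in episode 0:", [round(d, 3) for d in history[0]])
print("peak-error episode, step before reward:", peak_episode(history, last))
print("peak-error episode, start of track:", peak_episode(history, 0))
print("learned values (start, pre-reward):", round(values[0], 3), round(values[last], 3))
```

In the first episode the only nonzero TD error is at the step that delivers the reward, because every value estimate starts at zero. In later episodes the error at that step shrinks while positions further from the reward experience their largest errors, which is the backward propagation the paragraph above describes. Linking that error to movement of the fields themselves would require an additional plasticity rule that this sketch does not attempt.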

Fig. 4 | Weeks-long backward shift of reward encoding during learning.

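Significance

Spanning single cells to a computational model, the study provides evidence that the hippocampus is not a static spatial store but a dynamic system capable of predictive coding. The findings not only reveal regularities in how the hippocampus behaves during long-term memory consolidation and representational drift, they also establish a direct neurophysiological link between the hippocampal cognitive map and reinforcement learning theory, TD learning in particular. They suggest that, by continually revising its internal model, the hippocampus integrates current perception with expectations about the future, enabling efficient prediction of and planning for future reward.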

Fig. 5 | TD error drives backward shifting of place fields.

Abstract

Anticipating future outcomes is a fundamental task of the brain. This process requires learning the states of the world as well as the transitional relationships between those states. In rodents, the hippocampal spatial cognitive map is thought to be one such internal model. However, evidence for predictive coding and reward sensitivity in the hippocampal neuronal representation suggests that its role extends beyond purely spatial representation. How this reward representation evolves over extended experience remains unclear. Here we track the evolution of the hippocampal reward representation over weeks as mice learn to solve a cognitively demanding reward-based task. We find several lines of evidence, both at the population and the single-cell level, indicating that the hippocampal representation becomes predictive of reward as the mouse learns the task over several weeks. Both the population-level encoding of reward and the proportion of reward-tuned neurons decrease with experience. At the same time, the representation of features that precede the reward increases with experience. By tracking reward-tuned neurons over time, we find that their activity gradually shifts from encoding the reward itself to representing preceding task features, indicating that experience drives a backward-shifted reorganization of neural activity to anticipate reward. We show that a temporal difference model of place fields recapitulates these results. Our findings underscore the dynamic nature of hippocampal representations, and highlight their role in learning through the prediction of future outcomes.
