Title:
Attention, please! A Critical Review of Neural Attention Models in Natural Language Processing
Authors:
Andrea Galassi, Marco Lippi, Paolo Torroni
Computation and Language (cs.CL)
(Submitted on 4 Feb 2019)
Link:
https://arxiv.org/abs/1902.02181
Abstract
Attention is an increasingly popular mechanism used in a wide range of neural architectures. Because of the fast-paced advances in this domain, a systematic overview of attention is still missing. In this article, the authors define a unified model for attention architectures in natural language processing, with a focus on architectures designed to work with vector representations of textual data. They discuss the dimensions along which proposals differ and the possible uses of attention, and chart the major research activities and open challenges in the area.
Highlights
Figure 1. The RNNsearch architecture (Bahdanau et al., 2015) (left) and its attention model (right).
Figure 2. The core of an attention model.
Figure 3. A general attention model.
Figure 4. An example of attention in a sequence-to-sequence model.
Figure 5. Hierarchical-input attention models as defined by Yang et al. (2016b) (left), Zhao and Zhang (2018) (middle), and Ma et al. (2018) (right). Attention functions at different levels are applied in order from left to right.
Figure 6. Coarse-grained co-attention models by Lu et al. (2016) (left) and Ma et al. (2017) (right).
Figure 7. Fine-grained co-attention models proposed by dos Santos et al. (2016) (left) and Cui et al. (2017) (right). Dashed lines show how the max-pooling/distribution functions are applied (column-wise or row-wise).
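The architectures in the figures above all revolve around the same core computation (Figure 2): a compatibility function scores each key against a query, the scores are normalized into a distribution of attention weights, and the value vectors are averaged under that distribution to produce a context vector. A minimal NumPy sketch of this core, assuming a simple dot-product compatibility function (the survey covers several alternatives):

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax: subtract the max before exponentiating.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention(keys, values, query):
    """Core attention step: score keys against the query, normalize,
    and return the weighted average of the values plus the weights."""
    energies = keys @ query        # compatibility scores, one per key
    weights = softmax(energies)    # attention distribution (sums to 1)
    context = weights @ values     # weighted average of value vectors
    return context, weights

# Toy example with three key/value pairs (illustrative values only).
K = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
V = np.array([[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]])
q = np.array([1.0, 0.0])
context, weights = attention(K, V, q)
```

Here the first and third keys align with the query, so they receive more weight than the second; variants in the survey differ mainly in how `energies` is computed and what plays the role of keys, values, and queries.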
Original English abstract
Attention is an increasingly popular mechanism used in a wide range of neural architectures. Because of the fast-paced advances in this domain, a systematic overview of attention is still missing. In this article, we define a unified model for attention architectures for natural language processing, with a focus on architectures designed to work with vector representation of the textual data. We discuss the dimensions along which proposals differ, the possible uses of attention, and chart the major research activities and open challenges in the area.