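Response Time Constrained Code Reviewer Recommendation
Authors: HU Yuan-Zhe, WANG Jun-Jie, LI Shou-Bin, HU Jun, WANG Qing
1. Laboratory for Internet Software Technologies, Institute of Software, Chinese Academy of Sciences
2. University of Chinese Academy of Sciences
3. State Key Laboratory of Computer Science (Institute of Software, Chinese Academy of Sciences)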
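Abstract
Peer code review, that is, manual review of submitted code, is an effective means of reducing software defects and improving software quality, and has been widely adopted by open source communities such as GitHub and by many software development organizations. In the GitHub community, code review is an important part of the pull-based software development model. Open source projects often have hundreds or even thousands of candidate reviewers, so recommending suitable reviewers for a review task is both valuable and challenging. Data analysis of real open source projects shows that excessively long review response time is a widespread problem: it lengthens the review cycle and dampens participants' enthusiasm, yet existing work on code reviewer recommendation does not consider response time. This paper therefore poses the problem of response-time-constrained code reviewer recommendation, namely whether a recommended reviewer can review within an agreed time, and proposes a code reviewer recommendation method based on multi-objective optimization (MOC2R). The method recommends reviewers with a multi-objective optimization algorithm that maximizes 3 objectives: reviewer experience, the probability of responding within the agreed time, and the reviewer's recent activity. Experiments on data from 6 open source projects show that, under different time-window constraints (2h, 4h, 8h), Top-1 accuracy is 41.7%~61.5% and Top-5 accuracy is 66.5%~77.7%, significantly better than two commonly used, state-of-the-art baseline methods. All 3 objectives contribute to the recommendation, with the response probability within the agreed time contributing the most. The method can further improve code review efficiency and increase the activity of open source communities.
code review, response time constraint, multi-objective optimization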
Response Time Constrained Code Reviewer Recommendation
Authors: HU Yuan-Zhe, WANG Jun-Jie, LI Shou-Bin, HU Jun, WANG Qing
Abstract
Peer code review, that is, manual review of submitted code, is an effective way to reduce defects and improve software quality, and it has been widely adopted by open source communities such as GitHub and by many software development organizations. In the GitHub community, code review is an important part of the pull-based software development model. Open source projects often have hundreds or thousands of candidate reviewers, so recommending suitable reviewers for a code review is valuable and challenging. Data analysis of real open source projects shows that excessively long review response time is a common problem, which extends the review cycle and reduces the enthusiasm of participants. Existing work on code reviewer recommendation does not take response time into account. Therefore, the problem of code reviewer recommendation under a response time constraint is proposed, namely whether a recommended reviewer can review within the agreed time, and a code reviewer recommendation method based on multi-objective optimization (MOC2R) is then proposed. The method uses a multi-objective optimization algorithm that maximizes three objectives: the experience of code reviewers, the probability of a response within the time window, and the recent activity of reviewers. Experiments on data from six open source projects show that, under different time window constraints (2h, 4h, 8h), Top-1 accuracy is 41.7%~61.5% and Top-5 accuracy is 66.5%~77.7%, significantly better than two commonly used, state-of-the-art baseline methods. All three objectives contribute to the recommendation, and the response probability within the time window contributes the most. The proposed method can further improve code review efficiency and increase the activity of the open source community.
Key words
code review, response time constraint, multi-objective optimization
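The three objectives named in the abstract (reviewer experience, probability of a response within the time window, recent activity) can be illustrated with a small Pareto-front computation. This is only a sketch under stated assumptions, not the MOC2R implementation: the abstract does not say which multi-objective optimization algorithm is used or how each objective is measured, so the `Candidate` fields, the empirical `response_probability` estimate, the reviewer names, and all numbers below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    experience: float     # assumed metric, e.g. past reviews on related files
    response_prob: float  # estimated P(response within the time window)
    activity: float       # assumed metric, e.g. reviews in the last 30 days

    def objectives(self):
        # All three objectives are to be maximized.
        return (self.experience, self.response_prob, self.activity)


def response_probability(response_hours, window_hours):
    """Empirical share of past responses that arrived within the window."""
    if not response_hours:
        return 0.0
    on_time = sum(1 for h in response_hours if h <= window_hours)
    return on_time / len(response_hours)


def dominates(a, b):
    """True if a is no worse than b on every objective and better on one."""
    pairs = list(zip(a.objectives(), b.objectives()))
    return all(x >= y for x, y in pairs) and any(x > y for x, y in pairs)


def pareto_front(candidates):
    """Candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical response histories (hours until first response) per reviewer.
history = {
    "alice": [1.0, 3.5, 0.5, 9.0],
    "bob": [6.0, 12.0, 5.0],
    "carol": [1.5, 2.0],
    "dave": [2.0, 7.0],
}
window = 4  # time window constraint in hours

candidates = [
    Candidate("alice", 40, response_probability(history["alice"], window), 6),
    Candidate("bob", 55, response_probability(history["bob"], window), 2),
    Candidate("carol", 10, response_probability(history["carol"], window), 3),
    Candidate("dave", 8, response_probability(history["dave"], window), 1),
]

front = pareto_front(candidates)
print([c.name for c in front])
```

With this made-up data, `dave` is dominated (for example by `carol`, who is at least as good on all three objectives) and drops out, while `bob` stays on the front only because of high experience despite never having responded within 4 hours. A real recommender would still need a rule for ranking the front into a Top-k list, which this sketch leaves open.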
Received
2020-02-05
Funding
National Key Research and Development Program of China (2018YFB1403400)
Author biographies
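HU Yuan-Zhe (1988-), male, engineer. Research interests: software process management, code reviewer recommendation.
HU Jun (1979-), male, Ph.D., senior engineer, CCF professional member. Research interests: data mining, knowledge engineering, software process management.
WANG Jun-Jie (1987-), female, Ph.D., associate research professor. Research interests: defect prediction, empirical software engineering, crowdsourced testing.
WANG Qing (1964-), female, Ph.D., research professor, doctoral supervisor, CCF senior member. Research interests: software process technology, requirements engineering, software quality and management.
LI Shou-Bin (1987-), male, engineer. Research interests: natural language understanding, data mining.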
10.13328/j.cnki.jos.006079