Analyze the problems and suggestions of expert score bias based on inverse reinforcement model and mathematical statistics methods
Authors: TIAN Linlin, SUN Weidong, ZHANG Chi, GUO Ming, WEI Nadu
Chinese Library Classification (CLC) number: F204; G311

    Abstract:

    During the 12th Five-Year Plan period, the National High Technology Research and Development Program of China (863 Program), one of the main drivers of science and technology development, played an important role in improving China's overall scientific and technological strength and innovation capability. Acceptance review experts play a key role in assessing how well projects were completed and in measuring the value of research outputs, so the reliability of their scores bears directly on the soundness of the evaluation of the program's implementation. Taking project acceptance in one field of the 863 Program during the 12th Five-Year Plan as an example, this paper combines an inverse reinforcement-based model with mathematical statistics methods to systematically analyze scoring bias among technical acceptance experts. The experts are divided into 8 bias types according to their scoring behavior, and a corresponding suggestion is given for each type. The results show that the scores of the technical acceptance experts in this field are reasonable overall and that most experts give reliable scores; their leniency and severity differ slightly but remain within an acceptable range. The study provides a reference for review work in national science and technology programs, and for other research management activities, on conducting expert evaluation soundly, refining and standardizing review behavior, improving field expert databases, and selecting review experts.

Cite this article

TIAN Linlin, SUN Weidong, ZHANG Chi, GUO Ming, WEI Nadu. Analyze the problems and suggestions of expert score bias based on inverse reinforcement model and mathematical statistics methods[J]. Journal of Nanjing University of Information Science & Technology (Natural Science Edition), 2020, 12(5): 577-590

History
  • Received: 2020-07-28
  • Published online: 2020-10-29
