Journal of Shanghai Jiaotong University (Natural Science Edition)

• Automation Technology, Computer Technology

Optimal Setting of the Threshold in Mining Process Model from Noised Log

RUAN Ying (1), SU Qiang (1, 2, 4), ZHANG Guotong (3), LIU Daqing (3), DAI Hongfang (3), ZHANG Yinbin (3), ZHU Yan (4), XUE Lei (4)

  1. Department of Industrial Engineering and Logistic Management, Shanghai Jiaotong University, Shanghai 200240, China; 2. School of Economics & Management, Tongji University, Shanghai 200092, China; 3. Shanghai Seventh People’s Hospital, Shanghai 200137, China; 4. School of Economics and Management, Tsinghua University, Beijing 100084, China
  • Received: 2009-03-05 Online: 2010-02-26 Published: 2010-02-26

Abstract: The heuristic process mining method proposed by Aalst handles noisy log data by means of thresholds, but how those thresholds should be set is uncertain. To address this, a threshold-setting method based on design of experiments (DOE) analysis is proposed. The thresholds are treated as the variables and the fitness between the mined process model and the actual log as the response variable; DOE analysis is then used to find the combination of threshold values that yields the most appropriate workflow model. Finally, the method was applied to mining the diagnosis and treatment process for Caesarean birth at a hospital. The results show that thresholds set in this way lead to a correct and reasonable process model.

Keywords: process mining, noisy data, threshold, interpolation, design of experiments
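The experimental setup summarized in the abstract (thresholds as variables, fitness as the response) can be illustrated with a minimal full-factorial sketch. Everything named here is an assumption made for illustration: the abstract does not say which thresholds or levels were used, so the factor names `dependency` and `positive_obs`, their levels, and the `toy_fitness` function are hypothetical stand-ins. In a real study the response would come from mining a model with each threshold combination and measuring its fitness against the log.

```python
from itertools import product
from statistics import mean


def full_factorial(levels):
    """Yield every combination of factor levels as a dict (full factorial design)."""
    names = list(levels)
    for values in product(*(levels[name] for name in names)):
        yield dict(zip(names, values))


def main_effects(runs, levels):
    """Return the mean response observed at each level of each factor."""
    return {
        name: {v: mean(y for x, y in runs if x[name] == v) for v in values}
        for name, values in levels.items()
    }


def optimise_thresholds(levels, response):
    """Evaluate the response for every run and return the best setting found."""
    runs = [(setting, response(setting)) for setting in full_factorial(levels)]
    best_setting, best_score = max(runs, key=lambda run: run[1])
    return best_setting, best_score, main_effects(runs, levels)


def toy_fitness(setting):
    """Hypothetical placeholder for 'mine a model, then measure its fitness'.

    This is a synthetic surface, not a real conformance measure.
    """
    dep_term = (setting["dependency"] - 0.9) ** 2
    obs_term = ((setting["positive_obs"] - 10) / 20) ** 2
    return 1.0 - dep_term - obs_term


# Hypothetical threshold names and levels, chosen only for this example.
LEVELS = {
    "dependency": [0.7, 0.8, 0.9],
    "positive_obs": [5, 10, 15],
}

best, score, effects = optimise_thresholds(LEVELS, toy_fitness)
print("best setting:", best)
print("main effects:", effects)
```

The main-effects table gives a rough view of how strongly each threshold influences the response. A fuller DOE analysis would typically also examine interaction effects, and the interpolation listed among the keywords is not sketched here.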
