上海交通大学学报(自然版)

• Automation Technology, Computer Technology •

ID3算法的改进和简化

朱颢东   

  1. (1.郑州轻工业学院 计算机与通信工程学院, 郑州 450002;2.中国科学院 成都计算机应用研究所, 成都 610041; 3.中国科学院 研究生院, 北京 100039)
  • 收稿日期:2009-09-24 修回日期:1900-01-01 出版日期:2010-07-28 发布日期:2010-07-28

Research on Improvement and Simplification of ID3 Algorithm

ZHU Haodong   

  1. (1. School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou 450002, China; 2. Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu 610041, China; 3. The Graduate School of the Chinese Academy of Sciences, Beijing 100039, China)
  • Received: 2009-09-24 Revised: 1900-01-01 Online: 2010-07-28 Published: 2010-07-28

摘要: 针对ID3算法倾向于选择取值较多的属性的缺点,引进属性重要性来改进ID3算法,并根据改进的ID3算法中信息增益的计算特点,利用凸函数的性质来简化该算法.实验表明,优化的ID3 算法与原ID3 算法相比,在构造决策树时具有较高的准确率和更快的计算速度,并且构造的决策树还具有较少的平均叶子数.

Key words: decision tree, ID3 algorithm, attribute importance, information gain

Abstract: To address the tendency of the ID3 algorithm to favor attributes that take many values, attribute importance is introduced to improve the algorithm. Then, based on how information gain is computed in the improved ID3 algorithm, properties of convex functions are used to simplify that computation. Experiments show that, compared with the original ID3 algorithm, the optimized ID3 algorithm constructs decision trees with higher accuracy and faster computation, and the resulting trees have fewer leaves on average.
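
The bias that motivates this work can be reproduced with the standard ID3 information gain. The sketch below is illustrative only: it implements the textbook entropy-based gain, not the attribute-importance weighting or the convex-function simplification proposed in the paper, which the abstract does not specify. The toy data and the names `entropy`, `information_gain`, `weather`, and `record_id` are assumptions made for this example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Standard ID3 information gain of splitting `labels` by attribute `values`."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy after the split, weighted by subset size.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: six samples with a balanced binary class.
labels = ["yes", "yes", "no", "no", "yes", "no"]
weather = ["sunny", "sunny", "rain", "rain", "rain", "sunny"]  # two values
record_id = [1, 2, 3, 4, 5, 6]  # one distinct value per sample

gain_weather = information_gain(weather, labels)
gain_record_id = information_gain(record_id, labels)

print(f"gain(weather)   = {gain_weather:.4f}")
print(f"gain(record_id) = {gain_record_id:.4f}")
```

Because every `record_id` value isolates a single sample, each subset is pure and the attribute receives the maximum possible gain, even though it has no predictive value. Plain ID3 would therefore split on it first, which is the shortcoming the abstract describes.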