Journal of Shanghai Jiao Tong University ›› 2020, Vol. 54 ›› Issue (7): 705-717. doi: 10.16183/j.cnki.jsjtu.2020.99.007

Model of Technology Opportunity Mining Using Machine Learning Algorithm and Its Application

BAO Qinglin, CHAI Huaqi, ZHAO Songzheng, WANG Jilin

  1. School of Management, Northwestern Polytechnical University, Xi’an 710129, China
  • Online: 2020-07-28 Published: 2020-07-31
  • Corresponding author: BAO Qinglin (born 1990), female (Tibetan), from Tianshui, Gansu Province; doctoral student, currently working mainly on data mining and knowledge management. Tel.: 029-84766218; E-mail: 499195647@qq.com.
  • Funding:
    National Natural Science Foundation of China (71971170)

Abstract: Existing technology opportunity mining yields results of limited applicability, for two reasons: sample sizes are small, and the mining process does not evaluate the application prospects of a technology. To improve the applicability of the mining results, this paper builds on existing research, takes a large patent corpus as its sample, adds an assessment of technology application prospects, and proposes a three-dimensional patent prediction model. The technology and function dimensions of the model are constructed by patent text mining, using the PLSA algorithm from machine learning together with the MapReduce computing framework on the Hadoop platform. The value dimension is constructed with the entropy weight and TOPSIS methods, and the cells (element items) of the model are filled using the MapReduce framework. The model is then applied to 133508 patent texts in the titanium field from the DII database, covering 1999 to 2018. The results show that the model identifies 3 priority and 2 secondary technology opportunities in the titanium field, which can be developed in order of priority. The model enriches the methods available for technology opportunity mining and gives innovators a more accurate and prospect-oriented direction for technology research and development.

Key words: machine learning algorithms, patent text mining, technology opportunity, titanium
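
The abstract names the entropy weight and TOPSIS methods for the value dimension but does not give the indicators or formulas the paper uses. The following is a minimal sketch of the standard textbook procedure only, not the authors' implementation. The function names, the three indicators, and all numeric values are hypothetical, and every indicator is assumed to be strictly positive.

```python
import math


def entropy_weights(matrix):
    """Entropy weights for a decision matrix (rows = candidates, columns = indicators).

    Assumes every entry is strictly positive (an assumption of this sketch).
    """
    n = len(matrix)
    m = len(matrix[0])
    divergences = []
    for j in range(m):
        column = [row[j] for row in matrix]
        total = sum(column)
        shares = [value / total for value in column]
        # Shannon entropy of the column, scaled into [0, 1] by ln(n).
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n)
        # Indicators whose values differ more across candidates get more weight.
        divergences.append(1.0 - entropy)
    total_divergence = sum(divergences)
    if total_divergence == 0:
        # Every indicator is uniform across candidates: fall back to equal weights.
        return [1.0 / m] * m
    return [d / total_divergence for d in divergences]


def topsis(matrix, weights, benefit):
    """Relative closeness of each candidate to the ideal solution (higher is better).

    benefit[j] is True when a larger value of indicator j is better.
    """
    m = len(weights)
    # Vector-normalise each column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(m)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(m)] for row in matrix]
    columns = [[row[j] for row in weighted] for j in range(m)]
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti_ideal = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in weighted:
        d_pos = math.sqrt(sum((v - i) ** 2 for v, i in zip(row, ideal)))
        d_neg = math.sqrt(sum((v - a) ** 2 for v, a in zip(row, anti_ideal)))
        scores.append(d_neg / (d_pos + d_neg))
    return scores


# Hypothetical value indicators for three candidate technology opportunities:
# [forward citations, number of claims, patent family size], all treated as benefits.
candidates = [
    [10.0, 5.0, 3.0],
    [20.0, 8.0, 6.0],
    [5.0, 2.0, 1.0],
]
weights = entropy_weights(candidates)
scores = topsis(candidates, weights, benefit=[True, True, True])
ranking = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
print("weights:", weights)
print("closeness scores:", scores)
print("ranking (best first):", ranking)
```

Deriving the weights from entropy means no expert-assigned weights are needed, which is the usual reason for pairing the two methods. The resulting closeness scores give an ordering of candidates, comparable in spirit to the priority ordering described in the abstract.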

CLC number: