Journal of Shanghai Jiao Tong University (Science), 2018, Vol. 23, Issue (5): 620-626. doi: 10.1007/s12204-018-1961-6


Fine-Grained Opinion Extraction from Chinese Car Reviews with an Integrated Strategy

WANG Yinglin (王英林), WANG Ming (王明)   

  1. (School of Information Management and Engineering, Shanghai University of Finance and Economics, Shanghai 200433, China)
  • Online: 2018-10-01 Published: 2018-10-07
  • Corresponding author: WANG Yinglin (王英林), E-mail: wang.yinglin@shufe.edu.cn

Abstract: With the rapid development of e-commerce, a large amount of data, including reviews of many types of products, can be accessed within a short time. As a result, opinion mining, and fine-grained opinion mining in particular, is becoming an increasingly effective way to extract valuable information for product design, product improvement and brand marketing. However, because opinions are expressed in an unstructured and casual manner, such information cannot be extracted conveniently. In this paper, we propose an integrated strategy that automatically extracts feature-based information, with which one can easily obtain detailed opinions about particular products. To adapt to the characteristics of the reviews, our strategy consists of a multi-label classification (MLC) of reviews, a binary classification (BC) of sentences, and a sentence-level sequence labelling step based on a deep learning method. In our experiments, the approach achieves 82% accuracy in the final sequence labelling task under 20-fold cross-validation. In addition, the strategy can be readily applied to other kinds of reviews, provided that a corresponding amount of labelled data is available to start from.

Key words: opinion extraction, multi-label classification (MLC), binary classification (BC), sequence labelling, recurrent neural network (RNN)
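
The three-stage strategy named in the abstract (MLC over reviews, BC over sentences, then sentence-level sequence labelling) can be illustrated with a minimal sketch. This is not the authors' implementation: the keyword lexicons, the tag names `TARGET`, `OPINION` and `O`, and all function names are hypothetical stand-ins for models that the paper trains on labelled data, and the toy review is in English, whereas real Chinese reviews would first need word segmentation. The sketch only shows how the output of each stage narrows the input of the next.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical toy lexicons standing in for trained models (not from the paper).
FEATURE_KEYWORDS: Dict[str, List[str]] = {
    "fuel": ["fuel", "mileage"],
    "space": ["trunk", "legroom", "space"],
}
OPINION_WORDS = {"great", "poor", "low", "cramped", "roomy"}


def tokenize(text: str) -> List[str]:
    """Split text into word tokens (an assumption; Chinese needs segmentation)."""
    return re.findall(r"\w+", text)


def split_sentences(review: str) -> List[str]:
    """Split a review on Western or Chinese sentence-final punctuation."""
    return [s.strip() for s in re.split(r"[.!?。！？]+", review) if s.strip()]


def classify_review(review: str) -> List[str]:
    """Stage 1 (MLC stand-in): every feature label the review mentions."""
    tokens = {t.lower() for t in tokenize(review)}
    return [label for label, kws in FEATURE_KEYWORDS.items() if tokens & set(kws)]


def is_opinion_sentence(sentence: str, label: str) -> bool:
    """Stage 2 (BC stand-in): does the sentence hold an opinion on this feature?"""
    tokens = {t.lower() for t in tokenize(sentence)}
    return bool(tokens & set(FEATURE_KEYWORDS[label])) and bool(tokens & OPINION_WORDS)


def tag_sentence(sentence: str, label: str) -> List[Tuple[str, str]]:
    """Stage 3 (sequence-labelling stand-in): one tag per token."""
    tagged = []
    for token in tokenize(sentence):
        lowered = token.lower()
        if lowered in FEATURE_KEYWORDS[label]:
            tagged.append((token, "TARGET"))
        elif lowered in OPINION_WORDS:
            tagged.append((token, "OPINION"))
        else:
            tagged.append((token, "O"))
    return tagged


def extract_opinions(review: str) -> Dict[str, List[List[Tuple[str, str]]]]:
    """Run the stages in order: review labels, sentence filter, token tags."""
    sentences = split_sentences(review)
    result: Dict[str, List[List[Tuple[str, str]]]] = {}
    for label in classify_review(review):
        tagged = [tag_sentence(s, label) for s in sentences
                  if is_opinion_sentence(s, label)]
        if tagged:
            result[label] = tagged
    return result


review = "The trunk is roomy. Fuel consumption is low. I bought it in May."
print(extract_opinions(review))
```

In a real system each stand-in function would be replaced by a trained classifier or tagger; the point of the staged design is that the comparatively expensive sequence labeller only sees sentences that the two cheaper classifiers have already judged relevant.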
