Journal of Shanghai Jiao Tong University (Science), 2020, Vol. 25, Issue (3): 325-332. doi: 10.1007/s12204-020-2184-1


Fine-Grained Opinion Mining on Chinese Car Reviews with Conditional Random Field


WANG Yinglin (王英林)   

  1. (School of Information Management and Engineering, Shanghai University of Finance and Economics, Shanghai 200433, China)
  • Online: 2020-06-15 Published: 2020-05-29
  • Contact: WANG Yinglin (王英林) E-mail: wang.yinglin@shufe.edu.cn

Abstract: The Internet now reaches into every aspect of people’s lives, and a large number of online customer reviews have accumulated in product forums. These reviews are a valuable resource for analysis, but they are unstructured text containing many ambiguities, which makes analyzing them challenging. Effective deep semantic or fine-grained analysis of customer reviews is rare in the existing literature, and the analysis quality of most studies is low. This paper therefore introduces a fine-grained opinion mining method that extracts detailed semantic information about opinions, from multiple perspectives and aspects, from Chinese automobile reviews. The method uses a conditional random field (CRF) model in which semantic roles are divided into two groups. One group relates to the objects being reviewed and includes the manufacturer, the brand, the type, and the aspects of cars. The other group relates to the opinions about those objects and includes the sentiment description, the aspect value, the conditions of opinions, and the sentiment tendency. The overall framework has three major steps. The first step separates the relevant sentences in the reviews from the irrelevant ones. The second step classifies the relevant sentences into different aspects. The third step extracts fine-grained semantic roles from the sentences of each aspect. The training data are manually annotated with fine-grained semantic roles. The features used in the CRF model include basic word features, part-of-speech (POS) features, position features, and dependency syntactic features, and different combinations of these features are investigated. Experimental results are analyzed and future directions are discussed.

Key words: Chinese opinion mining, conditional random field (CRF), semantic role labelling, Chinese car reviews
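
As a rough illustration of the four feature groups named in the abstract (word, POS, position, and dependency syntactic features), the sketch below builds one feature dictionary per token, in the form that CRF toolkits commonly accept. It is not the paper's implementation: the `Token` type, the `token_features` and `sentence_features` helpers, the feature names, the use of the token index as the position feature, and the example sentence with its tags are all assumptions made for illustration.

```python
from typing import Dict, List, NamedTuple


class Token(NamedTuple):
    """One segmented word with hypothetical POS and dependency annotations."""
    word: str
    pos: str      # part-of-speech tag
    head: int     # index of the syntactic head in the sentence, -1 for the root
    deprel: str   # dependency relation to the head


def token_features(sent: List[Token], i: int) -> Dict[str, str]:
    """Build word, POS, position, and dependency features for token i."""
    tok = sent[i]
    feats = {
        "word": tok.word,                      # basic word feature
        "pos": tok.pos,                        # POS feature
        "position": str(i),                    # position feature (assumed: token index)
        "deprel": tok.deprel,                  # dependency relation feature
        "head_word": sent[tok.head].word if tok.head >= 0 else "ROOT",
    }
    # Context window of one token on each side (an assumed template).
    if i > 0:
        feats["prev_word"] = sent[i - 1].word
        feats["prev_pos"] = sent[i - 1].pos
    else:
        feats["BOS"] = "1"                     # beginning of sentence
    if i < len(sent) - 1:
        feats["next_word"] = sent[i + 1].word
        feats["next_pos"] = sent[i + 1].pos
    else:
        feats["EOS"] = "1"                     # end of sentence
    return feats


def sentence_features(sent: List[Token]) -> List[Dict[str, str]]:
    """Feature dictionaries for every token of one sentence."""
    return [token_features(sent, i) for i in range(len(sent))]


# Hypothetical example: "油耗 很 低" ("fuel consumption is very low").
EXAMPLE = [
    Token("油耗", "NN", 2, "nsubj"),
    Token("很", "AD", 2, "advmod"),
    Token("低", "VA", -1, "root"),
]
# Hypothetical role labels, one per token, in a BIO-style scheme.
EXAMPLE_LABELS = ["B-ASPECT", "O", "B-SENTIMENT"]

if __name__ == "__main__":
    for feats, label in zip(sentence_features(EXAMPLE), EXAMPLE_LABELS):
        print(label, feats)
```

Dropping or keeping individual keys in `token_features` is one simple way to try different feature combinations, which is the kind of comparison the abstract describes.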

