Journal of Shanghai Jiaotong University (Science), 2013, Vol. 18, Issue 4: 418-424. doi: 10.1007/s12204-013-1416-z

Thread Labeling for News Event

YAN Ze-hua (闫泽华), LI Fang* (李 芳)   

  (Department of Computer Science and Engineering, Shanghai Jiaotong University, Shanghai 200240, China)
  • Online: 2013-08-28 Published: 2013-08-12
  • Contact: LI Fang (李 芳) E-mail: fli@sjtu.edu.cn

Abstract: Automatic thread labeling can help readers understand the different aspects of a news event. In this paper, we present a method for labeling the threads of a news event. We use the latent Dirichlet allocation (LDA) topic model to extract news threads from a news corpus. Our method first selects a subset of each thread's words and then extracts phrases from that subset based on co-occurrence calculation. The extracted phrase is then used as the label of the news thread. Experimental results show that about 60% of the generated labels reflect meaningful aspects of a news event. These labels can help people quickly grasp the many different aspects of a news event.

Key words: news event, topic labeling, latent Dirichlet allocation (LDA)
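
The abstract names two steps after thread extraction: selecting a subset of thread words, then extracting a phrase by co-occurrence. The abstract gives no further detail, so the following is only a minimal sketch under stated assumptions: the subset is taken to be the `top_k` most probable words of a thread, co-occurrence is taken to mean adjacency in the text, and the label is the most frequent adjacent pair. The function `label_thread`, its parameters, and the toy data are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def label_thread(thread_words, documents, top_k=5):
    """Return a two-word label for one thread, or None if no pair is found.

    thread_words: dict mapping word -> probability under the thread
                  (for example, one topic of a trained LDA model).
    documents:    list of tokenized documents (lists of words).
    top_k:        size of the thread-word subset (an assumed heuristic).
    """
    # Step 1: keep only the top_k most probable words of the thread.
    ranked = sorted(thread_words, key=thread_words.get, reverse=True)
    subset = set(ranked[:top_k])

    # Step 2: count adjacent word pairs whose two words are both in the subset.
    pair_counts = Counter()
    for tokens in documents:
        for left, right in zip(tokens, tokens[1:]):
            if left in subset and right in subset and left != right:
                pair_counts[(left, right)] += 1

    if not pair_counts:
        return None

    # Step 3: use the most frequent pair as the thread label.
    (left, right), _count = pair_counts.most_common(1)[0]
    return f"{left} {right}"


# Toy input, invented purely for illustration.
thread_words = {"oil": 0.30, "spill": 0.25, "gulf": 0.15, "cleanup": 0.10, "said": 0.05}
documents = [
    ["the", "oil", "spill", "reached", "the", "gulf"],
    ["oil", "spill", "cleanup", "continues"],
    ["gulf", "oil", "spill", "cleanup"],
]
print(label_thread(thread_words, documents, top_k=4))  # prints "oil spill"
```

Restricting the candidate pairs to the word subset keeps the label tied to the thread's own vocabulary; a real system would likely replace raw adjacency counts with a more robust co-occurrence score.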
