Journal of Shanghai Jiao Tong University ›› 2021, Vol. 55 ›› Issue (2): 117-123. doi: 10.16183/j.cnki.jsjtu.2020.009

Special Issue: Journal of Shanghai Jiao Tong University 2021 Special Topics Compilation (Issue 12); Journal of Shanghai Jiao Tong University 2021 Special Topic on Automation Technology and Computer Technology


Named Entity Recognition of Enterprise Annual Report Integrated with BERT

ZHANG Jingyi1, HE Guanghui1, DAI Zhou2, LIU Yadong1

  1. School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
  2. China Southern Power Grid Materials Co., Ltd., Guangzhou 510641, China
  • Received: 2020-01-08  Online: 2021-02-01  Published: 2021-03-03
  • Contact: HE Guanghui, E-mail: guanghui.he@sjtu.edu.cn

Abstract:

Automatically extracting key data from annual reports is an important means of assessing enterprises. To address the complex entity structures, strong contextual semantics, and small number of key entities that characterize corporate annual reports, a BERT-BiGRU-Attention-CRF model was proposed to automatically identify and extract entities from enterprise annual reports. On top of the BiGRU-CRF model, the BERT pre-trained language model was used to enhance the generalization ability of the word vectors and to capture long-range contextual information, and an attention mechanism was added to fully exploit the global and local features of the text. Experiments were conducted on a self-constructed corpus of corporate annual reports, and the model was compared with several baseline models. The results show that the BERT-BiGRU-Attention-CRF model achieves an F1 value (the harmonic mean of precision and recall) of 93.69%, outperforming traditional models on annual report text, and is expected to provide an automatic means of enterprise assessment.
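For readers who want to reproduce the overall architecture described above, the following is a minimal sketch of a BERT-BiGRU-Attention-CRF tagger, assuming PyTorch with the Hugging Face transformers library and the pytorch-crf package. The layer sizes, the pre-trained checkpoint name, and the multi-head self-attention variant are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of a BERT-BiGRU-Attention-CRF sequence tagger (assumptions:
# PyTorch, Hugging Face `transformers`, `pytorch-crf`; hyperparameters are
# illustrative and do not come from the paper).
import torch
import torch.nn as nn
from transformers import BertModel
from torchcrf import CRF


class BertBiGRUAttnCRF(nn.Module):
    def __init__(self, num_tags, bert_name="bert-base-chinese",
                 gru_hidden=128, attn_heads=4):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)        # contextual token embeddings
        self.bigru = nn.GRU(self.bert.config.hidden_size, gru_hidden,
                            batch_first=True, bidirectional=True)
        # Self-attention over BiGRU outputs to mix global and local features
        self.attn = nn.MultiheadAttention(2 * gru_hidden, attn_heads,
                                          batch_first=True)
        self.emit = nn.Linear(2 * gru_hidden, num_tags)          # per-token tag scores
        self.crf = CRF(num_tags, batch_first=True)               # models tag-transition constraints

    def forward(self, input_ids, attention_mask, tags=None):
        h = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        h, _ = self.bigru(h)
        h, _ = self.attn(h, h, h, key_padding_mask=~attention_mask.bool())
        emissions = self.emit(h)
        mask = attention_mask.bool()
        if tags is not None:                                     # training: negative log-likelihood loss
            return -self.crf(emissions, tags, mask=mask, reduction="mean")
        return self.crf.decode(emissions, mask=mask)             # inference: Viterbi-decoded tag sequences
```

In this sketch the CRF layer is fed the attention-refined emission scores, so training minimizes the sequence-level negative log-likelihood while decoding returns the globally best tag path, which is the standard way such a stack is assembled for named entity recognition.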

Key words: named entity recognition, enterprise annual report, BERT, attention mechanism, BiGRU
