Named Entity Recognition of Enterprise Annual Report Integrated with BERT

  • 1. School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
    2. China Southern Power Grid Materials Co., Ltd., Guangzhou 510641, China

ZHANG Jingyi (1996-), female, born in Nanyang, Henan Province; M.S. candidate; research interest: natural language processing.

Received date: 2020-01-08

Online published: 2021-03-03



Cite this article

ZHANG Jingyi, HE Guanghui, DAI Zhou, LIU Yadong. Named Entity Recognition of Enterprise Annual Report Integrated with BERT[J]. Journal of Shanghai Jiao Tong University, 2021, 55(2): 117-123. DOI: 10.16183/j.cnki.jsjtu.2020.009

Abstract

Automatically extracting key data from annual reports is an important means of automating enterprise assessment. To address the characteristics of key entities in enterprise annual reports, namely their complex structure, their strong semantic correlation with the surrounding context, and the small scale of the corpus, a BERT-BiGRU-Attention-CRF model was proposed to automatically identify and extract entities from enterprise annual reports. Based on the BiGRU-CRF model, the BERT pre-trained language model was first introduced to enhance the generalization ability of the word-embedding model and to capture long-range contextual information. Then, an attention mechanism was introduced to fully mine the global and local features of the text. Experiments were performed on a self-constructed enterprise annual report corpus, and the model was compared with several traditional models. The results show that the F1 score (the harmonic mean of precision and recall) of the BERT-BiGRU-Attention-CRF model is 93.69%, outperforming the other traditional models on named entity recognition in enterprise annual reports. The model is therefore expected to become an effective means of automating enterprise assessment.
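The abstract describes a pipeline that feeds BERT contextual embeddings into a BiGRU, applies an attention layer over the recurrent outputs, and decodes the tag sequence with a CRF. The PyTorch sketch below illustrates one way such a BERT-BiGRU-Attention-CRF tagger could be assembled, assuming the HuggingFace transformers library and the pytorch-crf package; the multi-head self-attention layer, hyper-parameters, and tag set are illustrative stand-ins, not the configuration reported in the paper.

import torch
import torch.nn as nn
from torchcrf import CRF                      # pip install pytorch-crf
from transformers import BertModel, BertTokenizerFast


class BertBiGRUAttnCRF(nn.Module):
    """Hypothetical BERT-BiGRU-Attention-CRF tagger (illustrative, not the paper's exact setup)."""

    def __init__(self, num_tags, bert_name="bert-base-chinese", hidden=256, heads=4):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)    # contextual character embeddings
        self.bigru = nn.GRU(self.bert.config.hidden_size, hidden,
                            batch_first=True, bidirectional=True)
        # Generic multi-head self-attention as a stand-in for the paper's attention layer
        self.attn = nn.MultiheadAttention(2 * hidden, heads, batch_first=True)
        self.fc = nn.Linear(2 * hidden, num_tags)            # per-token emission scores
        self.crf = CRF(num_tags, batch_first=True)           # models tag-to-tag transitions

    def _emissions(self, input_ids, attention_mask):
        x = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        x, _ = self.bigru(x)
        x, _ = self.attn(x, x, x, key_padding_mask=~attention_mask.bool())
        return self.fc(x)

    def forward(self, input_ids, attention_mask, tags):
        # Negative CRF log-likelihood serves as the training loss
        emissions = self._emissions(input_ids, attention_mask)
        return -self.crf(emissions, tags, mask=attention_mask.bool())

    def decode(self, input_ids, attention_mask):
        emissions = self._emissions(input_ids, attention_mask)
        return self.crf.decode(emissions, mask=attention_mask.bool())


if __name__ == "__main__":
    tag_set = ["O", "B-ORG", "I-ORG", "B-MONEY", "I-MONEY"]   # illustrative tag set
    tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")
    model = BertBiGRUAttnCRF(num_tags=len(tag_set))
    batch = tokenizer(["公司全年实现营业收入十亿元。"], return_tensors="pt")
    with torch.no_grad():
        print(model.decode(batch["input_ids"], batch["attention_mask"]))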
