Assembly process documents record designers' intentions and knowledge. However, common knowledge extraction methods are not well suited to assembly process documents because of their tabular form and unstructured natural language text. This paper proposes an assembly semantic entity recognition and relation construction method oriented to assembly process documents. First, the assembly process sentences are extracted from the table through concerned-region recognition and cell division and are stored as a key-value object file. Then, the semantic entities in each sentence are identified by a sequence tagging model based on an attention mechanism specific to the assembly operation type, and syntactic rules are designed to construct the relations between entities automatically. Finally, experiments on a self-constructed corpus show that the sequence tagging model in the proposed method performs better than mainstream named entity recognition models when handling assembly process design language. The effectiveness of the proposed method is also analyzed through a simulation experiment in a small-scale real scene, in comparison with the manual method. The results show that the proposed method can help designers accumulate knowledge automatically and efficiently.
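The pipeline summarized in the abstract (table cells stored as key-value records, sequence tags decoded into entities, rules linking entities) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `OPERATION` and `PART` labels, the `acts_on` relation, the English example sentence, and the hand-written tags standing in for the output of the sequence tagging model are all assumptions made for illustration.

```python
import json


def cells_to_record(header, row):
    """Pair each column header with its cell text (whitespace trimmed)."""
    return {key.strip(): value.strip() for key, value in zip(header, row)}


def decode_bio(tokens, tags):
    """Collect (entity_type, text) spans from BIO tags.

    A stray I- tag that does not continue the current span is dropped.
    """
    entities, current_type, current_tokens = [], None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current_tokens:
                entities.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
        else:
            if current_tokens:
                entities.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = None, []
    if current_tokens:
        entities.append((current_type, " ".join(current_tokens)))
    return entities


def link_operation_to_parts(entities):
    """Toy rule (assumed): a PART is acted on by the nearest preceding OPERATION."""
    relations, operation = [], None
    for entity_type, text in entities:
        if entity_type == "OPERATION":
            operation = text
        elif entity_type == "PART" and operation is not None:
            relations.append((operation, "acts_on", text))
    return relations


# Hypothetical table row; the tags are written by hand in place of model output.
record = cells_to_record(["step", "content"], ["10", " Install the bearing on the shaft "])
serialized = json.dumps(record, ensure_ascii=False)  # key-value object as JSON text

tokens = record["content"].split()
tags = ["B-OPERATION", "O", "B-PART", "O", "O", "B-PART"]
entities = decode_bio(tokens, tags)
relations = link_operation_to_parts(entities)
```

Keeping the three stages as separate functions mirrors the order described in the abstract, so the hand-written tags could be swapped for real model predictions without touching the storage or rule steps.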
GU Xinghai (顾星海), HUA Bao (花豹), LIU Yahui (刘亚辉), SUN Xuemin (孙学民), BAO Jinsong∗ (鲍劲松). Semantic Entity Recognition and Relation Construction Method for Assembly Process Document [J]. Journal of Shanghai Jiaotong University (Science), 2024, 29(3): 537-556.
DOI: 10.1007/s12204-022-2474-x