To address the inefficient data management and difficult process reuse
caused by scattered multimodal data and missing inter-document links in
steam turbine blade machining process documentation, this paper proposes a
multimodal cross-document graph modeling technique based on Large Language
Models (LLMs) and Technical Language Processing (TLP). First, a multimodal
graph ontology framework is
designed, followed by the application of TLP technology to address issues such
as complex terminology, non-standardized text, and cross-document semantic
inconsistencies in process documentation. To tackle the problems faced by LLMs
in understanding domain-specific process knowledge, including term ambiguity,
lack of contextual information, and difficulties in cross-document semantic
association, a template-based hierarchical prompt engineering method is
introduced to facilitate the automatic extraction and integration of process
information. The results show that TLP compensates for the poor domain
adaptability of LLMs and significantly improves the accuracy of term
recognition, while the in-context learning augmentation rules of the LLM
compensate for the weak generalization ability of TLP and improve the
handling of new terms; fusing the two yields a bidirectional enhancement. In
addition, the constructed graph intuitively displays the multimodal
information in process documents and its correlations, which significantly
improves the efficiency of knowledge integration and reuse.
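
As an illustration of what a template-based hierarchical prompt might look like, the following minimal sketch assembles a prompt from three layers, ordered from general to specific. The abstract does not specify the actual templates, so the layer names (`DOMAIN_LAYER`, `CONTEXT_LAYER`, `TASK_LAYER`), the `build_prompt` helper, and the sample glossary and document identifiers are all hypothetical. The assumption made here is that a TLP-normalized glossary targets term ambiguity, a document-context layer supplies missing context and cross-document links, and a task layer requests the extraction.

```python
from string import Template

# Hypothetical three-layer template; the layer names and wording are
# illustrative assumptions, not the templates used in the paper.
DOMAIN_LAYER = Template("Domain: $domain\nGlossary (normalized terms):\n$glossary")
CONTEXT_LAYER = Template("Source document: $doc_id\nRelated documents: $related")
TASK_LAYER = Template(
    "Extract (head, relation, tail) triples from the text below. "
    "Use only glossary terms for entities.\nText: $text"
)


def build_prompt(domain, glossary, doc_id, related, text):
    """Assemble the layers from most general (domain) to most specific (task)."""
    glossary_lines = "\n".join(
        f"- {term}: {meaning}" for term, meaning in sorted(glossary.items())
    )
    related_ids = ", ".join(related) if related else "none"
    layers = [
        DOMAIN_LAYER.substitute(domain=domain, glossary=glossary_lines),
        CONTEXT_LAYER.substitute(doc_id=doc_id, related=related_ids),
        TASK_LAYER.substitute(text=text),
    ]
    return "\n\n".join(layers)


# Example data below is made up for illustration only.
prompt = build_prompt(
    domain="steam turbine blade machining",
    glossary={"rough milling": "first material-removal pass"},
    doc_id="process-card-A",
    related=["inspection-sheet-B"],
    text="Rough milling of the blade root precedes finishing.",
)
print(prompt)
```

Keeping each layer as a separate template means the glossary produced by a TLP step could be swapped in without touching the task instruction; the resulting string would then be passed to whichever LLM client is in use.
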
HUANG Haoyang, LI Fei, ZHANG Hengjun, YU Jiayi, BAO Jinsong. A Multimodal
Knowledge Graph Modeling Method for Cross-Process Documents via Integration
of LLM and TLP[J]. Journal of Shanghai Jiaotong University, 0: 1.
DOI: 10.16183/j.cnki.jsjtu.2025.068