Journal of Shanghai Jiao Tong University

Multimodal Feature Fusion for Ship CAD Model Retrieval Technology

  

  1. College of Mechanical Engineering, Donghua University, Shanghai 201620, China

Abstract:

This paper proposes a multimodal feature fusion method for ship CAD (Computer-Aided Design) model retrieval. It addresses the limited expressive power of single-modal representations in traditional ship CAD model retrieval: 3D geometric features struggle to capture semantic information, text descriptions cannot express precise geometric structure, and image features are strongly affected by changes in viewing angle and illumination. Drawing on the Context-I2W network, the method maps reference images to pseudo-word tokens and fuses BOM (Bill of Materials) information with mesh geometric features of the CAD models. It then designs a multimodal feature fusion framework based on the WR (Weighted Residual) matrix to align image, text, and 3D geometric features in the semantic space, and constructs a combinatorial query mechanism that performs matching retrieval by computing the similarity between the combinatorial embeddings and the candidate model features. Experiments on a dataset of 204 multimodal ship samples show that the method achieves an average mAP (Mean Average Precision) of 83.5% on retrieval tasks for three typical component types (hull structure, outfitting parts, and piping arrangement), a 16.7% improvement over existing zero-shot methods. The area under the ROC (Receiver Operating Characteristic) curve reaches 0.818, indicating excellent retrieval performance without labeled data.
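The retrieval step and the mAP metric described above can be illustrated with a minimal sketch. The abstract does not specify the WR matrix, so the `fuse` function below is a hypothetical stand-in (a plain weighted sum of embeddings assumed to be already aligned in a shared space). All function names, weights, and vectors are illustrative assumptions, not the authors' implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def fuse(image_emb, text_emb, geom_emb, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Hypothetical fusion: weighted sum of image, text, and geometry embeddings.
    This is NOT the paper's WR-matrix fusion, whose form the abstract does not give."""
    wi, wt, wg = weights
    return [wi * a + wt * b + wg * c
            for a, b, c in zip(image_emb, text_emb, geom_emb)]

def rank(query_emb, candidates):
    """Return candidate names sorted by descending cosine similarity to the query."""
    scored = [(cosine(query_emb, emb), name) for name, emb in candidates.items()]
    return [name for _, name in sorted(scored, key=lambda p: p[0], reverse=True)]

def average_precision(ranked, relevant):
    """Average precision of one ranked list given the set of relevant names."""
    hits, total = 0, 0.0
    for k, name in enumerate(ranked, start=1):
        if name in relevant:
            hits += 1
            total += hits / k  # precision at each rank where a relevant item appears
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(results):
    """Mean of average precision over (ranked_list, relevant_set) pairs."""
    aps = [average_precision(r, rel) for r, rel in results]
    return sum(aps) / len(aps) if aps else 0.0
```

In use, a combined query embedding from `fuse` would be passed to `rank` against the candidate model features, and `mean_average_precision` would be averaged over all queries of a component type.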

Key words: Multimodal Retrieval, Contextual Feature Mapping, Cross-Modal Alignment, Zero-Shot Learning, CLIP Model

CLC Number: