J Shanghai Jiaotong Univ Sci

Cover and Table of Contents

2023, 28 (6): 0.

Entity Relationship Explanation via Conceptualization

XIE Chenhao(谢晨昊), LIANG Jiaqing(梁家卿), XIAO Yanghua(肖仰华), HWANG Seung-won

2023, 28 (6): 695-702.
doi: 10.1007/s12204-021-2394-1

Finding an attribute to explain the relationship between a given pair of entities is valuable in many applications. However, many direct solutions fail, owing to low precision caused by heavy dependence on text and low recall caused by evidence scarcity. Thus, we propose a generalization-and-inference framework and implement it to build a system, the entity-relationship finder (ERF). Our main idea is to conceptualize entity pairs into proper concept pairs, which serve as intermediate random variables to form the explanation. Although entity conceptualization has been studied, it poses new challenges here: collective optimization for multiple relationship instances, joint optimization for both entities, and aggregation of diluted observations into the head concepts defining the relationship. We propose conceptualization solutions and validate them, as well as the framework, with extensive experiments.

Boosting Unsupervised Domain Adaptation with Soft Pseudo-Label and Curriculum Learning

ZHANG Shengjia(张晟嘉), LIN Tiancheng(林天成), XU Yi(徐奕)

2023, 28 (6): 703-716.
doi: 10.1007/s12204-022-2487-5

By leveraging data from a fully labeled source domain, unsupervised domain adaptation (UDA) improves classification performance on an unlabeled target domain through explicit discrepancy minimization of data distributions or adversarial learning. As an enhancement, category alignment is involved during adaptation to reinforce target feature discrimination by utilizing model predictions. However, two problems remain unexplored: pseudo-label inaccuracy incurred by wrong category predictions on the target domain, and distribution deviation caused by overfitting on the source domain. In this paper, we propose a model-agnostic two-stage learning framework, which greatly reduces flawed model predictions using a soft pseudo-label strategy and avoids overfitting on the source domain with a curriculum learning strategy. Theoretically, it decreases the combined risk in the upper bound of the expected error on the target domain. In the first stage, we train a model with a distribution-alignment-based UDA method to obtain soft semantic labels on the target domain with rather high confidence. To avoid overfitting on the source domain, in the second stage, we propose a curriculum learning strategy that adaptively controls the weighting between the losses from the two domains, so that the focus of training gradually shifts from the source distribution to the target distribution as prediction confidence on the target domain is boosted. Extensive experiments on two well-known benchmark datasets validate the universal effectiveness of our proposed framework in promoting the performance of the top-ranked UDA algorithms and demonstrate its consistently superior performance.
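
The second-stage weighting described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract says the weighting is controlled adaptively, whereas the linear schedule below, and the names `soft_cross_entropy`, `curriculum_weight` and `combined_loss`, are assumptions made only for illustration.

```python
import math


def soft_cross_entropy(probs, soft_label, eps=1e-12):
    """Cross-entropy between a soft pseudo-label and predicted class probabilities."""
    return -sum(q * math.log(p + eps) for q, p in zip(soft_label, probs))


def curriculum_weight(step, total_steps):
    """Hypothetical linear schedule: the target-domain weight grows from 0 to 1."""
    return min(1.0, max(0.0, step / total_steps))


def combined_loss(source_loss, target_loss, step, total_steps):
    """Shift the training focus from the source loss to the target loss over time."""
    lam = curriculum_weight(step, total_steps)
    return (1.0 - lam) * source_loss + lam * target_loss
```

Early in training the source loss dominates; by the last step only the target loss, computed against soft pseudo-labels, contributes.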

Multiple Detection Model Fusion Framework for Printed Circuit Board Defect Detection

WU Xing(武星), ZHANG Qingfeng(张庆丰), WANG Jianjia(王健嘉), YAO Junfeng(姚骏峰), GUO Yike(郭毅可)

2023, 28 (6): 717-727.
doi: 10.1007/s12204-022-2471-0

The printed circuit board (PCB) is an indispensable component of electronic products, which determines the quality of these products. With the development and advancement of manufacturing technology, the layout and structure of PCB are getting complicated. However, there are few effective and accurate PCB defect detection methods. There are high requirements for the accuracy of PCB defect detection in the actual production environment, so we propose two PCB defect detection frameworks with multiple model fusion including the defect detection by multi-model voting method (DDMV) and the defect detection by multi-model learning method (DDML). With the purpose of reducing wrong and missing detection, the DDMV and DDML integrate multiple defect detection networks with different fusion strategies. The effectiveness and accuracy of the proposed framework are verified with extensive experiments on two open-source PCB datasets. The experimental results demonstrate that the proposed DDMV and DDML are better than any other individual state-of-the-art PCB defect detection model in F1-score, and the area under curve value of DDML is also higher than that of any other individual detection model. Furthermore, compared with DDMV, the DDML with an automatic machine learning method achieves the best performance in PCB defect detection, and the F1-score on the two datasets can reach 99.7% and 95.6% respectively.
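
As a rough illustration of the voting idea behind DDMV, the sketch below fuses image-level verdicts from several detectors. The abstract does not specify the voting rule, so the strict-majority default, the 0/1 encoding (1 = defective) and the function name `vote` are assumptions.

```python
def vote(predictions, min_votes=None):
    """Fuse per-model 0/1 verdicts (1 = defect) into one verdict per image.

    predictions: list with one list of 0/1 labels per detection model,
    all covering the same images in the same order.
    """
    n_models = len(predictions)
    if min_votes is None:
        min_votes = n_models // 2 + 1  # assumed rule: strict majority
    # zip(*predictions) groups the verdicts of all models for each image
    return [1 if sum(verdicts) >= min_votes else 0 for verdicts in zip(*predictions)]
```

Lowering `min_votes` to 1 flags an image as soon as any model reports a defect, which trades more false alarms for fewer missed detections.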

Cross-Modal Entity Resolution for Image and Text Integrating Global and Fine-Grained Joint Attention Mechanism

ZENG Zhixian(曾志贤), CAO Jianjun*(曹建军), WENG Nianfeng(翁年凤), YUAN Zhen(袁震), YU Xu(余旭)

2023, 28 (6): 728-737.
doi: 10.1007/s12204-022-2465-y

To address the problem that existing cross-modal entity resolution methods tend to ignore the high-level semantic correlations between cross-modal data, we propose a novel cross-modal entity resolution method for image and text that integrates global and fine-grained joint attention mechanisms. First, we map the cross-modal data to a common embedding space using a feature extraction network. Then, we integrate a global joint attention mechanism and a fine-grained joint attention mechanism, enabling the model to learn both the global semantic characteristics and the local fine-grained semantic characteristics of the cross-modal data, so as to fully exploit the cross-modal semantic correlation and boost the performance of cross-modal entity resolution. Experiments on the Flickr-30K and MS-COCO datasets show that the proposed method improves R@sum by 4.30% and 4.54%, respectively, over 5 state-of-the-art methods, demonstrating its superiority.
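
For readers unfamiliar with the R@sum metric reported here, the following sketch shows how it is commonly computed in image-text retrieval: Recall@K is summed over K = 1, 5, 10 in both retrieval directions. The sketch assumes one matching text per image (query i matches candidate i), which is a simplification; the function names are illustrative.

```python
def recall_at_k(sim, k):
    """Percentage of queries whose true match (same index) ranks in the top k.

    sim[i][j] is the similarity between query i and candidate j.
    """
    hits = 0
    for i, row in enumerate(sim):
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        if i in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(sim)


def r_sum(sim, ks=(1, 5, 10)):
    """Sum of Recall@K over both retrieval directions (rows->cols and cols->rows)."""
    transposed = [list(col) for col in zip(*sim)]
    return sum(recall_at_k(sim, k) + recall_at_k(transposed, k) for k in ks)
```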

Random Search and Code Similarity-Based Automatic Program Repair

CAO Heling (曹鹤玲), LIU Fangzheng (刘方正), SHI Jianshu (石建树), CHU Yonghe (楚永贺), DENG Miaolei* (邓淼磊)

2023, 28 (6): 738-752.
doi: 10.1007/s12204-022-2514-6

In recent years, automatic program repair approaches have developed rapidly in the field of software engineering. However, existing program repair techniques based on genetic programming require verification of a large number of candidate patches, which consumes a lot of computational resources. In this paper, we propose a random search and code similarity based automatic program repair approach (RSCSRepair). First, to reduce the verification effort for candidate patches, we introduce test filtering to reduce the number of test cases and use test case prioritization techniques to reconstruct a new set of test cases. Second, we use a combination of code similarity and random search for patch generation. Finally, we use a patch overfitting detection method to improve the quality of patches. To verify the performance of our approach, we conducted experiments on the Defects4J benchmark. The experimental results show that RSCSRepair correctly repairs up to 54 bugs, with improvements of 14.3%, 8.5%, 14.3% and 10.3% compared with jKali, Nopol, CapGen and SimFix, respectively.

Formal Analysis of SA-TEK 3-Way Handshake Protocols

XU Sen* (徐森), YANG Shuo (杨硕), ZHANG Kefei (张克非)

2023, 28 (6): 753-762.
doi: 10.1007/s12204-021-2340-2

IEEE 802.16 is the standard for broadband wireless access. The security sublayer is provided within IEEE 802.16 MAC layer for privacy and access control, in which the privacy and key management (PKM) protocols are specified. In IEEE 802.16e, SA-TEK 3-way handshake is added into PKM protocols, aiming to facilitate reauthentication and key distribution. This paper analyzes the SA-TEK 3-way handshake protocol, and proposes an optimized version. We also use CasperFDR, a popular formal analysis tool, to verify our analysis. Moreover, we model various simplified versions to find the functions of those elements in the protocol, and correct some misunderstandings in related works using other formal analysis tools.

Off-Grid Sparse Bayesian Inference with Biased Total Grids for Dense Time Delay Estimation

WEI Shuang (魏爽), LI Wenyao (李文瑶), SU Ying* (苏颖), LIU Rui (刘睿)

2023, 28 (6): 763-771.
doi: 10.1007/s12204-022-2464-z

For dense time delay estimation (TDE), when multiple time delays are located within a grid interval, it is difficult for the existing sparse Bayesian learning/inference (SBL/SBI) methods to obtain high estimation accuracy to meet the application requirements. To solve this problem, this paper proposes a method named off-grid sparse Bayesian inference - biased total grid (OGSBI-BTG), where a mesh evolution process is conducted to move the total grids iteratively based on the position of the off-grid between two grids. The proposed method updates the off-grid dictionary matrix by further reconstructing an optimum mesh and offsetting the off-grid vector. Experimental results demonstrate that the proposed approach performs better than other state-of-the-art SBI methods and multiple signal classification even when the grid interval is larger than the gap of true time delays. In this paper, the time domain model and frequency domain model of TDE are studied.

Tail-Bound Cost Analysis over Nondeterministic Probabilistic Programs

WANG Peixin(王培新)

2023, 28 (6): 772-782.
doi: 10.1007/s12204-022-2456-z

For probabilistic programs, there is some work for qualitative and quantitative analysis about expectation or mean, such as expected termination time, and expected cost analysis. However, another non-trivial issue is about tail bounds (i.e., upper bounds of tail probabilities), which can provide high-probability guarantees to extreme events. In this work, we focus on the problem of tail-bound cost analysis over nondeterministic probabilistic programs, which aims to automatically obtain the tail bound of resource usages over such programs. To achieve this goal, we present a novel approach, combined with a suitable concentration inequality, to derive the tail bound of accumulated cost until program termination. Our approach can handle both positive and negative costs. Moreover, our approach enables an automated template-based synthesis of supermartingales and leads to an efficient polynomial-time algorithm. To show the effectiveness of our approach, we present experimental results on various programs and make a comparison with state-of-the-art tools.
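
The abstract does not name the concentration inequality it uses. As one well-known example of the kind of result that turns a supermartingale into a tail bound, the Azuma-Hoeffding inequality is stated below; the symbols $X_k$, $c_k$ and $t$ are introduced here for illustration and are not taken from the paper.

```latex
% Azuma-Hoeffding inequality for a supermartingale (X_k)_{k \ge 0}.
% Assumption: bounded differences, |X_k - X_{k-1}| \le c_k almost surely for k = 1, \dots, n.
\[
  \Pr\bigl[\, X_n - X_0 \ge t \,\bigr]
  \;\le\;
  \exp\!\left( -\frac{t^{2}}{2 \sum_{k=1}^{n} c_k^{2}} \right),
  \qquad t > 0 .
\]
```

Read with $X_k$ tracking accumulated cost, a bound of this shape says that exceeding the initial estimate by $t$ becomes exponentially unlikely as $t$ grows.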

CT Image Segmentation Method of Composite Material Based on Improved Watershed Algorithm and U-Net Neural Network Model

XUE Yongbo (薛永波), LIU Zhao (刘钊), LI Zeyang (李泽阳), ZHU Ping* (朱平)

2023, 28 (6): 783-792.
doi: 10.1007/s12204-021-2385-2

In the study of composite material performance, X-ray computed tomography (XCT) scanning has always been one of the important means of detecting internal structures. CT image segmentation technology effectively improves the accuracy of the subsequent material feature extraction process, which is of great significance to the study of material performance. This study focuses on the low accuracy of image segmentation caused by fiber cross-section adhesion in composite CT images. In the core layer area, area validity is evaluated by a morphological indicator, and an iterative segmentation strategy is proposed based on the watershed algorithm. In the transition layer area, a U-net neural network model trained with artificial labels is applied to predict the segmentation result. On this basis, a CT image segmentation method for fiber composite materials based on the improved watershed algorithm and the U-net model is proposed. Experiments verify that the method adapts well to, and is effective for, the CT image segmentation problem of composite materials, and that the segmentation accuracy is significantly improved in comparison with the original method, which ensures the accuracy and robustness of the subsequent fiber feature extraction process.

Stagewise Training for Hybrid-Distorted Image Restoration

HOU Shujuan* (侯舒娟), ZHU Wenping (朱文萍), LI Hai (李海)

2023, 28 (6): 793-801.
doi: 10.1007/s12204-022-2453-2

Image restoration is the problem of restoring a real degraded image. Previous studies mostly focused on a single distortion. However, most real images experience multiple distortions, and single-distortion image restoration algorithms cannot effectively improve the image quality. Moreover, few existing hybrid-distortion image restoration algorithms can deal with a single distortion. Therefore, an end-to-end pipeline network based on stagewise training is proposed in this paper. Specifically, the network selects three typical image restoration tasks: denoising, inpainting, and super resolution. The whole training process is divided into single-distortion training, hybrid-distortion training of two types, and hybrid-distortion training of three types. The design of the loss function draws on the idea of deep supervision. Experimental results show that the proposed method is not only superior to other methods in hybrid-distorted image restoration, but also suitable for single-distortion image restoration.

Color Prediction Model of Gray Hybrid Multifilament Fabric

WANG Yujuan (王玉娟), LI Wengang (李文刚), LIU Jianyong (刘建勇), CHEN Guangxue (陈广学), WANG Jun* (汪军)

2023, 28 (6): 802-808.
doi: 10.1007/s12204-021-2326-0

To facilitate the product design of hybrid multifilament fabric prior to spinning, a color prediction model was proposed. The monofilaments in the multifilament were assumed to have a square cross-section and stacked vertically. The prediction model considered the reflectance, transmittance and arrangement of the monofilaments in the fabric. To test the reflectance and transmittance of the monofilament with the Datacolor spectrophotometer, films with the same material and thickness as the monofilaments were made. Twenty kinds of multifilaments with different blending ratios and fineness were produced and woven into fabrics. The color difference between the fabric color tested by the spectrophotometer and predicted by the new model and classical Kubelka-Munk (K-M) theory was calculated and compared. The result shows that the average color difference obtained by the new model was 1.02 Color Measurement Committee (CMC) (2 : 1) units, which was less than that of 1.78 CMC (2 : 1) units obtained by the K-M theory. Through Spearman correlation analysis, the fabric lightness and the multifilament fineness had a significant influence on calculated color difference, and the color difference decreased with increases of them. Finally, the surface color of a fabric was reproduced, indicating the model can be used to characterize the phenomenon of uneven color mixing on the fabric surface.
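
The classical Kubelka-Munk baseline mentioned in this abstract is usually applied through the ratio K/S = (1 - R)^2 / (2R) for an opaque layer with reflectance R, with K/S treated as additive over the components of a blend. The sketch below shows that textbook form at a single wavelength; it is not the authors' new model, and the function names and the use of blending fractions as weights are assumptions.

```python
import math


def ks_from_reflectance(r):
    """Kubelka-Munk ratio K/S for an opaque layer with reflectance r in (0, 1]."""
    if not 0.0 < r <= 1.0:
        raise ValueError("reflectance must lie in (0, 1]")
    return (1.0 - r) ** 2 / (2.0 * r)


def reflectance_from_ks(ks):
    """Inverse of the K/S relation: the reflectance that yields the given K/S."""
    return 1.0 + ks - math.sqrt(ks * ks + 2.0 * ks)


def mixture_reflectance(reflectances, fractions):
    """Blend reflectance assuming K/S is additive, weighted by blending fraction."""
    ks_mix = sum(c * ks_from_reflectance(r) for r, c in zip(reflectances, fractions))
    return reflectance_from_ks(ks_mix)
```

In practice the calculation is repeated per wavelength across the visible spectrum before converting to color coordinates and computing a color difference.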

Predicting Stock Closing Price with Stock Network Public Opinion Based on AdaBoost-AAFSA-Elman Model and CEEMDAN Algorithm

ZHU Changsheng (朱昶胜), KANG Lianghe* (康亮河), FENG Wenfang (冯文芳)

2023, 28 (6): 809-821.
doi: 10.1007/s12204-021-2337-x

To address the low prediction accuracy of the Elman network in predicting stock closing prices, an adaptive boosting (AdaBoost)-improved artificial fish swarm algorithm (AAFSA)-Elman model based on complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) is proposed. By adding different white noise to the original data, the CEEMDAN algorithm is used to decompose the attribute series selected by the Boruta algorithm and text mining. To optimize the weights and thresholds of Elman, a self-adaptive step length and view scope are used to improve the artificial fish swarm algorithm (AFSA). The AdaBoost algorithm is used to combine 5 weak AAFSA-Elman predictors into a strong predictor by continuous iteration. Experiments show that the mean absolute percentage error (MAPE) of the AdaBoost-AAFSA-Elman model is reduced from 4.9423% to 1.2338%. This study provides an experimental method for predicting stock closing prices based on network public opinion.
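
The MAPE figures quoted in this abstract follow the standard definition, sketched below for reference. The function name `mape` and the sample prices in the usage note are illustrative only.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    actual values must be non-zero, since each error is divided by the actual value.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)
```

For example, `mape([100.0, 200.0], [110.0, 190.0])` averages a 10% miss and a 5% miss.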

Energy-Efficient Bandwidth and Power Allocation in Relay-Assisted Multi-Layer Heterogeneous Networks with Energy Harvesting

GAO Jincheng (高锦程), ZHAO Yisheng* (赵宜升), CHEN Jiafa (陈加法), CHEN Zhonghui (陈忠辉)

2023, 28 (6): 822-830.
doi: 10.1007/s12204-021-2336-y

Aiming at excessive users existing in a pico base station (PBS) in the multi-layer heterogeneous networks, the resource allocation problem of maximizing the energy efficiency of the networks is investigated in this paper. By deploying a relay node with energy harvesting function, the data of some users in the PBS can be transferred to an adjacent idle PBS. The bandwidth and transmitting power of users and the relay node are both considered to formulate the resource allocation optimization problem. The objective is to maximize the energy efficiency of the whole heterogeneous networks under the constraints of the user’s minimum data rate and energy consumption. The suboptimal solution is obtained by using the particle swarm optimization (PSO) algorithm and quantum-behaved particle swarm optimization (QPSO) algorithm. Simulation results show that the adopted methods have higher energy efficiency than the conventional fixed power and bandwidth method. In addition, the time complexity of the adopted methods is relatively low.
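
The objective described here, energy efficiency as a function of bandwidth and transmit power, can be sketched in its generic form: total achievable rate divided by total consumed power. The Shannon-capacity rate model, the tuple layout and the `circuit_power_w` term below are assumptions for illustration; the abstract does not give the paper's exact formulation or constraints.

```python
import math


def data_rate(bandwidth_hz, power_w, channel_gain, noise_psd):
    """Shannon rate in bit/s for one link: B * log2(1 + p*g / (N0*B))."""
    snr = power_w * channel_gain / (noise_psd * bandwidth_hz)
    return bandwidth_hz * math.log2(1.0 + snr)


def energy_efficiency(links, circuit_power_w=0.0):
    """Total rate divided by total power, in bit/J.

    links: list of (bandwidth_hz, power_w, channel_gain, noise_psd) tuples.
    """
    total_rate = sum(data_rate(*link) for link in links)
    total_power = sum(link[1] for link in links) + circuit_power_w
    return total_rate / total_power
```

A swarm-based optimizer such as PSO would evaluate a function of this kind as its fitness while searching over the bandwidth and power values.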

Optimization of Group Multiattribute Decision-Making Model in Commercial Space Investment

ZHANG Yiming (张一鸣), HOU Junjie* (侯俊杰), ZHONG Shaowen (钟少文)

2023, 28 (6): 831-840.
doi: 10.1007/s12204-021-2400-7

A group multiattribute decision-making model was proposed by implementing prospect theory, multiattribute decision-making, group decision-making and entropy methods for the optimization in commercial space investment. First, the decision-making function was decided using prospect theory by the preference of each expert to reach the comprehensive prospect value based on different investment options; second, expert decision weights were reached according to entropy method; third, the expert group decision-making information was congregated according to the group decision-making congregation algorithm to reach the most optimized investment option; finally, an example was given to demonstrate the feasibility and effectiveness of the method. This model comprehensively takes the advantages of many methods to congregate experts’ experiences and avoid the subjective influences, thus providing a scientific decision-making approach for the commercial space investment.
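
Two of the building blocks named in this abstract have standard textbook forms, sketched below: the prospect-theory value function and entropy-based weighting. The parameter defaults (0.88, 0.88, 2.25) are the commonly cited Tversky-Kahneman estimates, not values from the paper, and treating each matrix column as one expert is an assumption about how the weights are applied.

```python
import math


def prospect_value(x, alpha=0.88, beta=0.88, loss_aversion=2.25):
    """Value of a gain or loss x relative to the reference point (x = 0)."""
    if x >= 0:
        return x ** alpha
    return -loss_aversion * (-x) ** beta


def entropy_weights(matrix):
    """Entropy weights for the columns of a matrix of positive scores.

    matrix: rows are investment options, columns are experts (assumed layout).
    Columns whose scores vary more across options receive larger weights.
    """
    n_rows = len(matrix)
    divergences = []
    for column in zip(*matrix):
        total = sum(column)
        shares = [value / total for value in column]
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n_rows)
        divergences.append(1.0 - entropy)
    norm = sum(divergences)
    if norm == 0:  # every column is uniform, so fall back to equal weights
        return [1.0 / len(divergences)] * len(divergences)
    return [d / norm for d in divergences]
```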

Medicine-Engineering Interdisciplinary Research Based on Bibliometric Analysis: A Case Study on Medicine-Engineering Institutional Cooperation of Shanghai Jiao Tong University

WANG Qingwen (王庆稳), CUI Tingting (崔婷婷), DENG Peiwen* (邓珮雯)

2023, 28 (6): 841-856.
doi: 10.1007/s12204-022-2418-5

This article aims to provide a reference for medicine-engineering interdisciplinary research. Targeting the scientific literature and patent literature published by Shanghai Jiao Tong University, this article sets up a co-occurrence matrix of medicine-engineering institutional information extracted from the address fields of the papers, so as to construct medicine-engineering intersection datasets. The dataset of scientific literature was analyzed from multiple dimensions using bibliometric and visualization methods, and the most active factors, such as trends of output and journal and subject distribution, were identified from the indicators of category normalized citation impact (CNCI), times cited, keywords, citation topics and the degree of medicine-engineering interdisciplinarity. Research hotspots and trends are discussed in detail. Analyses of the dataset of patent literature showed the research themes and measured the degree of medicine-engineering technology convergence.
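
The co-occurrence matrix described here can be sketched as a count of how often two institutions appear together in the address field of the same paper. The sketch assumes institution names have already been extracted and normalized; the function name `cooccurrence` and the sparse dictionary-style representation are illustrative choices.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(papers):
    """Count institution pairs that appear together on the same paper.

    papers: iterable of institution-name collections, one per paper.
    Returns a Counter keyed by alphabetically ordered (inst_a, inst_b) pairs.
    """
    counts = Counter()
    for institutions in papers:
        # set() removes repeated mentions of one institution within a paper
        for pair in combinations(sorted(set(institutions)), 2):
            counts[pair] += 1
    return counts
```

Filtering the pairs to those with one medical and one engineering institution would then yield the intersection dataset.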
