Unsupervised Oral Endoscope Image Stitching Algorithm

doi:10.1007/s12204-022-2513-7

Abstract: Oral endoscope image stitching obtains wide-field oral images through registration and stitching, which is of great significance for auxiliary diagnosis. Compared with natural images, oral images have weaker textures and fewer features. Traditional feature-based image stitching methods rely heavily on the quality of feature extraction and often perform unsatisfactorily on images with few features. Moreover, because the endoscope is hand-held, the captured images exhibit large depth and perspective disparities, which pose a further challenge to stitching. To overcome these problems, we propose an unsupervised oral endoscope image stitching algorithm based on overlapping-region extraction and a deep-feature loss. In the registration stage, we extract the overlapping region of the input images by sketching the polygon intersection in order to screen feature points, and we estimate the homography from coarse to fine on a three-layer feature pyramid. We compute the loss on deep features instead of pixel values to emphasize the importance of depth disparities in homography estimation. Finally, we reconstruct the stitched image from feature to pixel, which can eliminate artifacts caused by large parallax. We compare our method with both feature-based and previous deep-learning-based methods on the UDIS-D dataset and on our own oral endoscopy image dataset. The experimental results show that our algorithm achieves higher homography estimation accuracy and better visual quality, and can be effectively applied to oral endoscope image stitching.
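The overlap extraction and feature-point screening mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the polygon intersection is computed, so Sutherland-Hodgman clipping and a half-plane point test are used here as stand-ins. The sketch also assumes that both images have the same size, that a coarse homography `H` mapping image B into the frame of image A is already available, and that the warped outline of B stays convex. All function names are hypothetical.

```python
import numpy as np


def warp_corners(width, height, H):
    """Project the four corners of a width x height image through homography H."""
    corners = np.array([[0, 0], [width, 0], [width, height], [0, height]], dtype=float)
    homog = np.hstack([corners, np.ones((4, 1))]) @ H.T
    return homog[:, :2] / homog[:, 2:3]


def signed_area(poly):
    """Shoelace formula; the sign encodes the vertex orientation."""
    x, y = poly[:, 0], poly[:, 1]
    return 0.5 * np.sum(x * np.roll(y, -1) - np.roll(x, -1) * y)


def clip_convex(subject, clip):
    """Sutherland-Hodgman clipping of `subject` against a convex `clip` polygon."""
    if signed_area(clip) < 0:
        clip = clip[::-1]  # orient so that the interior has a non-negative cross product
    output = [tuple(p) for p in subject]
    for i in range(len(clip)):
        a, b = clip[i], clip[(i + 1) % len(clip)]
        edge = b - a

        def side(p):
            # >= 0 when p lies on the interior side of the edge a -> b
            return edge[0] * (p[1] - a[1]) - edge[1] * (p[0] - a[0])

        inputs, output = output, []
        for j in range(len(inputs)):
            p = np.array(inputs[j])
            q = np.array(inputs[(j + 1) % len(inputs)])
            sp, sq = side(p), side(q)
            if sp >= 0:
                output.append(tuple(p))
            if (sp > 0 > sq) or (sp < 0 < sq):
                t = sp / (sp - sq)  # where segment p -> q crosses the edge line
                output.append(tuple(p + t * (q - p)))
        if not output:
            break
    return np.array(output, dtype=float).reshape(-1, 2)


def overlap_polygon(width, height, H):
    """Overlap of image A's rectangle with image B warped into A's frame by H."""
    rect_a = np.array([[0, 0], [width, 0], [width, height], [0, height]], dtype=float)
    quad_b = warp_corners(width, height, H)
    return clip_convex(quad_b, rect_a)


def points_in_convex(points, poly):
    """Boolean mask of the points that lie inside (or on) a convex polygon."""
    if len(poly) < 3:
        return np.zeros(len(points), dtype=bool)
    if signed_area(poly) < 0:
        poly = poly[::-1]
    edge = np.roll(poly, -1, axis=0) - poly
    rel = points[:, None, :] - poly[None, :, :]
    cross = edge[None, :, 0] * rel[:, :, 1] - edge[None, :, 1] * rel[:, :, 0]
    return np.all(cross >= 0, axis=1)


if __name__ == "__main__":
    # Assumed coarse homography: B is shifted 40 pixels to the right of A.
    H = np.array([[1.0, 0.0, 40.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
    overlap = overlap_polygon(100, 100, H)
    keypoints = np.array([[50.0, 50.0], [10.0, 50.0], [120.0, 50.0]])
    print(overlap)
    print(points_in_convex(keypoints, overlap))
```

Feature points detected in image A would be passed through `points_in_convex` so that only those inside the overlap polygon are kept for matching. A non-convex overlap would call for a general point-in-polygon test such as ray casting.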

Key words: oral endoscope image, overlapping region, homography estimation, image stitching

CLC Number: TP391, R78

HUANG Rong (黄荣), CHANG Qing* (常青), ZHANG Yang (张扬). Unsupervised Oral Endoscope Image Stitching Algorithm [J]. J Shanghai Jiaotong Univ Sci, 2024, 29(1): 81-90.
