We used random sample consensus (RANSAC) and distance clustering to segment objects instead of sliding windows. In the recognition step, we designed a new algorithm to extract point-cloud features. First, the point clouds of objects were converted to depth maps; then k-means was applied to learn features from random patches. The learned features can be used as convolutional neural network (CNN) filters and convolved over the input image to extract convolutional features. The presented method was tested on two public datasets. The results show that features learned by a single-layer CNN achieve a higher recognition rate than hand-crafted features.
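The abstract describes a single-layer, k-means-based feature-learning pipeline in the style of Coates and Ng [15-16]: random patches are sampled from depth maps, clustered, and the centroids are used as convolutional filters. A minimal sketch of that idea follows; the patch size, number of filters, normalization, and the use of scikit-learn and SciPy are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np
from sklearn.cluster import KMeans
from scipy.signal import correlate2d

def learn_filters(depth_maps, patch_size=6, n_filters=16, n_patches=2000, seed=0):
    """Sample random patches from depth maps and cluster them with k-means.

    Returns the cluster centroids reshaped into 2D convolutional filters.
    (Hyperparameter values here are assumptions for illustration.)
    """
    rng = np.random.default_rng(seed)
    patches = []
    for _ in range(n_patches):
        img = depth_maps[rng.integers(len(depth_maps))]
        r = rng.integers(img.shape[0] - patch_size + 1)
        c = rng.integers(img.shape[1] - patch_size + 1)
        p = img[r:r + patch_size, c:c + patch_size].ravel()
        # Per-patch normalization, as is standard in this family of methods.
        patches.append((p - p.mean()) / (p.std() + 1e-8))
    km = KMeans(n_clusters=n_filters, n_init=10, random_state=seed)
    km.fit(np.array(patches))
    return km.cluster_centers_.reshape(n_filters, patch_size, patch_size)

def convolve_features(depth_map, filters):
    """Slide each learned filter over the depth map to produce feature maps."""
    return np.stack([correlate2d(depth_map, f, mode="valid") for f in filters])
```

The resulting feature maps would then feed a classifier (the paper compares this against hand-crafted 3D descriptors such as FPFH and VFH).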
LI Yangyang, SHI Licheng, WAN Weibing, ZHAO Qunfei. A Convolutional Neural Network-Based Method for 3D Object Detection[J]. Journal of Shanghai Jiaotong University, 2018, 52(1): 7-12.
DOI: 10.16183/j.cnki.jsjtu.2018.01.002
[1]SCHAAL S. The new robotics-towards human-centered machines[J]. HFSP Journal Frontiers of Interdisciplinary Research in the Life Sciences, 2007, 1(2): 115-126.
[2]HUAI J, ZHANG Y, YILMAZ A. Real-time large scale 3D reconstruction by fusing Kinect and IMU data[J]. ISPRS Annals of Photogrammetry, Remote Sensing & Spatial Information Sciences, 2015, II-3/W5: 491-496.
[3]WHELAN T, KAESS M, JOHANNSSON H, et al. Real-time large-scale dense RGB-D SLAM with volumetric fusion[J]. The International Journal of Robotics Research, 2015, 34(4/5): 598-626.
[4]LAI K, BO L, REN X, et al. Detection-based object labeling in 3D scenes[C]∥International Conference on Robotics and Automation (ICRA). Minnesota, USA: IEEE, 2012: 1330-1337.
[5]LAI K, BO L, FOX D. Unsupervised feature learning for 3D scene labeling[C]∥International Conference on Robotics and Automation (ICRA). Hong Kong, China: IEEE, 2014: 3050-3057.
[6]SONG S, XIAO J. Sliding shapes for 3D object detection in depth images[C]∥Proceedings of the 13th European Conference on Computer Vision. Zurich, Switzerland: Springer International Publishing, 2014: 634-651.
[7]ERHAN D, SZEGEDY C, TOSHEV A, et al. Scalable object detection using deep neural networks[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2014: 2147-2154.
[8]RUSU R B, BLODOW N, BEETZ M. Fast point feature histograms (FPFH) for 3D registration[C]∥International Conference on Robotics and Automation. Kobe, Japan: IEEE, 2009: 3212-3217.
[9]RUSU R B, BRADSKI G, THIBAUX R, et al. Fast 3D recognition and pose using the viewpoint feature histogram[C]∥International Conference on Intelligent Robots and Systems (IROS). Taipei, China: IEEE, 2010: 2155-2162.
[10]ALDOMA A, VINCZE M, BLODOW N, et al. CAD-model recognition and 6DOF pose estimation using 3D cues[C]∥International Conference on Computer Vision Workshops. Barcelona, Spain: IEEE, 2011: 585-592.
[11]BO L F, REN X F, FOX D. Depth kernel descriptors for object recognition[C]∥International Conference on Intelligent Robots and Systems. San Francisco, USA: IEEE, 2011: 821-826.
[12]ZHAN Ning. Three-dimensional object recognition method based on multiple features and support vector machine[J]. Computer Simulation, 2013, 30(3): 380-383.
[13]LAI K, BO L F, REN X F, et al. A large-scale hierarchical multi-view RGB-D object dataset[C]∥International Conference on Robotics and Automation. Shanghai, China: IEEE, 2011: 1817-1824.
[14]WOHLKINGER W, ALDOMA A, RUSU R B, et al. 3DNet: Large-scale object class recognition from CAD models[C]∥International Conference on Robotics and Automation. Saint Paul, USA: IEEE, 2012: 5384-5391.
[15]COATES A, NG A Y. Learning feature representations with K-means[J]. Lecture Notes in Computer Science, 2012, 7700: 561-580.
[16]COATES A, NG A, LEE H. An analysis of single-layer networks in unsupervised feature learning[C]∥Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics. 2011: 215-223.
[17]KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[J]. Communications of the ACM, 2017, 60(6): 84-90.