A Convolutional Neural Network-Based Method for 3D Object Detection

LI Yangyang (李洋洋), SHI Licheng (史历程), WAN Weibing (万卫兵), ZHAO Qunfei (赵群飞)

Department of Automation, Shanghai Jiao Tong University, Shanghai 200240, China

Journal of Shanghai Jiao Tong University (Chinese edition), 2018, Vol. 52, Issue 1: 7-12

DOI: https://doi.org/10.16183/j.cnki.jsjtu.2018.01.002

Online published: 2018-01-01

Cite this article

LI Yangyang, SHI Licheng, WAN Weibing, ZHAO Qunfei. A Convolutional Neural Network-Based Method for 3D Object Detection[J]. Journal of Shanghai Jiao Tong University, 2018, 52(1): 7-12. DOI: 10.16183/j.cnki.jsjtu.2018.01.002

Abstract

A new method for 3D object detection is proposed. In the localization step, random sample consensus (RANSAC) and Euclidean clustering are used to segment the point clouds of 3D objects; this replaces sliding-window search and reduces the amount of computation. In the recognition step, each object point cloud is first converted to a depth image. The k-Means clustering algorithm then learns convolution kernels from randomly sampled patches, and the learned kernels serve as the filters of a convolutional neural network (CNN) that is convolved over the input image to extract convolutional features, which improves the recognition rate. The proposed feature extraction algorithm was tested on two public 3D object datasets. The results show that features learned by a single-layer CNN achieve a higher recognition rate than traditional hand-designed point cloud features.

Key words: service robot; 3D object; detection; k-Means clustering algorithm; convolutional neural network (CNN)
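
The localization step described in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation; it is a pure-Python approximation that assumes the dominant plane (for example a table top) is removed with RANSAC and the remaining points are grouped by brute-force Euclidean clustering. The function names, the thresholds (`plane_threshold`, `cluster_tolerance`, `min_size`) and the synthetic scene are all hypothetical choices made for illustration.

```python
import math
import random
from collections import deque

def fit_plane(p, q, r):
    """Plane through three points as (unit normal, offset), or None if collinear."""
    u = [q[i] - p[i] for i in range(3)]
    v = [r[i] - p[i] for i in range(3)]
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    norm = math.sqrt(sum(c * c for c in n))
    if norm < 1e-12:
        return None
    n = [c / norm for c in n]
    return n, -sum(n[i] * p[i] for i in range(3))

def ransac_plane(points, threshold, iterations=200, seed=0):
    """Indices of the points within `threshold` of the best plane found."""
    rng = random.Random(seed)
    best = []
    for _ in range(iterations):
        plane = fit_plane(*rng.sample(points, 3))
        if plane is None:
            continue
        n, d = plane
        inliers = [i for i, pt in enumerate(points)
                   if abs(n[0] * pt[0] + n[1] * pt[1] + n[2] * pt[2] + d) < threshold]
        if len(inliers) > len(best):
            best = inliers
    return best

def euclidean_clusters(points, tolerance, min_size):
    """Grow clusters by repeatedly adding unvisited points within `tolerance`."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        start = unvisited.pop()
        queue, cluster = deque([start]), [start]
        while queue:
            i = queue.popleft()
            near = [j for j in unvisited
                    if math.dist(points[i], points[j]) <= tolerance]
            unvisited.difference_update(near)
            queue.extend(near)
            cluster.extend(near)
        if len(cluster) >= min_size:
            clusters.append(cluster)
    return clusters

def segment_objects(points, plane_threshold=0.01, cluster_tolerance=0.04, min_size=10):
    """Remove the dominant plane, then return clusters as indices into `points`."""
    plane = set(ransac_plane(points, plane_threshold))
    rest = [i for i in range(len(points)) if i not in plane]
    clusters = euclidean_clusters([points[i] for i in rest], cluster_tolerance, min_size)
    return [[rest[i] for i in c] for c in clusters]

def box(x0, y0, z0, n=5, step=0.025):
    """Synthetic object: an n x n x n lattice of points."""
    return [(x0 + i * step, y0 + j * step, z0 + k * step)
            for i in range(n) for j in range(n) for k in range(n)]

# Synthetic scene (assumed): a flat table at z = 0 with two box-shaped objects on it.
table = [(i * 0.05, j * 0.05, 0.0) for i in range(21) for j in range(21)]
cloud = table + box(0.2, 0.2, 0.05) + box(0.7, 0.7, 0.05)
objects = segment_objects(cloud)
print([len(o) for o in objects])  # prints [125, 125]
```

Each returned cluster is a candidate object that would be passed on to the recognition step. Because only these clusters are classified, no exhaustive window search over the scene is needed, which is the saving in computation the abstract refers to. A practical system would replace the brute-force neighbor search with a k-d tree.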

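The recognition step summarized in the abstract (point cloud to depth image, k-Means filter learning on random patches, convolutional feature extraction) can be sketched as follows. This is a simplified pure-Python illustration under several assumptions that the abstract does not confirm: an orthographic projection with non-negative z values, plain Lloyd's k-Means without patch normalization or whitening, a ReLU nonlinearity, and global average pooling. All function names, the patch size, and the number of filters are hypothetical.

```python
import random

def depth_image(points, size=12):
    """Project points onto an x-y grid; each pixel keeps the largest z (0.0 if empty).
    Assumes z >= 0 for all points."""
    xs, ys = [p[0] for p in points], [p[1] for p in points]
    x0, y0 = min(xs), min(ys)
    span = max(max(xs) - x0, max(ys) - y0) or 1.0
    img = [[0.0] * size for _ in range(size)]
    for x, y, z in points:
        c = min(int((x - x0) / span * size), size - 1)
        r = min(int((y - y0) / span * size), size - 1)
        img[r][c] = max(img[r][c], z)
    return img

def random_patches(img, k, count, rng):
    """Sample `count` flattened k x k patches at random positions."""
    h, w = len(img), len(img[0])
    patches = []
    for _ in range(count):
        r, c = rng.randrange(h - k + 1), rng.randrange(w - k + 1)
        patches.append([img[r + i][c + j] for i in range(k) for j in range(k)])
    return patches

def kmeans(data, n_clusters, iterations, rng):
    """Plain Lloyd's algorithm; the returned centroids act as convolution kernels."""
    centroids = [list(v) for v in rng.sample(data, n_clusters)]
    for _ in range(iterations):
        sums = [[0.0] * len(data[0]) for _ in range(n_clusters)]
        counts = [0] * n_clusters
        for v in data:
            j = min(range(n_clusters),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])))
            counts[j] += 1
            sums[j] = [s + a for s, a in zip(sums[j], v)]
        for j in range(n_clusters):
            if counts[j]:  # an empty cluster keeps its previous centroid
                centroids[j] = [s / counts[j] for s in sums[j]]
    return centroids

def convolve_valid(img, kernel, k):
    """'Valid' 2D correlation with a flattened k x k kernel, followed by ReLU."""
    h, w = len(img), len(img[0])
    return [[max(0.0, sum(img[r + i][c + j] * kernel[i * k + j]
                          for i in range(k) for j in range(k)))
             for c in range(w - k + 1)]
            for r in range(h - k + 1)]

def extract_features(img, filters, k):
    """One feature per filter: the mean of its response map (global average pooling)."""
    features = []
    for f in filters:
        fmap = convolve_valid(img, f, k)
        features.append(sum(map(sum, fmap)) / (len(fmap) * len(fmap[0])))
    return features

rng = random.Random(0)
# Synthetic dome-shaped point cloud standing in for one segmented object (assumed data).
points = [(i / 23, j / 23,
           max(0.0, 1 - 4 * ((i / 23 - 0.5) ** 2 + (j / 23 - 0.5) ** 2)))
          for i in range(24) for j in range(24)]
img = depth_image(points, size=12)
patches = random_patches(img, k=3, count=200, rng=rng)
filters = kmeans(patches, n_clusters=4, iterations=10, rng=rng)
features = extract_features(img, filters, k=3)
print(len(features))  # one value per learned filter
```

The resulting feature vector would be fed to a classifier to recognize the object. The filters come from unsupervised clustering rather than backpropagation, so the single convolutional layer needs no labeled data to learn its kernels; labels are only required for the final classifier.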
