Research on multi-object detection in driving scene of underground unmanned electric locomotive
Abstract: Unmanned rail electric locomotives in underground coal mines currently detect stones and other small obstacles on the track slowly and with low precision, and overlapping objects are prone to missed and false detections. To address these problems, a multi-object detection model for underground electric locomotives, SE-HDC-Mask R-CNN, is proposed. The model improves on Mask R-CNN in two ways. First, a squeeze-and-excitation (SE) module is embedded in the residual blocks of the ResNet backbone feature extraction network to learn the importance of each channel and the relationships among channels, which strengthens the network's ability to select and capture features. Second, the standard 3×3 convolution in the residual blocks is replaced with hybrid dilated convolution (HDC), which enlarges the receptive field by increasing the spacing between the values the kernel samples, without changing the feature map size or increasing the number of parameters to be computed. The experimental results show that the SE-HDC-Mask R-CNN model effectively extracts track, electric locomotive, signal light, pedestrian, and stone objects. On the multi-scene operation dataset of underground electric locomotives, its mean average precision (mAP) is 95.4%, its mean mask segmentation precision is 88.1%, and its mean bounding-box intersection over union (IoU) is 91.7%; all three indicators are 0.5 percentage points higher than those of the Mask R-CNN model. The detection precision for signal lights and stones (small objects) improves by 0.7 and 4.1 percentage points, respectively. The overall performance of the SE-HDC-Mask R-CNN model is better than that of YOLOV2, YOLOV3-Tiny, SSD, and Faster R-CNN, and the model effectively addresses the missed detection of small objects. It also detects objects effectively in straight-track and curved-track coal roadways, dark environments, multi-object overlap, and other scenarios. It shows a degree of generalization capability and high robustness, and basically meets the obstacle detection requirements of unmanned electric locomotives.
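The two backbone changes described in the abstract can be illustrated with a small pure-Python sketch. It is not the authors' implementation. The function names, the toy feature map, the SE weights, and the dilation rates (1, 2, 5) are all assumptions made for illustration; the abstract does not state which rates or reduction ratio were used.

```python
import math


def se_reweight(feature_map, w1, w2):
    """Squeeze-and-excitation over a C x H x W feature map given as nested lists.

    w1 (C/r x C) and w2 (C x C/r) stand in for the two learned fully connected
    layers; the values passed below are illustrative, not trained weights.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_map]
    # Excitation: bottleneck FC + ReLU, then FC + sigmoid -> weights in (0, 1).
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    scale = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale: every value in channel c is multiplied by that channel's weight.
    out = [[[v * s for v in row] for row in ch] for ch, s in zip(feature_map, scale)]
    return out, scale


def effective_kernel(kernel, dilation):
    """Span covered by a dilated kernel: k + (k - 1)(d - 1)."""
    return kernel + (kernel - 1) * (dilation - 1)


def stacked_receptive_field(kernel, dilations):
    """Receptive field of stride-1 convolutions applied one after another."""
    return 1 + sum((kernel - 1) * d for d in dilations)


def output_size(size, kernel, dilation, padding):
    """Stride-1 output length: size + 2p - d(k - 1)."""
    return size + 2 * padding - dilation * (kernel - 1)


# Two channels of a 2 x 2 feature map, reduced to one hidden unit (assumed values).
x = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 8.0]]]
y, scale = se_reweight(x, w1=[[0.5, -0.25]], w2=[[1.0], [-1.0]])

# Assumed dilation rates; the source does not state which rates were used.
hdc_rates = [1, 2, 5]
rf_standard = stacked_receptive_field(3, [1, 1, 1])
rf_hdc = stacked_receptive_field(3, hdc_rates)
sizes = [output_size(64, 3, d, padding=d) for d in hdc_rates]
weights_per_kernel = 3 * 3  # dilation spreads the taps apart but adds no weights
```

Under these assumed rates, three stacked 3×3 layers cover a 17×17 window instead of 7×7, each kernel still has nine weights per input-output channel pair, and padding equal to the dilation rate keeps a side of 64 at 64. The SE part only rescales each channel by a single factor between 0 and 1, so the feature map shape is unchanged.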
Table 1. Experimental hardware parameters for multi-object detection of the underground unmanned electric locomotive

| Hardware | Parameter |
| --- | --- |
| Operating system | Ubuntu 18.04 |
| CPU | Intel Core i7-8700 @ 3.02 GHz, six cores |
| GPU | Nvidia GeForce GTX 1080 (8 GB) |
| Memory | 16 GB (威士奇 DDR3, 1600 MHz) |
Table 2. Qualitative analysis under ResNet50/101 network

| Backbone feature extraction network | mAP/% | mIoU_mask/% | mIoU_box/% | Frame rate/(frame·s⁻¹) |
| --- | --- | --- | --- | --- |
| ResNet50 | 94.92 | 87.62 | 91.20 | 4.92 |
| ResNet101 | 96.10 | 88.30 | 91.40 | 3.83 |
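The trade-off between the two backbones in Table 2 is easier to read as time per frame. The short sketch below only restates the table's numbers; the variable and function names are illustrative.

```python
# Values taken from Table 2.
backbones = {
    "ResNet50": {"mAP": 94.92, "fps": 4.92},
    "ResNet101": {"mAP": 96.10, "fps": 3.83},
}


def latency_ms(fps):
    """Average time spent on one frame, in milliseconds."""
    return 1000.0 / fps


latency = {name: round(latency_ms(v["fps"]), 1) for name, v in backbones.items()}
map_gain = round(backbones["ResNet101"]["mAP"] - backbones["ResNet50"]["mAP"], 2)
fps_drop_pct = round(
    100 * (1 - backbones["ResNet101"]["fps"] / backbones["ResNet50"]["fps"]), 1
)
```

ResNet101 gains about 1.2 points of mAP but takes roughly 261 ms per frame against roughly 203 ms for ResNet50, a frame-rate drop of about 22%.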
Table 3. Comparison results between SE-HDC-Mask R-CNN model and Mask R-CNN50 model
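
| Object | AP/% (Mask R-CNN50) | AP/% (SE-HDC-Mask R-CNN50) | IoU_mask/% (Mask R-CNN50) | IoU_mask/% (SE-HDC-Mask R-CNN50) | IoU_box/% (Mask R-CNN50) | IoU_box/% (SE-HDC-Mask R-CNN50) |
| --- | --- | --- | --- | --- | --- | --- |
| Track | 96.3 | 95.5 | 87.9 | 87.8 | 91.8 | 91.5 |
| Electric locomotive | 99.1 | 99.1 | 92.6 | 92.7 | 93.8 | 93.8 |
| Signal light | 88.9 | 89.6 | 85.5 | 85.1 | 90.3 | 90.6 |
| Pedestrian | 99.2 | 97.7 | 88.6 | 88.2 | 92.6 | 92.6 |
| Stone | 91.1 | 95.2 | 83.5 | 86.5 | 87.5 | 89.9 |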
Table 4. Evaluation results of different network models (%)

| Model | mAP | mIoU_mask | mIoU_box |
| --- | --- | --- | --- |
| YOLOV2 | 83.6 | − | 76.0 |
| YOLOV3-Tiny | 92.9 | − | 81.5 |
| SSD | 85.6 | − | 81.1 |
| Faster R-CNN | 85.4 | − | 88.5 |
| Mask R-CNN50 | 94.9 | 87.6 | 91.2 |
| SE-HDC-Mask R-CNN50 | 95.4 | 88.1 | 91.7 |
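The aggregate figures for the two Mask R-CNN variants in Table 4 can be reproduced from the per-class values in Table 3 by averaging over the five object classes. The sketch below does only that arithmetic; the dictionary layout and helper name are illustrative.

```python
# Per-class results from Table 3 (%), ordered: track, electric locomotive,
# signal light, pedestrian, stone.
table3 = {
    "Mask R-CNN50": {
        "AP": [96.3, 99.1, 88.9, 99.2, 91.1],
        "IoU_mask": [87.9, 92.6, 85.5, 88.6, 83.5],
        "IoU_box": [91.8, 93.8, 90.3, 92.6, 87.5],
    },
    "SE-HDC-Mask R-CNN50": {
        "AP": [95.5, 99.1, 89.6, 97.7, 95.2],
        "IoU_mask": [87.8, 92.7, 85.1, 88.2, 86.5],
        "IoU_box": [91.5, 93.8, 90.6, 92.6, 89.9],
    },
}


def class_mean(values):
    """Unweighted mean over the object classes, rounded to one decimal."""
    return round(sum(values) / len(values), 1)


means = {
    model: {metric: class_mean(vals) for metric, vals in metrics.items()}
    for model, metrics in table3.items()
}
gains = {
    metric: round(means["SE-HDC-Mask R-CNN50"][metric] - means["Mask R-CNN50"][metric], 1)
    for metric in means["Mask R-CNN50"]
}
```

The means come out to 94.9, 87.6, and 91.2 for Mask R-CNN50 and 95.4, 88.1, and 91.7 for SE-HDC-Mask R-CNN50, a gain of 0.5 points on each metric. The per-class rows show that most of the AP gain comes from the stone class (91.1 to 95.2), while track and pedestrian AP are slightly lower with the modified backbone.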