A small object detection method for underground coal mine scenes based on YOLOv7-SE
-
Abstract: Although current small object detection methods have improved detection performance on small objects, they mostly target conventional scenes. The underground environment of a coal mine is harsh, and extracting the feature information of small objects there is difficult. To address this problem, a small object detection method for underground coal mine scenes based on YOLOv7-SE is proposed. First, the simulated annealing (SA) algorithm is fused with the k-means++ clustering algorithm to optimize the estimation of the initial anchor box values in the YOLOv7 model, so that small underground objects are captured accurately. Second, a new detection layer is added to the YOLOv7 backbone network to obtain high-resolution feature maps of small underground objects, reducing the interference of heavy coal dust with their feature representation. Finally, a dual-layer attention mechanism is introduced after the aggregation network modules of the backbone to strengthen the feature representation of small underground objects.
The experimental results show the following. ① The loss of the trained YOLOv7-SE network model stabilizes around 0.05, indicating that its parameters are set reasonably. ② The average precision (AP) of safety-helmet detection with YOLOv7-SE is 13.86%, 25.3%, 16.13%, 12.71%, 15.53%, 11.59% and 12.20% higher than that of Faster R-CNN, RetinaNet, CenterNet, FCOS, SSD, YOLOv5 and YOLOv7, respectively. The AP of self-rescuer detection with YOLOv7-SE is 12.37%, 20.16%, 15.22%, 8.35%, 19.42%, 9.64% and 7.38% higher than that of the same baselines, respectively. The frame rate (FPS) of YOLOv7-SE is 42.56, 44.43, 31.74, 39.84, 22.74 and 23.34 frame/s higher than that of Faster R-CNN, RetinaNet, CenterNet, FCOS, SSD and YOLOv5, respectively, and 9.36 frame/s lower than that of YOLOv7. Thus YOLOv7-SE strengthens feature extraction for small underground objects while maintaining detection speed. ③ In detecting safety helmets and self-rescuers, YOLOv7-SE effectively reduces missed and false detections and improves detection precision.
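The anchor-estimation step described above can be illustrated with a minimal sketch. The following is an assumption of one plausible form, not the paper's exact procedure: k-means++-style seeding of anchor widths/heights under an IoU-based distance, followed by a simulated-annealing refinement loop. The names `iou_wh`, `kmeanspp_init`, `fitness` and `sa_refine` are illustrative.

```python
import numpy as np

def iou_wh(boxes, anchors):
    # pairwise IoU between box sizes (N, 2) and anchor sizes (K, 2),
    # with all boxes anchored at the origin (only width/height matter)
    inter = np.minimum(boxes[:, None, :], anchors[None, :, :]).prod(axis=2)
    union = boxes.prod(axis=1)[:, None] + anchors.prod(axis=1)[None, :] - inter
    return inter / union

def kmeanspp_init(boxes, k, rng):
    # k-means++-style seeding: the next anchor is drawn with probability
    # proportional to (1 - best IoU), so poorly covered boxes attract anchors
    anchors = [boxes[rng.integers(len(boxes))]]
    for _ in range(k - 1):
        d = 1.0 - iou_wh(boxes, np.asarray(anchors)).max(axis=1)
        anchors.append(boxes[rng.choice(len(boxes), p=d / d.sum())])
    return np.asarray(anchors, dtype=float)

def fitness(boxes, anchors):
    # mean best IoU between each ground-truth box and its closest anchor
    return iou_wh(boxes, anchors).max(axis=1).mean()

def sa_refine(boxes, anchors, rng, iters=300, t0=0.05):
    # simulated annealing: multiplicative random perturbations of the anchors;
    # worse moves are accepted with probability exp(delta / T), T cooled linearly
    cur, f_cur = anchors.copy(), fitness(boxes, anchors)
    best, f_best = cur.copy(), f_cur
    for i in range(iters):
        t = max(t0 * (1.0 - i / iters), 1e-9)
        cand = cur * rng.normal(1.0, 0.05, size=cur.shape)
        f_cand = fitness(boxes, cand)
        if f_cand > f_cur or rng.random() < np.exp((f_cand - f_cur) / t):
            cur, f_cur = cand, f_cand
            if f_cur > f_best:
                best, f_best = cur.copy(), f_cur
    return best, f_best

# illustrative run on synthetic width/height pairs
rng = np.random.default_rng(0)
boxes = rng.uniform(8.0, 160.0, size=(300, 2))
init = kmeanspp_init(boxes, 9, rng)
anchors, f = sa_refine(boxes, init, rng)
```

The SA step lets the search escape the local optimum that plain clustering settles into; since the best anchor set seen so far is tracked separately, the refined fitness can never fall below the k-means++ seed.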
-
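The dual-layer attention mentioned in the abstract is not specified here in detail. As a loose NumPy illustration of the general idea of sequential channel and spatial gating of a feature map (an assumption, considerably simpler than the paper's actual module):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x):
    # squeeze: global average pool over spatial dims, one gate per channel
    w = sigmoid(x.mean(axis=(1, 2)))             # shape (C,)
    return x * w[:, None, None]

def spatial_attention(x):
    # pool across channels (mean + max), then gate every spatial position
    m = sigmoid(x.mean(axis=0) + x.max(axis=0))  # shape (H, W)
    return x * m[None, :, :]

def dual_attention(x):
    # channel gating followed by spatial gating on a (C, H, W) feature map
    return spatial_attention(channel_attention(x))
```

Both gates lie strictly in (0, 1), so the module re-weights the feature map without changing its shape, emphasizing channels and positions that carry small-object evidence.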
Table 1. Experimental environment configuration

Environment              Configuration
GPU                      RTX 3090 (24 GiB)
CPU                      12 vCPU Xeon(R) Platinum 8255C
Operating system         Ubuntu 20.04
GPU environment          CUDA 11.3, cuDNN 8.2.1
Deep learning framework  PyTorch 1.11
Interpreter              Python 3.8
Table 2. Comparison results of each model
Model          AP/% (helmet)   AP/% (self-rescuer)   mAP/%   FPS/(frame·s⁻¹)
Faster R-CNN   58.64           52.11                 55.38   19.28
RetinaNet      47.20           44.32                 45.76   17.41
CenterNet      56.37           49.26                 52.82   30.10
FCOS           59.79           56.13                 57.96   22.00
SSD            56.97           45.06                 51.02   39.10
YOLOv5         60.91           54.84                 57.88   38.50
YOLOv7         60.30           57.10                 58.70   71.20
YOLOv7-SE      72.50           64.48                 68.49   61.84
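As a quick consistency check on Table 2, the mAP column is the arithmetic mean of the two per-class APs (helmet and self-rescuer):

```python
# per-class APs (helmet, self-rescuer) copied from Table 2; mAP is their mean
ap = {
    "Faster R-CNN": (58.64, 52.11),
    "YOLOv7": (60.30, 57.10),
    "YOLOv7-SE": (72.50, 64.48),
}
for model, (ap_helmet, ap_rescuer) in ap.items():
    print(f"{model}: mAP = {(ap_helmet + ap_rescuer) / 2:.2f}")
```

Any difference in the last decimal place reflects rounding in the published table.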
Table 3. Results of ablation experiment
Model                         AP/% (helmet)   AP/% (self-rescuer)   mAP/%   FPS/(frame·s⁻¹)
YOLOv7                        60.30           57.10                 58.70   71.20
YOLOv7 + improved k-means++   63.21           60.70                 61.95   74.20
YOLOv7 + improved backbone    70.70           62.32                 66.51   63.18
YOLOv7-SE                     72.50           64.48                 68.49   61.84