An improved tiny YOLO v3 rapid recognition model for coal-gangue

郑道能

doi:10.13272/j.issn.1671-251x.18079

Abstract: Traditional coal gangue sorting methods are inefficient, pose significant safety hazards, and have a limited range of application. Existing machine-vision-based methods for coal gangue image recognition struggle to balance recognition speed against accuracy, and they do not jointly consider how varying input image sizes, low weights on important channels, and large numbers of convolution parameters affect model precision. To address these problems, an improved tiny YOLO v3 model for rapid coal gangue recognition is proposed. First, a spatial pyramid pooling (SPP) network that combines pooling with multiple kernel sizes is introduced into the tiny YOLO v3 model, so that input feature maps are processed to a fixed size before being output. Second, a squeeze-and-excitation (SE) module with adjustable RGB channel weights is introduced to strengthen the connections among channels in the feature maps of the first few layers; it emphasizes the feature values of the channels of interest and the differences between the features of different targets, which helps the network capture key information and remain sensitive. Finally, dilated convolution, whose kernels contain zero-weight points, replaces some of the convolution layers in the tiny YOLO v3 model; without adding model parameters, it captures multi-scale context information, thereby enlarging the receptive field, and improves the model's computation speed. The model is compared with the tiny YOLO v3 model, the Faster RCNN model, and the YOLO v5 series models. The results show the following. ① Compared with tiny YOLO v3, the improved model is significantly better in both recognition accuracy and speed. ② Compared with Faster RCNN, the improved model reduces training time by 65.72% and increases recognition precision by 11.83%, recognition recall by 0.5%, and mean average precision (mAP) by 3.02%. ③ Compared with the YOLO series models, the improved model is substantially faster while retaining its advantage in recognition precision. Ablation experiments show that the improved model reaches a recognition accuracy of 99.4%, which is 4.9% higher than that of the tiny YOLO v3 model with the SPP network added, and that it takes 12.5 ms to test each image, 1 ms less than the tiny YOLO v3 model with the SPP network added.
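
The three modifications named in the abstract can be illustrated with a small, dependency-free sketch. It is not the paper's implementation: the function names, the pooling kernel sizes (5, 9, 13), the stride-1 border-clipped pooling, and the single hidden layer in the SE gate are all assumptions chosen for illustration, and feature maps are plain nested lists rather than tensors.

```python
import math

def dilated_conv_stats(kernel_size, dilation, in_ch, out_ch):
    """Return (weight_count, effective_kernel_size) for a square 2-D convolution."""
    # The number of learnable weights does not depend on the dilation rate.
    weights = kernel_size * kernel_size * in_ch * out_ch
    # Dilation inserts (dilation - 1) zero-weight gaps between neighbouring taps.
    effective = kernel_size + (kernel_size - 1) * (dilation - 1)
    return weights, effective

def max_pool_same(fmap, k):
    """Stride-1 max pooling with windows clipped at the border (k must be odd),
    so the output keeps the input height and width."""
    h, w, r = len(fmap), len(fmap[0]), k // 2
    return [[max(fmap[y][x]
                 for y in range(max(0, i - r), min(h, i + r + 1))
                 for x in range(max(0, j - r), min(w, j + r + 1)))
             for j in range(w)] for i in range(h)]

def spp_block(channels, kernel_sizes=(5, 9, 13)):
    """Concatenate the input channels with one pooled copy per kernel size.
    The kernel sizes are an assumed, commonly used setting."""
    out = list(channels)
    for k in kernel_sizes:
        out.extend(max_pool_same(c, k) for c in channels)
    return out

def se_scale(channels, w1, w2):
    """Squeeze (global average), excite (FC, ReLU, FC, sigmoid), then rescale.
    w1 and w2 are hypothetical weight matrices given as lists of rows."""
    squeezed = [sum(map(sum, c)) / (len(c) * len(c[0])) for c in channels]
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    scaled = [[[v * g for v in row] for row in c] for c, g in zip(channels, gates)]
    return scaled, gates
```

Under these assumptions, a 3×3 kernel with dilation 2 covers a 5×5 window while keeping the same weight count as the undilated kernel, the SPP block returns four times as many channels at unchanged spatial size, and the SE gate multiplies each channel by a learned factor between 0 and 1.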
