Large coal block detection algorithm for fully mechanized mining faces based on MES-YOLOv5s

    Abstract: Objects on a fully mechanized mining face move at high speed, vary widely in scale, and are often occluded. Existing object detection algorithms suffer from low precision, large model memory footprints, and strong hardware dependence. To address these problems, a large coal block detection algorithm for fully mechanized mining faces based on MES-YOLOv5s is proposed. The algorithm adopts a lightweight design that uses MobileNetV3 as the backbone network, reducing the memory occupied by the model and improving detection speed on the CPU. An efficient multi-scale attention (EMA) module is added to the neck network to fuse contextual information at different scales and further reduce computational overhead. The SIoU loss function replaces the CIoU loss function to improve training speed and inference accuracy. Ablation experiments show that MobileNetV3 greatly reduces the model's memory footprint and detection time, but at a severe cost in mAP. The EMA module and the SIoU loss function recover part of the lost accuracy while keeping detection on the CPU fast enough to meet the real-time detection requirements of underground coal mines. Comparative experiments show that, compared with the DETR, YOLOv5n, YOLOv5s, and YOLOv7 models, the MES-YOLOv5s model has the best overall performance: an mAP of 84.6%, a model memory footprint of 11.2 MiB, and a detection time of 31.8 ms on the CPU. It maintains high recall and precision under high-speed motion, multi-scale, occlusion, and multi-object working conditions.
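The abstract names SIoU as the replacement for CIoU but does not give its formula. As an illustration only, the following minimal sketch computes an SIoU-style loss for a single pair of axis-aligned boxes in pure Python, following the commonly cited formulation (IoU term plus angle-aware distance cost and shape cost). The function name `siou_loss`, the `(x1, y1, x2, y2)` box format, and the default shape exponent `theta=4.0` are assumptions made for this example, not settings taken from the paper.

```python
import math


def siou_loss(box, gt, theta=4.0, eps=1e-9):
    """SIoU-style loss for two boxes given as (x1, y1, x2, y2).

    Hypothetical helper for illustration; theta is an assumed default.
    """
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Plain IoU term.
    inter_w = max(0.0, min(x2, gx2) - max(x1, gx1))
    inter_h = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = inter_w * inter_h
    union = w * h + gw * gh - inter
    iou = inter / (union + eps)

    # Angle cost: sin(2 * alpha), where alpha is the angle between the
    # line joining the box centers and the horizontal axis.
    dx = (gx1 + gx2) / 2.0 - (x1 + x2) / 2.0
    dy = (gy1 + gy2) / 2.0 - (y1 + y2) / 2.0
    sigma = math.hypot(dx, dy)
    sin_alpha = min(1.0, abs(dy) / sigma) if sigma > eps else 0.0
    angle_cost = math.sin(2.0 * math.asin(sin_alpha))

    # Distance cost: center offsets normalized by the smallest enclosing
    # box, weighted by gamma = 2 - angle_cost.
    enc_w = max(x2, gx2) - min(x1, gx1)
    enc_h = max(y2, gy2) - min(y1, gy1)
    gamma = 2.0 - angle_cost
    rho_x = (dx / (enc_w + eps)) ** 2
    rho_y = (dy / (enc_h + eps)) ** 2
    distance_cost = (1.0 - math.exp(-gamma * rho_x)) + (1.0 - math.exp(-gamma * rho_y))

    # Shape cost: relative width and height mismatch.
    omega_w = abs(w - gw) / max(w, gw, eps)
    omega_h = abs(h - gh) / max(h, gh, eps)
    shape_cost = (1.0 - math.exp(-omega_w)) ** theta + (1.0 - math.exp(-omega_h)) ** theta

    return 1.0 - iou + (distance_cost + shape_cost) / 2.0
```

In this sketch a perfectly matching prediction gives a loss of approximately zero, while non-overlapping boxes give a loss above 1 because the distance cost is added on top of the `1 - IoU` term. A training implementation would apply the same arithmetic to batched tensors rather than to single boxes.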

     
