Small object detection in complex open-pit mine backgrounds based on improved YOLOv11

Abstract: Small object detection in open-pit mines is challenging because wide fields of view and long detection distances make targets appear small in the image, and existing object detection models suffer from feature attenuation caused by layer-by-layer downsampling. To address this problem, an improved YOLOv11 model is proposed and applied to small object detection against complex open-pit mine backgrounds. The improved model replaces the strided-convolution downsampling module with a Robust Feature Downsampling (RFD) module, which better preserves the feature information of small objects. A Small Target Feature Enhancement Neck (STFEN) network replaces the original feature-pyramid neck and introduces a cross-stage partial fusion module that integrates feature maps from different levels. The original CIoU loss function is replaced with the Powerful-IoU (PIoU) loss function, which resolves the anchor box expansion problem during training and lets the model focus on small targets quickly and accurately. Experimental results on an open-pit mine small object dataset show that: ① the RFD module reduces the number of model parameters while increasing mAP by 1.5%; the STFEN network increases the number of parameters but improves mAP by 2.2%; the PIoU loss function improves mAP by 1.7% without changing the number of parameters or FLOPs; and the three combined improve mAP by 3.9%. ② The improved YOLOv11 model achieves higher accuracy while maintaining a high inference speed: its mAP is 2.6%, 1.5%, 0.9%, and 2.2% higher than that of YOLOv5m, YOLOv8m, YOLOv11m, and RT-DETR-L, respectively, and it has fewer parameters, making it easier to deploy on edge devices.
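The abstract does not give the form of the PIoU loss, so the following is only a minimal sketch of how such a loss is commonly written, assuming the basic published PIoU formulation: the IoU loss plus a penalty 1 - exp(-P^2), where P averages the four edge offsets between the predicted and target boxes, normalized by the target width and height. The function names `iou` and `piou_loss` and the `(x1, y1, x2, y2)` box layout are illustrative choices, and the authors may have used a different variant (for example, one with an additional focusing term).

```python
import math


def iou(box_a, box_b):
    """Intersection over union of two axis-aligned (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def piou_loss(pred, target):
    """Sketch of a PIoU-style loss (assumed form): (1 - IoU) + (1 - exp(-P^2)).

    P is the mean absolute offset between corresponding box edges,
    normalized by the target box width or height only.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    tw, th = tx2 - tx1, ty2 - ty1
    p = (
        abs(px1 - tx1) / tw
        + abs(px2 - tx2) / tw
        + abs(py1 - ty1) / th
        + abs(py2 - ty2) / th
    ) / 4.0
    penalty = 1.0 - math.exp(-p * p)
    return (1.0 - iou(pred, target)) + penalty


if __name__ == "__main__":
    target = (0.0, 0.0, 10.0, 10.0)
    print(piou_loss((0.0, 0.0, 10.0, 10.0), target))  # perfect match
    print(piou_loss((2.0, 0.0, 12.0, 10.0), target))  # small horizontal shift
    print(piou_loss((5.0, 0.0, 15.0, 10.0), target))  # larger horizontal shift
```

In this assumed form the penalty is normalized by the target box size alone, so enlarging the predicted box cannot shrink the penalty. This is one plausible reading of how the loss avoids the anchor box expansion mentioned above.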

     
