LI Yuancheng, HOU Yalei, REN Yiming, et al. Lightweight multi-scale object detection model for autonomous mining trucks in open-pit mines[J]. Journal of Mine Automation, 2026, 52(3): 95-101, 132. DOI: 10.13272/j.issn.1671-251x.2025090133

Lightweight multi-scale object detection model for autonomous mining trucks in open-pit mines

  • In autonomous mining truck scenarios in open-pit mines, low illumination and heavy dust reduce detection accuracy when multi-scale targets coexist, and the large parameter scale of detection models makes it difficult to balance detection accuracy against lightweight deployment. To address these issues, a lightweight multi-scale object detection model for autonomous mining trucks in open-pit mines, referred to as the improved YOLOv11n model, was proposed. In the shallow C3k2 modules of the backbone network, a Mixed Token (MToken) mechanism was introduced, using parallel multi-dilation-rate convolution branches to strengthen feature extraction for multi-scale targets. In the deep C3k2 modules of the backbone network, Multiple Look-Up Tables (MuLUT) were introduced, using deep semantic feature modeling to strengthen discrimination of multi-scale targets. An Intensity Lighten Self-Attention (ILSA) module replaced the C3k2 modules, improving feature representation quality under complex low-illumination conditions. A Pyramid Sparse Transformer (PST) module with an adaptive Top-k selection strategy replaced the C3k2 modules in the original neck feature pyramid structure, using cross-scale feature enhancement to improve the capture of multi-scale targets. Experimental results showed that: ① Compared with the YOLOv11n model, the improved YOLOv11n model increased mAP@0.5 by 3.2% while reducing the number of parameters, computational cost, and model size by 26.7%, 30.2%, and 21.8%, respectively. ② Compared with SSD, Faster R-CNN, YOLOv11n, YOLOv12n, and YOLOv13n, the improved YOLOv11n model achieved the highest mAP@0.5 with the fewest parameters, lowest computational cost, and smallest model size, making it suitable for edge deployment. ③ When deployed on edge devices, the improved YOLOv11n model accurately detected vehicles and pedestrians at an inference speed of 27.6 frames/s with a model size of 2.673 MiB, demonstrating strong real-time performance and deployment efficiency.
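The parallel multi-dilation-rate convolution branches mentioned for the shallow C3k2 modules can be illustrated with a minimal pure-Python 1D sketch. The dilation rates (1, 2, 3), the shared kernel, and the element-wise-sum fusion below are illustrative assumptions, not the paper's exact design; the real model would use multi-channel 2D convolutions.

```python
# Sketch: parallel convolution branches with different dilation rates,
# fused by element-wise summation over the overlapping output length.

def dilated_conv1d(x, kernel, dilation):
    """Valid-mode 1D convolution with the given dilation rate."""
    k = len(kernel)
    span = (k - 1) * dilation + 1  # receptive field of one output element
    out = []
    for i in range(len(x) - span + 1):
        out.append(sum(kernel[j] * x[i + j * dilation] for j in range(k)))
    return out

def multi_dilation_branch(x, kernel, rates=(1, 2, 3)):
    """Run parallel branches at several dilation rates and fuse them by
    element-wise sum over the shortest (fully overlapping) output."""
    branches = [dilated_conv1d(x, kernel, r) for r in rates]
    n = min(len(b) for b in branches)
    return [sum(b[i] for b in branches) for i in range(n)]

signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
fused = multi_dilation_branch(signal, kernel=[0.5, 0.5])
# Larger dilation rates widen the receptive field without extra weights,
# which is what lets the branches respond to targets of different scales.
```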
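The adaptive Top-k selection strategy in the PST module can be sketched as sparse attention that keeps only the strongest key positions per query, with k adapting to how concentrated the score distribution is. The adaptation rule below (smallest set covering a fixed fraction of the softmax mass) is an illustrative assumption, not the module's actual formula.

```python
# Sketch: adaptive Top-k key selection for sparse attention. Peaked score
# distributions keep few keys; flat ones keep most of them.
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def adaptive_topk(scores, mass=0.9):
    """Return indices of the smallest set of keys whose softmax weights
    cover at least `mass` of the total attention mass."""
    probs = softmax(scores)
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, acc = [], 0.0
    for i in order:
        kept.append(i)
        acc += probs[i]
        if acc >= mass:
            break
    return sorted(kept)

peaked = adaptive_topk([8.0, 0.1, 0.0, 0.2, 0.1])   # one dominant key
flat = adaptive_topk([1.0, 1.1, 0.9, 1.0, 1.05])    # near-uniform scores
```

Dropping low-weight keys this way shrinks the attention computation roughly in proportion to the keys discarded, which is consistent with the paper's goal of cutting computational cost while keeping multi-scale capture ability.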