Abstract:
Images captured by underground unmanned locomotives suffer from poor lighting, heavy noise, and motion blur, which cause loss of image detail and make small targets difficult to identify. To address these problems, a multi-object detection model for underground unmanned locomotives based on DYCS-YOLOv8n was proposed. Building on YOLOv8n, the Convolutional Block Attention Module (CBAM) was introduced to strengthen the extraction of key features through channel and spatial attention. A small-object detection layer was added, increasing the number of detection layers from three to four, which improved the extraction of fine-grained features and the detection of small targets. The dynamic upsampling operator DySample was employed to adapt the sampling strategy to the input features, which better preserved edges and local details in the images and avoided the loss of critical information. Experiments on a self-constructed underground unmanned locomotive dataset showed that: ① DYCS-YOLOv8n achieved a mean average precision at an IoU threshold of 0.5 (mAP@0.5) of 97.5%, 3.4% higher than YOLOv8n, at a detection speed of 46.35 frames per second, which met the requirement for real-time detection. ② Compared with mainstream YOLO-series object detection models, DYCS-YOLOv8n achieved the highest mAP@0.5 while remaining lightweight and fast. ③ In complex underground scenes with noise and low illumination, DYCS-YOLOv8n detected pedestrians, tracks, and signal lights with high average confidence, with no missed or false detections.
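For readers unfamiliar with the headline metric, the following is a minimal sketch of how mAP@0.5 can be computed: a prediction counts as correct when its intersection over union (IoU) with an unmatched ground-truth box is at least 0.5, average precision (AP) is the area under the resulting precision-recall curve, and mAP is the mean of AP over classes. This is not the evaluation code used in the experiments. The function names, the corner box format `(x1, y1, x2, y2)`, and the all-point interpolation are assumptions made for illustration; common toolkits use slightly different interpolation schemes.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed corner format: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(
    preds: Sequence[Tuple[float, Box]],  # (confidence, box) pairs for one class
    truths: Sequence[Box],               # ground-truth boxes for the same class
    iou_thr: float = 0.5,
) -> float:
    """AP for one class: area under the precision-recall curve."""
    if not truths:
        return 0.0
    matched = [False] * len(truths)
    hits: List[int] = []  # 1 = true positive, 0 = false positive
    # Visit predictions from most to least confident.
    for _, box in sorted(preds, key=lambda p: p[0], reverse=True):
        best_iou, best_idx = 0.0, -1
        for i, gt in enumerate(truths):
            if matched[i]:
                continue  # each ground-truth box can be matched only once
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, i
        if best_idx >= 0 and best_iou >= iou_thr:
            matched[best_idx] = True
            hits.append(1)
        else:
            hits.append(0)
    # Cumulative precision and recall after each prediction.
    recalls: List[float] = []
    precisions: List[float] = []
    tp = 0
    for k, hit in enumerate(hits, start=1):
        tp += hit
        recalls.append(tp / len(truths))
        precisions.append(tp / k)
    # Make precision non-increasing in recall (all-point interpolation).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the rectangles under the interpolated curve.
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class_ap: Sequence[float]) -> float:
    """mAP: unweighted mean of per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap)
```

In this sketch, `average_precision` would be called once per class (for example pedestrians, tracks, and signal lights) on the pooled predictions for that class, and the results passed to `mean_average_precision`.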