Abstract:
Real-time and accurate detection of foreign objects on conveyor belts using deep learning technology is crucial for ensuring the safe and stable operation of belt conveyors. Common YOLO series models struggle to balance lightweight design with detection accuracy, and their high computational complexity and parameter count hinder their deployment on resource-constrained underground edge computing devices. To address this problem, this study proposed an ultra-lightweight model, YOLOv8-PCAS, by applying a lightweight design to the YOLOv8n network. The backbone network of YOLOv8n was replaced with PP-LCNet to create a lightweight backbone. A Context Anchor Attention (CAA) module with an optimized connection structure was introduced into the C2f module to enhance the representation capability for complex shapes of foreign objects. The Average Pooling Down Sampling (ADown) strategy was incorporated to effectively reduce the model size while better preserving key semantic information. Furthermore, a dual detection head structure was designed, which removed the redundant large object detection head to focus on small and medium-sized foreign objects. The YOLOv8-PCAS model was trained and tested using the CUMT-BelT dataset of foreign objects from an underground coal mine and surveillance videos from a coal mine in Shanxi. The experimental results showed that the parameter count of YOLOv8-PCAS was approximately 0.58×10⁶ (19.1% of the original YOLOv8n model), with a computational load of 3.6 GFLOPs (44.4% of YOLOv8n). Its lightweight performance surpassed that of mainstream models such as YOLOv7-tiny and YOLOv5n, as well as existing lightweight modifications of YOLOv8n. YOLOv8-PCAS effectively detected targets such as anchor bolts and lump coal on the conveyor belt, achieving an inference speed of 357 frames/s and an average detection time of 2.8 ms. The mean average precision reached 90.5% at an Intersection over Union (IoU) threshold of 0.5. The performance of YOLOv8-PCAS meets the industrial requirements for both detection quality and timeliness.
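The reported figures can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are hypothetical, the baseline values are back-computed from the stated percentages rather than taken from the paper, and the latency check assumes that the average detection time is simply the reciprocal of the frame rate (one image per forward pass).

```python
# Figures quoted in the abstract.
params = 0.58e6        # YOLOv8-PCAS parameter count
params_share = 0.191   # stated share of YOLOv8n's parameter count
gflops = 3.6           # YOLOv8-PCAS computational load (GFLOPs)
gflops_share = 0.444   # stated share of YOLOv8n's computational load
fps = 357              # reported inference speed (frames/s)

# Baseline implied by the stated percentages (derived, not quoted from the paper).
implied_baseline_params = params / params_share
implied_baseline_gflops = gflops / gflops_share

# Assumption: per-frame latency is the reciprocal of throughput.
latency_ms = 1000.0 / fps

print(f"Implied YOLOv8n parameters: {implied_baseline_params / 1e6:.2f} x 10^6")
print(f"Implied YOLOv8n load:       {implied_baseline_gflops:.1f} GFLOPs")
print(f"Latency at {fps} frames/s:   {latency_ms:.1f} ms")
```

Under these assumptions, the implied baseline is roughly 3.0×10⁶ parameters and 8.1 GFLOPs, and the 357 frames/s throughput corresponds to the 2.8 ms average detection time stated above.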