Abstract:
To address the complex conditions of uneven lighting, inconsistent target scales, and occlusion in underground coal mine monitoring videos, a multi-target detection algorithm, YOLOv8n-ASAM, is proposed. First, a MultiSEAM attention mechanism is added to the Neck layer of YOLOv8n to enhance the detection of occluded targets. Second, the MLCA attention mechanism is introduced into the C2f module to construct a C2f-MLCA module that fuses local and global feature information, improving feature representation. Third, an Adaptive Spatial Feature Fusion (ASFF) module is embedded in the detection head of the Head layer to improve the detection of small-scale targets. Finally, the YOLOv8n-ASAM model is validated on a dataset of underground coal mine production monitoring video. Experimental results show that YOLOv8n-ASAM outperforms the Faster R-CNN, SSD, RT-DETR, YOLOv5s, YOLOv7, YOLOv8n, and YOLOv8s models on the accuracy, mAP50, and mAP50-95 metrics; compared with YOLOv8n, it improves accuracy, mAP50, and mAP50-95 by 5.3%, 1.8%, and 1.1%, respectively. The model demonstrates good robustness, delivering strong detection results for small-scale and occluded targets and achieving higher detection confidence under uneven lighting and sparse multi-target distributions.
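To illustrate the ASFF idea referenced above, the following is a minimal PyTorch sketch of adaptively fusing three feature-pyramid levels with learned per-pixel weights. The channel widths, strides, and class name are illustrative assumptions and do not reflect the exact configuration used in YOLOv8n-ASAM.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ASFFSketch(nn.Module):
    """Minimal sketch of Adaptive Spatial Feature Fusion (ASFF).

    Assumption: three pyramid levels (P3, P4, P5) are fused at the finest
    resolution; channel counts are placeholders, not the paper's settings.
    """

    def __init__(self, channels=(64, 128, 256), out_channels=64):
        super().__init__()
        # Project every input level to a common channel width.
        self.proj = nn.ModuleList(
            nn.Conv2d(c, out_channels, kernel_size=1) for c in channels
        )
        # A 1x1 conv produces one spatial weight map per level.
        self.weight = nn.Conv2d(out_channels * 3, 3, kernel_size=1)

    def forward(self, feats):
        # feats: [P3, P4, P5] with decreasing spatial resolution.
        target_size = feats[0].shape[-2:]  # fuse at the finest level
        aligned = [
            F.interpolate(p(f), size=target_size, mode="nearest")
            for p, f in zip(self.proj, feats)
        ]
        # Per-pixel softmax so the three level weights sum to 1 everywhere.
        w = torch.softmax(self.weight(torch.cat(aligned, dim=1)), dim=1)
        fused = sum(w[:, i : i + 1] * aligned[i] for i in range(3))
        return fused


if __name__ == "__main__":
    # Toy usage with assumed strides 8/16/32 on a 256x256 input.
    p3 = torch.randn(1, 64, 32, 32)
    p4 = torch.randn(1, 128, 16, 16)
    p5 = torch.randn(1, 256, 8, 8)
    print(ASFFSketch()([p3, p4, p5]).shape)  # torch.Size([1, 64, 32, 32])
```

The spatially varying weights let the detector emphasize the fine-resolution level at locations containing small targets while relying on coarser levels elsewhere, which is the property the abstract attributes to the ASFF-equipped detection head.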