An intelligent coal gangue recognition method based on improved YOLOv12

Abstract: Complex conditions in mines, such as high dust concentration and highly variable illumination, make it difficult to recognize coal gangue accurately and efficiently. To address this problem, this study improved the YOLOv12 network model and proposed an intelligent coal gangue recognition method based on improved YOLOv12. A Dual-Scale Sparse Attention (DSSA) mechanism was designed to strengthen the model's attention to multi-scale coal gangue target regions and its spatial perception capability. A Multi-Condition Feature Refinement (MCFR) mechanism was designed to perform condition-guided fusion of deep and shallow features, which sharpened the feature-level distinction between coal and coal gangue. A Dynamic Multi-Task Balance Loss (DMTBL) function was constructed to adaptively adjust the weights of the localization, classification, and confidence terms, thereby strengthening the model's ability to learn from hard sample regions. Experimental results showed that the improved YOLOv12 achieved a precision, recall, and mAP of 96.5%, 94.9%, and 95.8%, respectively, in the coal gangue recognition task, improvements of 3.8%, 4.5%, and 4.5% over the original YOLOv12. The method effectively reduced missed detections, false detections, and blurred boundaries while maintaining an inference speed of 47.7 frames per second. Activation heatmap visualizations showed that, for coal gangue of varying structure and texture complexity, the improved YOLOv12 focused accurately on the target object regions with no obvious background interference, and the activated regions largely covered the main contours of the coal blocks and coal gangue.
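The abstract does not give the formulation of DMTBL, so the following is only a minimal sketch of the general idea of adaptive weighting among localization, classification, and confidence losses. It uses a Dynamic Weight Averaging style rule (each weight grows when that task's loss is decreasing slowly), which is a swapped-in, generic technique and not the paper's method. The function names, the temperature value, and the loss numbers are all hypothetical.

```python
import math


def dynamic_weights(prev_losses, curr_losses, temperature=2.0):
    """Return one weight per task; the weights sum to the number of tasks.

    Assumption: a Dynamic Weight Averaging style rule, NOT the paper's DMTBL.
    """
    # A ratio near or above 1 means the task's loss is barely decreasing.
    ratios = [c / p for c, p in zip(curr_losses, prev_losses)]
    # Softmax over the ratios, softened by the temperature.
    exps = [math.exp(r / temperature) for r in ratios]
    total = sum(exps)
    n = len(curr_losses)
    return [n * e / total for e in exps]


def balanced_loss(losses, weights):
    """Weighted sum of the per-task losses."""
    return sum(w * l for w, l in zip(weights, losses))


# Hypothetical loss values for (localization, classification, confidence).
prev = [1.00, 0.80, 0.50]
curr = [0.90, 0.79, 0.30]

weights = dynamic_weights(prev, curr)
total_loss = balanced_loss(curr, weights)
```

In this illustration the classification loss has improved the least between the two steps, so it receives the largest weight, while the rapidly improving confidence loss receives the smallest. A higher temperature pushes the weights toward equal values.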

     
