A dual-stage adaptive segmentation framework for coal particle images

Abstract: In practical production scenarios, the irregular geometry and complex spatial distribution of coal particles not only degrade segmentation accuracy but also make manual annotation of segmentation masks extremely laborious, which limits applicability to large-scale industrial settings. To address this problem, DASeg, a dual-stage adaptive segmentation framework for coal particle images, is proposed. The framework consists of the DS-YOLO object detection model, an Adaptive Box Refinement (ABR) module, and the SAM2 image segmentation model. DS-YOLO introduces the dynamic upsampling module DySample and the Spatial and Channel Synergistic Attention (SCSA) module into the neck network of YOLOv11, which effectively improves object detection accuracy. Because the detection boxes produced by DS-YOLO do not fit the actual coal particle boundaries closely, the ABR module is designed: it performs a weighted fusion of the original detection boxes and the bounding boxes of the corresponding masks according to weighting coefficients, generating more accurate prompt boxes. The refined coordinates are then used as box prompts for the SAM2 model, which extracts global and local features and fuses the prompt-region information to generate target masks, thereby achieving coal particle segmentation. Experimental results show that DASeg performs well on coal particle image segmentation, achieving a pixel accuracy (PA) of 93.1%, a mean intersection over union (mIoU) of 88.4%, and a mean Dice coefficient (mDice) of 93.4%.
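
Since the abstract only sketches how the ABR module combines the two boxes, the short Python below illustrates the general idea of coordinate-wise weighted box fusion. It is a minimal sketch rather than the paper's implementation: the function name `abr_refine_box`, the single scalar weight `alpha`, and the assumption that the mask comes from a preliminary segmentation pass are illustrative choices, since the exact coefficient scheme is not given in the abstract.

```python
import numpy as np

def abr_refine_box(det_box, mask, alpha=0.6):
    """Illustrative sketch of ABR-style box refinement (hypothetical helper).

    det_box : (x1, y1, x2, y2) box from the detector (e.g. DS-YOLO).
    mask    : binary HxW array for the same particle (assumed to come from a
              preliminary segmentation pass; not specified in the abstract).
    alpha   : weighting coefficient; alpha=1 keeps the detector box,
              alpha=0 keeps the mask's bounding box.
    """
    det_box = np.asarray(det_box, dtype=float)
    ys, xs = np.nonzero(mask)
    if xs.size == 0:                 # empty mask: fall back to the detector box
        return det_box
    mask_box = np.array([xs.min(), ys.min(), xs.max(), ys.max()], dtype=float)
    # Coordinate-wise weighted fusion of the detector box and the mask box.
    return alpha * det_box + (1.0 - alpha) * mask_box

# Toy usage: the refined box would then be passed to SAM2 as a box prompt
# (e.g. via SAM2ImagePredictor.predict(box=...)).
mask = np.zeros((128, 128), dtype=np.uint8)
mask[15:88, 14:92] = 1
print(abr_refine_box((10, 12, 96, 90), mask, alpha=0.6))
```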

     
