Coal mine image instance segmentation method based on improved SOLOv2
-
Abstract: Existing image segmentation methods perform well on clear underground coal mine images. In complex underground environments, however, the acquired images are mostly blurry and the contours of target objects are indistinct, which degrades segmentation precision. To address this problem, a coal mine image instance segmentation method based on an improved SOLOv2 is proposed. First, the ResNet-50 network of the SOLOv2 model is replaced with a ResNeXt-18 network, which reduces the number of network layers and increases inference speed. Second, a coordinate attention (CA) module is introduced to strengthen feature extraction and retain precise positional information, improving segmentation precision. Third, the ReLU activation function is replaced with the ACON-C activation function so that features across neurons are combined more fully, which enhances the model's feature expression capability and further improves segmentation precision. The improved SOLOv2 model was deployed on an embedded platform for coal mine image segmentation experiments. Compared with the SOLOv2 model, the improved model raises Mask AP (mask average precision) by 1.1 percentage points, shrinks the weight file by 83.2 MiB, and increases inference speed by 5.30 frames/s to 26.10 frames/s, improving both segmentation precision and inference speed on coal mine images.
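As an illustration of the activation swap described in the abstract, the following is a minimal scalar sketch of ACON-C, assuming the form given by Ma et al., f(x) = (p1 − p2)·x·σ(β(p1 − p2)·x) + p2·x. In that formulation p1, p2 and β are learnable per-channel parameters; the fixed scalar defaults and the function names here are assumptions chosen only to show how the function relates to ReLU, and this is not the authors' implementation.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def acon_c(x: float, p1: float = 1.0, p2: float = 0.0, beta: float = 1.0) -> float:
    """Scalar ACON-C: (p1 - p2) * x * sigmoid(beta * (p1 - p2) * x) + p2 * x.

    p1, p2 and beta are fixed here for illustration (assumption);
    in a network they would be learned per channel.
    """
    d = (p1 - p2) * x
    return d * sigmoid(beta * d) + p2 * x


def relu(x: float) -> float:
    """Standard ReLU, shown for comparison."""
    return max(x, 0.0)


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        # A large beta pushes ACON-C towards max(p1*x, p2*x), i.e. ReLU for p1=1, p2=0.
        print(x, relu(x), acon_c(x, beta=1.0), acon_c(x, beta=50.0))
```

With p1 = 1 and p2 = 0, β = 1 gives the Swish function, a large β approaches ReLU, and β = 0 reduces to the linear function x/2, so the switching factor β controls how strongly the unit is activated.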
-
Table 1. Parameters of ResNeXt-18 network structure

| Type | Number of filters | Output size | ResNeXt-18 (16×2d) |
| --- | --- | --- | --- |
| Standard convolution layer | 64 | 240×240 | 7×7, 64, stride = 2 |
| Max pooling layer | 32 | 120×120 | 3×3, 32, stride = 2 |
| Combined module 1 | 64 | 120×120 | $\left[\begin{array}{c}3\times3,\,32,\,A=16 \\ 3\times3,\,64\end{array}\right]\times2$ |
| Combined module 2 | 128 | 60×60 | $\left[\begin{array}{c}3\times3,\,64,\,A=16 \\ 3\times3,\,128\end{array}\right]\times2$ |
| Combined module 3 | 256 | 30×30 | $\left[\begin{array}{c}3\times3,\,128,\,A=16 \\ 3\times3,\,256\end{array}\right]\times2$ |
| Combined module 4 | 512 | 15×15 | $\left[\begin{array}{c}3\times3,\,256,\,A=16 \\ 3\times3,\,512\end{array}\right]\times2$ |

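The output sizes listed for the ResNeXt-18 backbone can be reproduced with the usual convolution size formula. The sketch below rests on several assumptions that the table does not state: a 480×480 input (implied by the 240×240 output of the first stride-2 layer), padding of kernel // 2 in every layer, a stride-2 step at the start of combined modules 2 to 4, and A read as the number of convolution groups. All function and variable names are hypothetical.

```python
def conv_output_size(size: int, kernel: int, stride: int, padding: int) -> int:
    """Spatial output size of a convolution or pooling layer (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1


def grouped_conv_weights(kernel: int, in_channels: int, out_channels: int, groups: int) -> int:
    """Weight count of a grouped convolution, bias ignored."""
    assert in_channels % groups == 0 and out_channels % groups == 0
    return kernel * kernel * (in_channels // groups) * out_channels


INPUT_SIZE = 480  # assumption: not stated in the table

# (stage name, kernel size, stride of the first layer in the stage);
# the strides of the combined modules are assumptions inferred from the output sizes.
STAGES = [
    ("standard convolution", 7, 2),
    ("max pooling", 3, 2),
    ("combined module 1", 3, 1),
    ("combined module 2", 3, 2),
    ("combined module 3", 3, 2),
    ("combined module 4", 3, 2),
]


def trace_sizes(input_size: int = INPUT_SIZE) -> list:
    """Return (stage name, output size) pairs for the assumed backbone layout."""
    sizes = []
    size = input_size
    for name, kernel, stride in STAGES:
        size = conv_output_size(size, kernel, stride, kernel // 2)
        sizes.append((name, size))
    return sizes


if __name__ == "__main__":
    for name, size in trace_sizes():
        print(f"{name}: {size}x{size}")
    # Grouped versus dense 3x3 convolution on 32 channels (A = 16 read as groups).
    print(grouped_conv_weights(3, 32, 32, 16), grouped_conv_weights(3, 32, 32, 1))
```

Under these assumptions the traced sizes match the output-size column, and the grouped 3×3 layer needs 1/16 of the weights of a dense layer with the same channel count, which is consistent with the smaller weight file reported for the improved model.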
Table 2. Ablation experiment results of improved SOLOv2 model on coal mine image dataset
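
| Model | Backbone feature extraction network | Feature pyramid network | Activation function | Weight file size/MiB | Mask AP | Frame rate/(frames·s⁻¹) |
| --- | --- | --- | --- | --- | --- | --- |
| SOLOv2 | ResNet-50 | FPN | ReLU | 384.7 | 0.983 | 20.80 |
| Improved model 1 | ResNeXt-18 | FPN | ReLU | 293.0 | 0.984 | 26.39 |
| Improved model 2 | ResNeXt-18_CA | FPN | ReLU | 301.4 | 0.986 | 26.10 |
| Improved model 3 | ResNeXt-18 | FPN | ACON-C | 293.0 | 0.988 | 26.37 |
| Improved SOLOv2 | ResNeXt-18_CA | FPN | ACON-C | 301.5 | 0.994 | 26.10 |
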
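The improvements quoted in the abstract can be checked against the SOLOv2 and improved SOLOv2 rows of the ablation results. The short sketch below copies those two rows from Table 2 and recomputes the differences; the dictionary layout and function name are hypothetical, and Mask AP is treated as a fraction so that the difference is expressed in percentage points.

```python
# Values copied from the SOLOv2 and improved SOLOv2 rows of Table 2.
TABLE2 = {
    "SOLOv2": {"weights_mib": 384.7, "mask_ap": 0.983, "fps": 20.80},
    "Improved SOLOv2": {"weights_mib": 301.5, "mask_ap": 0.994, "fps": 26.10},
}


def deltas(baseline: dict, improved: dict) -> dict:
    """Differences between two table rows, rounded to the table's precision."""
    return {
        # Mask AP is a fraction, so multiply by 100 to get percentage points.
        "mask_ap_points": round((improved["mask_ap"] - baseline["mask_ap"]) * 100, 1),
        "weights_saved_mib": round(baseline["weights_mib"] - improved["weights_mib"], 1),
        "fps_gain": round(improved["fps"] - baseline["fps"], 2),
    }


if __name__ == "__main__":
    print(deltas(TABLE2["SOLOv2"], TABLE2["Improved SOLOv2"]))
```

The recomputed differences agree with the abstract: 1.1 points of Mask AP, 83.2 MiB of weight file size, and 5.30 frames/s.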
Table 3. Comparison of experimental results of different network models

| Model | Weight file size/MiB | Mask AP | Frame rate/(frames·s⁻¹) |
| --- | --- | --- | --- |
| Mask R-CNN | 351.3 | 0.967 | 13.90 |
| SOLOv2 | 338.1 | 0.986 | 18.10 |
| Improved SOLOv2 | 301.5 | 0.994 | 26.10 |

-