
Lightweight safety helmet wearing detection fusing coordinate attention and multiscale feature

LI Zhongfei, FENG Shiyong, GUO Jun, ZHANG Yunhe, XU Feixiang

Citation: LI Zhongfei, FENG Shiyong, GUO Jun, et al. Lightweight safety helmet wearing detection fusing coordinate attention and multiscale feature[J]. Journal of Mine Automation, 2023, 49(11): 151-159. doi: 10.13272/j.issn.1671-251x.2023080123


doi: 10.13272/j.issn.1671-251x.2023080123
Funding: National Key Research and Development Program of China (2021YFC2902702).
    Author biography: LI Zhongfei (1981—), male, from Tongliao, Inner Mongolia; senior engineer, M.S.; engaged in research on mine mechatronics, informatization, and intelligentization. E-mail: zhongfei_li@sohu.com

  • CLC number: TD67

Lightweight safety helmet wearing detection fusing coordinate attention and multiscale feature

  • Abstract: Existing safety helmet wearing detection algorithms for coal mine workers struggle to strike a good balance between detection accuracy and speed. To address this problem, a lightweight model fusing coordinate attention and multiscale features, M−YOLO, was proposed on the basis of YOLOv4 and applied to safety helmet wearing detection. The model replaces YOLOv4's feature extraction network CSPDarknet53 with S−MobileNetV2, a lightweight feature extraction network incorporating a shuffle coordinate attention (SCA) module, which effectively improves the connections between features while reducing the parameter count. The parallel connections in the original spatial pyramid pooling structure are changed to serial connections, effectively improving computational efficiency. The feature fusion network is improved by introducing shallow features that carry high-resolution, fine-grained texture information, strengthening feature extraction for the detection targets, and some convolutions in the original Neck structure are replaced with depthwise separable convolutions, further reducing the model's parameter count and computation while preserving detection accuracy. Experimental results show that, compared with YOLOv4, the mean average precision of M−YOLO drops by only 0.84%, while its computation, parameter count and model size shrink by 74.5%, 72.8% and 81.6% respectively, and its detection speed rises by 53.4%. Compared with other models, M−YOLO achieves a good balance between accuracy and real-time performance, meeting the requirements for embedded loading and deployment on intelligent video surveillance terminals.
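The parameter saving from swapping standard convolutions for depthwise separable ones, as the abstract describes for the Neck, follows from simple counting: a k×k standard convolution needs k²·Cin·Cout weights, while the separable version needs only k²·Cin (depthwise) plus Cin·Cout (pointwise). A quick arithmetic sketch with illustrative channel sizes (not taken from the paper):

```python
def conv_params(k, c_in, c_out):
    # Standard convolution: every output channel mixes all input channels.
    return k * k * c_in * c_out

def dw_separable_params(k, c_in, c_out):
    # Depthwise separable = per-channel k×k depthwise conv + 1×1 pointwise mix.
    return k * k * c_in + c_in * c_out

# Illustrative sizes (hypothetical, not from the paper).
k, c_in, c_out = 3, 256, 256
std = conv_params(k, c_in, c_out)          # 589824
sep = dw_separable_params(k, c_in, c_out)  # 67840
print(f"standard: {std}, separable: {sep}, ratio: {sep / std:.3f}")
```

The ratio approaches the well-known bound 1/Cout + 1/k², roughly an 8–9× reduction for 3×3 kernels with wide channels.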

     

  • Figure  1.  M-YOLO structure

    Figure  2.  Coordinate attention module structure
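Coordinate attention (Hou et al., reference [20]) factorizes global pooling into two 1-D pools along height and width, so positional information survives into the attention weights. A toy NumPy sketch of the forward pass for a single feature map; the random matrices stand in for the module's learned 1×1 convolutions, and all sizes are illustrative rather than the paper's:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def coordinate_attention(x, w_down, w_h, w_w):
    """Toy coordinate attention for one feature map x of shape (C, H, W).

    w_down, w_h, w_w stand in for learned 1x1 convolutions (random here,
    trained in a real network); r is the channel-reduction ratio.
    """
    C, H, W = x.shape
    z_h = x.mean(axis=2)                  # (C, H): pool along the width axis
    z_w = x.mean(axis=1)                  # (C, W): pool along the height axis
    z = np.concatenate([z_h, z_w], 1)     # (C, H+W): joint encoding
    f = np.maximum(w_down @ z, 0.0)       # shared transform + ReLU, (C//r, H+W)
    f_h, f_w = f[:, :H], f[:, H:]         # split back into the two directions
    a_h = sigmoid(w_h @ f_h)              # (C, H): attention along height
    a_w = sigmoid(w_w @ f_w)              # (C, W): attention along width
    return x * a_h[:, :, None] * a_w[:, None, :]

rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2                   # illustrative sizes
x = rng.standard_normal((C, H, W))
y = coordinate_attention(
    x,
    rng.standard_normal((C // r, C)) * 0.1,
    rng.standard_normal((C, C // r)) * 0.1,
    rng.standard_normal((C, C // r)) * 0.1,
)
print(y.shape)  # (8, 4, 4): same shape as the input
```

Because both attention maps lie in (0, 1), the module only reweights the input, which is why it can be inserted into a backbone without changing tensor shapes.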

    Figure  3.  Shuffle coordinate attention (SCA) module structure
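The "shuffle" part of the SCA module comes from ShuffleNet's channel shuffle (reference [21]), which interleaves channels across groups with a cheap reshape–transpose–reshape so that grouped operations can still exchange information. A minimal sketch (group count illustrative):

```python
import numpy as np

def channel_shuffle(x, groups):
    # x: (C, H, W). Reshape to (groups, C//groups, H, W), swap the first two
    # axes, and flatten back: channels are interleaved across the groups.
    C, H, W = x.shape
    assert C % groups == 0, "channel count must be divisible by groups"
    return x.reshape(groups, C // groups, H, W).swapaxes(0, 1).reshape(C, H, W)

# Label each channel with its index to make the permutation visible.
x = np.arange(6)[:, None, None] * np.ones((6, 1, 1))
print(channel_shuffle(x, 2)[:, 0, 0])  # [0. 3. 1. 4. 2. 5.]
```

Channels 0–2 (group 1) and 3–5 (group 2) end up interleaved, so the next grouped layer sees features from both groups.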

    Figure  4.  Spatial pyramid pooling (SPP) structure

    Figure  5.  Spatial pyramid pooling-fast (SPPF) structure
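YOLOv4's SPP applies 5×5, 9×9 and 13×13 max pools in parallel, whereas SPPF chains three 5×5 pools and reuses the intermediate results. The serial rewrite is safe because stride-1 max pools compose: two chained 5-wide pools cover the same neighbourhood as one 9-wide pool, and three cover a 13-wide one. A one-dimensional sketch of that equivalence (toy code, not the paper's implementation):

```python
def maxpool1d(x, k):
    # Stride-1 "same" max pooling with -inf padding; k must be odd.
    pad = k // 2
    xp = [float("-inf")] * pad + list(x) + [float("-inf")] * pad
    return [max(xp[i:i + k]) for i in range(len(x))]

x = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
# Two chained 5-wide pools see the same neighbourhood as one 9-wide pool,
# and three chained pools see the same as one 13-wide pool.
assert maxpool1d(maxpool1d(x, 5), 5) == maxpool1d(x, 9)
assert maxpool1d(maxpool1d(maxpool1d(x, 5), 5), 5) == maxpool1d(x, 13)
print("serial 5-wide pools reproduce the parallel 5/9/13 outputs")
```

SPPF therefore concatenates the input with the three intermediate pool outputs and gets the same feature pyramid as SPP while computing only 5-wide windows, which is the efficiency gain the abstract refers to.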

    Figure  6.  Feature map visualization

    Figure  7.  Backbone network structure

    Figure  8.  Different distribution positions of the shuffle coordinate attention module

    Figure  9.  Detection results in actual scenarios

    Table  1.   S-MobileNetV2 structure

    Input        Operation           Expansion factor  Channels  Stride
    416×416×3    Conv2d 3×3          —                 32        2
    208×208×32   Bottleneck          1                 16        1
    208×208×16   SCA−Bottleneck ×2   6                 24        2
    104×104×24   SCA−Bottleneck ×3   6                 32        2
    52×52×32     Bottleneck ×4       6                 64        2
    26×26×64     SCA−Bottleneck ×3   6                 96        1
    26×26×96     SCA−Bottleneck ×3   6                 160       2
    13×13×160    Conv2d 1×1          6                 320       1

    Table  2.   Experimental results of different backbone networks

    Model      mAP/%(VOC)  mAP/%(SHWD)  FLOPs/10⁹  Params/10⁶  Speed/(frame·s⁻¹)
    M−YOLO     84.71       94.14        60.0       63.9        17.2
    M1−YOLO    79.54       86.92        28.5       39.5        24.3
    M2−YOLO    80.36       88.11        26.1       37.3        26.1
    M3−YOLO    79.06       87.57        25.5       38.3        25.6
    G−YOLO     78.45       85.81        24.9       38.0        29.9

    Table  3.   Results of shuffle coordinate attention module experiments at different positions

    Residual module     mAP/%(VOC)  mAP/%(SHWD)  Speed/(frame·s⁻¹)
    Bottleneck          80.36       85.91        26.1
    SCA−Bottleneck−1    80.19       87.31        24.3
    SCA−Bottleneck−2    80.98       87.98        23.2
    SCA−Bottleneck−3    81.53       88.75        23.3
    SCA−Bottleneck−4    80.56       86.95        24.0

    Table  4.   Ablation experiment results

    Model      S−MobileNetV2  SPPF  Reconstructed feature fusion network  mAP/%  Speed/(frame·s⁻¹)
    M2−YOLO    —              —     —                                     85.91  25.4
    M−YOLO     ✓              —     —                                     88.75  23.3
               ✓              ✓     —                                     89.47  26.9
               ✓              ✓     ✓                                     91.10  33.6

    Table  5.   Comparative experimental results of different models

    Model                mAP/%(VOC)  mAP/%(SHWD)  FLOPs/10⁹  Params/10⁶  Speed/(frame·s⁻¹)  Model size/MiB
    SSD[24]              74.06       76.14        60.9       23.8        11.6               99.46
    Efficientdet−d4[25]  76.51       82.14        105.0      20.6        11.2               78.25
    Faster R−CNN[26]     76.86       85.01        369.7      136.7       7.2                523.69
    YOLOv4[12]           84.71       91.94        60.0       63.9        21.9               242.58
    YOLOv5−M             83.47       89.55        50.6       21.2        19.1               77.58
    CenterNet[27]        77.69       89.97        70.2       32.7        23.3               122.28
    YOLOX−M[28]          81.64       88.68        73.7       25.3        15.4               96.44
    DETR[29]             78.05       83.18        114.2      36.7        10.7               156.79
    YOLOX−S[28]          78.51       88.02        26.8       8.9         32.9               33.39
    YOLOv4−tiny[30]      72.24       78.49        6.8        5.9         48.1               22.42
    YOLOv5−S[31]         81.01       87.37        16.5       7.1         30.5               28.9
    Efficientdet−d0[25]  69.22       79.03        4.7        3.8         36.5               15.87
    M−YOLO               83.95       91.10        15.3       17.4        33.6               44.75
  • [1] FANG Weili, DING Lieyun. Artificial intelligence-based recognition and modification of workers' unsafe behavior[J]. Journal of Huazhong University of Science and Technology (Natural Science Edition), 2022, 50(8): 131-135.
    [2] CHENG Deqiang, QIAN Jiansheng, GUO Xingge, et al. Review on key technologies of AI recognition for videos in coal mine[J]. Coal Science and Technology, 2023, 51(2): 349-365.
    [3] CHENG Deqiang, XU Jinyang, KOU Qiqi, et al. Lightweight network based on residual information for foreign body classification on coal conveyor belt[J]. Journal of China Coal Society, 2022, 47(3): 1361-1369.
    [4] LI Qirui. A research and implementation of safety-helmet video detection system based on human body recognition[D]. Chengdu: University of Electronic Science and Technology of China, 2017.
    [5] SUN Xiaoming, XU Kaige, WANG Sen, et al. Detection and tracking of safety helmet in factory environment[J]. Measurement Science and Technology, 2021, 32(10). DOI: 10.1088/1361-6501/ac06ff.
    [6] LI Tan, LYU Xinyue, LIAN Xiaofeng, et al. YOLOv4_Drone: UAV image target detection based on an improved YOLOv4 algorithm[J]. Computers & Electrical Engineering, 2021, 93(8). DOI: 10.1016/j.compeleceng.2021.107261.
    [7] XU Shoukun, WANG Yaru, GU Yuwan, et al. Safety helmet wearing detection study based on improved Faster RCNN[J]. Application Research of Computers, 2020, 37(3): 901-905.
    [8] WANG Xuanyu, NIU Dan, LUO Puxuan, et al. A safety helmet and protective clothing detection method based on improved-YoloV3[C]. Chinese Automation Congress, Shanghai, 2020: 5437-5441.
    [9] LUO Xinyu. Construction site safety protection detection system based on deep learning[D]. Hangzhou: Hangzhou Dianzi University, 2020.
    [10] LIANG Sicheng. Research on safety helmet wearing detection based on convolutional neural network[D]. Harbin: Harbin Institute of Technology, 2021.
    [11] ZHANG Peiji. Research on detection methods of safety clothing and safety helmet in industrial surveillance video[D]. Wuhan: Huazhong University of Science and Technology, 2021.
    [12] BOCHKOVSKIY A, WANG C Y, LIAO H Y M. YOLOv4: optimal speed and accuracy of object detection[C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, 2020: 9-12.
    [13] SANDLER M, HOWARD A, ZHU Menglong, et al. MobileNetV2: inverted residuals and linear bottlenecks[C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, 2018: 4510-4520.
    [14] HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904-1916. doi: 10.1109/TPAMI.2015.2389824.
    [15] LIU Shu, QI Lu, QIN Haifang, et al. Path aggregation network for instance segmentation[C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, 2018: 8759-8768.
    [16] HOWARD A, SANDLER M, CHEN Bo, et al. Searching for MobileNetV3[C]. IEEE/CVF International Conference on Computer Vision, Seoul, 2019: 1314-1324.
    [17] HAN Kai, WANG Yunhe, TIAN Qi, et al. GhostNet: more features from cheap operations[C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, 2020: 1577-1586.
    [18] HU Jie, SHEN Li, SUN Gang. Squeeze-and-excitation networks[C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, 2018: 7132-7141.
    [19] WOO S H, PARK J Y, LEE J Y, et al. CBAM: convolutional block attention module[C]. European Conference on Computer Vision, Munich, 2018: 3-19.
    [20] HOU Qibin, ZHOU Daquan, FENG Jiashi. Coordinate attention for efficient mobile network design[C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, 2021: 13708-13717.
    [21] ZHANG Xiangyu, ZHOU Xinyu, LIN Mengxiao, et al. ShuffleNet: an extremely efficient convolutional neural network for mobile devices[C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, 2018: 6848-6856.
    [22] KOU Qiqi, HUANG Ji, CHENG Deqiang, et al. Person re-identification with intra-domain similarity grouping based on semantic fusion[J]. Journal on Communications, 2022, 43(7): 153-162.
    [23] CHENG Deqiang, CHEN Liangliang, LYU Chen, et al. Light-guided and cross-fusion U-Net for anti-illumination image super-resolution[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2022, 32(12): 8436-8449. doi: 10.1109/TCSVT.2022.3194169.
    [24] LIU Wei, ANGUELOV D, ERHAN D, et al. SSD: single shot multibox detector[C]. European Conference on Computer Vision, Amsterdam, 2016: 21-37.
    [25] TAN Mingxing, PANG Ruoming, LE Q V. EfficientDet: scalable and efficient object detection[C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, 2020: 10781-10790.
    [26] REN Shaoqing, HE Kaiming, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149. doi: 10.1109/TPAMI.2016.2577031.
    [27] DUAN Kaiwen, BAI Song, XIE Lingxi, et al. CenterNet: keypoint triplets for object detection[C]. IEEE/CVF International Conference on Computer Vision, Seoul, 2019: 6568-6577.
    [28] GE Zheng, LIU Songtao, WANG Feng, et al. YOLOX: exceeding YOLO series in 2021[EB/OL]. [2023-08-03]. https://arxiv.org/abs/2107.08430.
    [29] CARION N, MASSA F, SYNNAEVE G, et al. End-to-end object detection with transformers[C]. European Conference on Computer Vision, Glasgow, 2020: 213-229.
    [30] WANG C Y, BOCHKOVSKIY A, LIAO H Y M. Scaled-YOLOv4: scaling cross stage partial network[C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, 2021: 13024-13033.
    [31] Ultralytics. YOLOv5[EB/OL]. [2023-08-12]. https://github.com/ultralytics/yolov5.
Publication history
  • Received:  2023-08-31
  • Revised:  2023-11-21
  • Published online:  2023-11-27
