Lightweight safety helmet wearing detection fusing coordinate attention and multiscale features
Abstract: Existing algorithms for detecting whether coal miners are wearing safety helmets struggle to balance detection accuracy against detection speed. To address this problem, a lightweight model, M-YOLO, which fuses coordinate attention with multiscale features, is proposed on the basis of YOLOv4 and applied to safety helmet wearing detection. The model replaces YOLOv4's feature extraction network, CSPDarknet53, with S-MobileNetV2, a lightweight backbone that incorporates a shuffle coordinate attention (SCA) module; this strengthens the relationships among features while reducing the parameter count. The parallel connections in the original spatial pyramid pooling structure are changed to serial connections, which improves computational efficiency. The feature fusion network is improved by introducing shallow features that carry high-resolution, detail-rich texture information, strengthening the extraction of target features, and some convolutions in the original Neck are replaced with depthwise separable convolutions, further reducing the model's parameters and computation while maintaining detection accuracy. Experimental results show that, compared with YOLOv4, M-YOLO's mean average precision drops by only 0.84%, while its computation, parameter count, and model size shrink by 74.5%, 72.8%, and 81.6%, respectively, and its detection speed increases by 53.4%. Compared with other models, M-YOLO strikes a good balance between accuracy and real-time performance and meets the requirements for embedded loading and deployment on intelligent video surveillance terminals.
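The serial spatial pyramid pooling change can be illustrated with a toy example: with stride 1 and "same" padding, chaining one small max pool reproduces the 5×5, 9×9, and 13×13 parallel pools of the original SPP, so the three parallel branches can be replaced by one reused serial branch. Below is a minimal 1-D stand-in in Python, a sketch of the pooling identity rather than the paper's code:

```python
def maxpool1d(x, k):
    """Stride-1 max filter with 'same' padding, a 1-D stand-in for MaxPool2d."""
    p = k // 2
    xp = [float("-inf")] * p + list(x) + [float("-inf")] * p
    return [max(xp[i:i + k]) for i in range(len(x))]

x = [3, 1, 4, 1, 5, 9, 2, 6]

# Parallel SPP branches: one independent pool per kernel size.
spp = [maxpool1d(x, k) for k in (5, 9, 13)]

# Serial SPPF-style branch: reuse one 5-wide pool three times.
p1 = maxpool1d(x, 5)
p2 = maxpool1d(p1, 5)   # equivalent receptive field: 9
p3 = maxpool1d(p2, 5)   # equivalent receptive field: 13

assert spp == [p1, p2, p3]  # identical outputs, fewer window comparisons
```

The serial form scans each element with three small windows instead of one small and two large ones, which is where the computational saving of the serial connection comes from.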
Table 1. S-MobileNetV2 structure
Input | Operation | Expansion factor | Output channels | Stride
416×416×3 | Conv2d 3×3 | — | 32 | 2
208×208×32 | Bottleneck | 1 | 16 | 1
208×208×16 | SCA-Bottleneck ×2 | 6 | 24 | 2
104×104×24 | SCA-Bottleneck ×3 | 6 | 32 | 2
52×52×32 | Bottleneck ×4 | 6 | 64 | 2
26×26×64 | SCA-Bottleneck ×3 | 6 | 96 | 1
26×26×96 | SCA-Bottleneck ×3 | 6 | 160 | 2
13×13×160 | Conv2d 1×1 | 6 | 320 | 1
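The SCA-Bottleneck rows combine channel shuffle (from ShuffleNet[21]) with coordinate attention[20]. The gist of both operations can be sketched in plain Python; the tiny shapes and the omission of the learned shared 1×1 convolution are simplifications for illustration, not the paper's implementation:

```python
import math

def channel_shuffle(channels, groups):
    """Interleave channel groups: reshape (groups, n) -> transpose -> flatten."""
    n = len(channels) // groups
    return [channels[g * n + i] for i in range(n) for g in range(groups)]

def coordinate_attention(x):
    """x: C x H x W nested lists. Pool along W and along H separately,
    then gate each position by both direction-aware attention factors.

    The learned shared 1x1 transform of coordinate attention is skipped
    (treated as identity) to keep the sketch dependency-free.
    """
    sig = lambda v: 1.0 / (1.0 + math.exp(-v))
    C, H, W = len(x), len(x[0]), len(x[0][0])
    a_h = [[sig(sum(row) / W) for row in ch] for ch in x]                  # (C, H)
    a_w = [[sig(sum(ch[i][j] for i in range(H)) / H) for j in range(W)]
           for ch in x]                                                    # (C, W)
    return [[[x[c][i][j] * a_h[c][i] * a_w[c][j] for j in range(W)]
             for i in range(H)] for c in range(C)]

# Shuffling 6 channels in 2 groups interleaves the two halves.
print(channel_shuffle(list(range(6)), 2))  # [0, 3, 1, 4, 2, 5]
```

Unlike SE-style channel attention, the two pooled vectors keep positional information along one axis each, which is what lets coordinate attention localize targets such as helmets.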
Table 2. Experimental results of different backbone networks
Model | mAP(VOC)/% | mAP(SHWD)/% | FLOPs/10⁹ | Params/10⁶ | Speed/(frame·s⁻¹)
M-YOLO | 84.71 | 94.14 | 60.0 | 63.9 | 17.2
M1-YOLO | 79.54 | 86.92 | 28.5 | 39.5 | 24.3
M2-YOLO | 80.36 | 88.11 | 26.1 | 37.3 | 26.1
M3-YOLO | 79.06 | 87.57 | 25.5 | 38.3 | 25.6
G-YOLO | 78.45 | 85.81 | 24.9 | 38.0 | 29.9
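A quick way to see where the parameter savings in these lightweight backbones (and in the depthwise separable Neck convolutions mentioned in the abstract) come from is the standard parameter-count comparison; the layer sizes below are illustrative, not taken from the paper:

```python
def conv_params(k, c_in, c_out):
    """Parameters of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def dw_separable_params(k, c_in, c_out):
    """Depthwise k x k conv (one filter per channel) + pointwise 1x1 conv."""
    return k * k * c_in + c_in * c_out

# Illustrative 3x3 layer with 256 input and 256 output channels.
std = conv_params(3, 256, 256)           # 589824
sep = dw_separable_params(3, 256, 256)   # 67840
print(f"{std / sep:.1f}x fewer parameters")
```

The ratio works out to roughly 1/c_out + 1/k², so for a 3×3 kernel the separable form needs close to one ninth of the parameters whenever the channel count is large.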
Table 3. Results of shuffle coordinate attention module experiments at different positions
Residual module | mAP(VOC)/% | mAP(SHWD)/% | Speed/(frame·s⁻¹)
Bottleneck | 80.36 | 85.91 | 26.1
SCA-Bottleneck-1 | 80.19 | 87.31 | 24.3
SCA-Bottleneck-2 | 80.98 | 87.98 | 23.2
SCA-Bottleneck-3 | 81.53 | 88.75 | 23.3
SCA-Bottleneck-4 | 80.56 | 86.95 | 24.0
Table 4. Ablation experiment results
Model | S-MobileNetV2 | SPPF | Reconstructed feature fusion network | mAP/% | Speed/(frame·s⁻¹)
M2-YOLO | | | | 85.91 | 25.4
M-YOLO | √ | | | 88.75 | 23.3
M-YOLO | √ | √ | | 89.47 | 26.9
M-YOLO | √ | √ | √ | 91.10 | 33.6
Table 5. Comparative experimental results of different models
Model | mAP(VOC)/% | mAP(SHWD)/% | FLOPs/10⁹ | Params/10⁶ | Speed/(frame·s⁻¹) | Model size/MiB
SSD[24] | 74.06 | 76.14 | 60.9 | 23.8 | 11.6 | 99.46
EfficientDet-d4[25] | 76.51 | 82.14 | 105.0 | 20.6 | 11.2 | 78.25
Faster R-CNN[26] | 76.86 | 85.01 | 369.7 | 136.7 | 7.2 | 523.69
YOLOv4[12] | 84.71 | 91.94 | 60.0 | 63.9 | 21.9 | 242.58
YOLOv5-M | 83.47 | 89.55 | 50.6 | 21.2 | 19.1 | 77.58
CenterNet[27] | 77.69 | 89.97 | 70.2 | 32.7 | 23.3 | 122.28
YOLOX-M[28] | 81.64 | 88.68 | 73.7 | 25.3 | 15.4 | 96.44
DETR[29] | 78.05 | 83.18 | 114.2 | 36.7 | 10.7 | 156.79
YOLOX-S[28] | 78.51 | 88.02 | 26.8 | 8.9 | 32.9 | 33.39
YOLOv4-tiny[30] | 72.24 | 78.49 | 6.8 | 5.9 | 48.1 | 22.42
YOLOv5-S[31] | 81.01 | 87.37 | 16.5 | 7.1 | 30.5 | 28.9
EfficientDet-d0[25] | 69.22 | 79.03 | 4.7 | 3.8 | 36.5 | 15.87
M-YOLO | 83.95 | 91.10 | 15.3 | 17.4 | 33.6 | 44.75
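The headline numbers in the abstract follow directly from the YOLOv4 and M-YOLO rows of Table 5; a few lines of Python reproduce them:

```python
# YOLOv4 and M-YOLO rows from Table 5: FLOPs/1e9, params/1e6, frame/s, MiB.
yolov4 = {"flops": 60.0, "params": 63.9, "fps": 21.9, "size": 242.58}
m_yolo = {"flops": 15.3, "params": 17.4, "fps": 33.6, "size": 44.75}

reduction = lambda a, b: round((a - b) / a * 100, 1)

print(reduction(yolov4["flops"], m_yolo["flops"]))    # 74.5 (% less computation)
print(reduction(yolov4["params"], m_yolo["params"]))  # 72.8 (% fewer parameters)
print(reduction(yolov4["size"], m_yolo["size"]))      # 81.6 (% smaller model)
print(round((m_yolo["fps"] / yolov4["fps"] - 1) * 100, 1))  # 53.4 (% faster)
print(round(91.94 - 91.10, 2))                        # 0.84 (mAP drop on SHWD, %)
```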
[1] FANG Weili, DING Lieyun. Artificial intelligence-based recognition and modification of workers' unsafe behavior[J]. Journal of Huazhong University of Science and Technology (Natural Science Edition), 2022, 50(8): 131-135.
[2] CHENG Deqiang, QIAN Jiansheng, GUO Xingge, et al. Review on key technologies of AI recognition for videos in coal mine[J]. Coal Science and Technology, 2023, 51(2): 349-365.
[3] CHENG Deqiang, XU Jinyang, KOU Qiqi, et al. Lightweight network based on residual information for foreign body classification on coal conveyor belt[J]. Journal of China Coal Society, 2022, 47(3): 1361-1369.
[4] LI Qirui. A research and implementation of safety-helmet video detection system based on human body recognition[D]. Chengdu: University of Electronic Science and Technology of China, 2017.
[5] SUN Xiaoming, XU Kaige, WANG Sen, et al. Detection and tracking of safety helmet in factory environment[J]. Measurement Science and Technology, 2021, 32(10). DOI: 10.1088/1361-6501/ac06ff.
[6] LI Tan, LYU Xinyue, LIAN Xiaofeng, et al. YOLOv4_Drone: UAV image target detection based on an improved YOLOv4 algorithm[J]. Computers & Electrical Engineering, 2021, 93(8). DOI: 10.1016/j.compeleceng.2021.107261.
[7] XU Shoukun, WANG Yaru, GU Yuwan, et al. Safety helmet wearing detection study based on improved Faster RCNN[J]. Application Research of Computers, 2020, 37(3): 901-905.
[8] WANG Xuanyu, NIU Dan, LUO Puxuan, et al. A safety helmet and protective clothing detection method based on improved-YoloV3[C]. Chinese Automation Congress, Shanghai, 2020: 5437-5441.
[9] LUO Xinyu. Construction site safety protection detection system based on deep learning[D]. Hangzhou: Hangzhou Dianzi University, 2020.
[10] LIANG Sicheng. Research on safety helmet wearing detection based on convolutional neural network[D]. Harbin: Harbin Institute of Technology, 2021.
[11] ZHANG Peiji. Research on detection methods of safety clothing and safety helmet in industrial surveillance video[D]. Wuhan: Huazhong University of Science and Technology, 2021.
[12] BOCHKOVSKIY A, WANG C Y, LIAO H Y M. YOLOv4: optimal speed and accuracy of object detection[C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, 2020: 9-12.
[13] SANDLER M, HOWARD A, ZHU Menglong, et al. MobileNetV2: inverted residuals and linear bottlenecks[C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, 2018: 4510-4520.
[14] HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904-1916. DOI: 10.1109/TPAMI.2015.2389824.
[15] LIU Shu, QI Lu, QIN Haifang, et al. Path aggregation network for instance segmentation[C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, 2018: 8759-8768.
[16] HOWARD A, SANDLER M, CHEN Bo, et al. Searching for MobileNetV3[C]. IEEE/CVF International Conference on Computer Vision, Seoul, 2019: 1314-1324.
[17] HAN Kai, WANG Yunhe, TIAN Qi, et al. GhostNet: more features from cheap operations[C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, 2020: 1577-1586.
[18] HU Jie, SHEN Li, SUN Gang. Squeeze-and-excitation networks[C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, 2018: 7132-7141.
[19] WOO S H, PARK J Y, LEE J Y, et al. CBAM: convolutional block attention module[C]. European Conference on Computer Vision, Munich, 2018: 3-19.
[20] HOU Qibin, ZHOU Daquan, FENG Jiashi. Coordinate attention for efficient mobile network design[C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, 2021: 13708-13717.
[21] ZHANG Xiangyu, ZHOU Xinyu, LIN Mengxiao, et al. ShuffleNet: an extremely efficient convolutional neural network for mobile devices[C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, 2018: 6848-6856.
[22] KOU Qiqi, HUANG Ji, CHENG Deqiang, et al. Person re-identification with intra-domain similarity grouping based on semantic fusion[J]. Journal on Communications, 2022, 43(7): 153-162.
[23] CHENG Deqiang, CHEN Liangliang, LYU Chen, et al. Light-guided and cross-fusion U-Net for anti-illumination image super-resolution[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2022, 32(12): 8436-8449. DOI: 10.1109/TCSVT.2022.3194169.
[24] LIU Wei, ANGUELOV D, ERHAN D, et al. SSD: single shot multibox detector[C]. European Conference on Computer Vision, Amsterdam, 2016: 21-37.
[25] TAN Mingxing, PANG Ruoming, QUOC V L. EfficientDet: scalable and efficient object detection[C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, 2020: 10781-10790.
[26] REN Shaoqing, HE Kaiming, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149. DOI: 10.1109/TPAMI.2016.2577031.
[27] DUAN Kaiwen, BAI Song, XIE Lingxi, et al. CenterNet: keypoint triplets for object detection[C]. IEEE/CVF International Conference on Computer Vision, Seoul, 2019: 6568-6577.
[28] GE Zheng, LIU Songtao, WANG Feng, et al. YOLOX: exceeding YOLO series in 2021[EB/OL]. [2023-08-03]. https://arxiv.org/abs/2107.08430.
[29] NICOLAS C, FRANCISCO M, GABRIEL S, et al. End-to-end object detection with transformers[C]. European Conference on Computer Vision, Glasgow, 2020: 213-229.
[30] WANG C Y, BOCHKOVSKIY A, LIAO H Y M. Scaled-YOLOv4: scaling cross stage partial network[C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, 2021: 13024-13033.
[31] Ultralytics. YOLOv5[EB/OL]. [2023-08-12]. https://github.com/ultralytics/yolov5.