Abstract:
In underground scenarios, safety helmets move with personnel, and targets captured from a long distance are small, which significantly increases detection difficulty. The underground environment is also complex: insufficient lighting, dust interference, severe occlusion, and cluttered backgrounds disturb feature extraction and reduce detection accuracy and stability. Existing lightweight design and acceleration strategies improve speed but often weaken the model's ability to represent fine details and small targets, resulting in insufficient detection accuracy. To address these issues, a safety helmet wearing detection method for underground scenarios was proposed that integrates feature enhancement and context awareness. First, a novel feature enhancement module (NFEM) was introduced, which strengthened semantic feature extraction for small targets through multi-branch convolution and dilated convolution structures, enabling the model to obtain more discriminative feature representations under low-light, occluded, or dusty conditions. Then, a novel feature fusion module (NFFM) was introduced, which adaptively adjusted features with a channel-weighting strategy during multi-scale feature fusion, improving detection accuracy without significantly increasing computational cost. Finally, an improved spatial context-aware module (ISCAM) was incorporated, which adopted a position-sensitive global context modeling mechanism to strengthen the spatial and channel dependencies among features, effectively enhancing the model's ability to detect weak-texture small targets and suppress complex backgrounds. Experimental results showed that: ① the proposed method achieved an mAP@0.5 of 0.86 on the CUMT-HelmeT dataset with a single-frame detection time of only 10.4 ms, and an mAP@0.5 of 0.88 on the SHWD dataset with a single-frame detection time of 12.2 ms. ② In complex scenarios such as strong light interference, long-distance small targets, and mutual occlusion of helmets, the proposed method exhibited higher detection confidence and lower missed detection rates than the YOLOv12s object detection method. ③ The proposed modules effectively guided the model to focus on key targets and suppress background interference, thereby improving detection accuracy and reliability.
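The abstract summarizes the three modules without implementation details. As a rough illustration only, the following PyTorch sketch shows one plausible way to build a multi-branch dilated-convolution enhancement block (in the spirit of the NFEM) and a channel-weighted fusion step (in the spirit of the NFFM); the class names, branch counts, dilation rates, and gating design are assumptions made here for illustration and are not taken from the paper.

```python
# Illustrative sketch only -- not the authors' implementation.
import torch
import torch.nn as nn


class MultiBranchDilatedBlock(nn.Module):
    """Parallel 3x3 convolutions with different dilation rates, fused by a 1x1 conv.

    Dilated branches enlarge the receptive field without downsampling, which is
    one common way to enrich semantic features for small targets.
    """

    def __init__(self, in_channels: int, out_channels: int, dilations=(1, 2, 3)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=3,
                          padding=d, dilation=d, bias=False),
                nn.BatchNorm2d(out_channels),
                nn.SiLU(inplace=True),
            )
            for d in dilations
        ])
        # 1x1 convolution fuses the concatenated branch outputs back to out_channels.
        self.fuse = nn.Conv2d(out_channels * len(dilations), out_channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = [branch(x) for branch in self.branches]
        return self.fuse(torch.cat(feats, dim=1))


class ChannelWeightedFusion(nn.Module):
    """Fuses two same-shape feature maps with learned per-channel weights."""

    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                          # global channel statistics
            nn.Conv2d(channels * 2, channels, kernel_size=1),  # per-channel gating logits
            nn.Sigmoid(),
        )

    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        w = self.gate(torch.cat([a, b], dim=1))  # weights in (0, 1), shape (N, C, 1, 1)
        return w * a + (1 - w) * b


if __name__ == "__main__":
    x = torch.randn(1, 64, 80, 80)               # a backbone feature map (hypothetical size)
    enhanced = MultiBranchDilatedBlock(64, 64)(x)
    fused = ChannelWeightedFusion(64)(enhanced, x)
    print(enhanced.shape, fused.shape)            # both torch.Size([1, 64, 80, 80])
```

In a YOLO-style detector, blocks of this kind would typically replace or augment standard convolution and concatenation steps in the neck; the ISCAM described above would additionally inject position-sensitive global context, which is not sketched here.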