A multi-modal detection method for holding ladders in underground climbing operations

SUN Qing, YANG Chaoyu

Citation: SUN Qing, YANG Chaoyu. A multi-modal detection method for holding ladders in underground climbing operations[J]. Journal of Mine Automation, 2024, 50(5): 142-150. doi: 10.13272/j.issn.1671-251x.2024010068


doi: 10.13272/j.issn.1671-251x.2024010068
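Funding: National Natural Science Foundation of China (61873004).
Details
    About the author:

    SUN Qing (born 2000), female, from Xinxiang, Henan Province; master's student; research interest: intelligent recognition of unsafe behavior in underground coal mines. E-mail: beryl2022@163.com

  • CLC number: TD67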

A multi-modal detection method for holding ladders in underground climbing operations

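  • Abstract: Most current research on recognizing unsafe behavior of underground personnel focuses on improving accuracy with computer vision. However, occlusion, unstable lighting, and reflections are common underground, so computer vision alone struggles to recognize unsafe behavior accurately. In particular, similar actions in climbing operations, such as climbing a ladder and holding a ladder, are easily confused during recognition, which poses a safety hazard. To address these problems, a multi-modal detection method for holding ladders in underground climbing operations is proposed. The method analyzes surveillance video data in two modalities, visual and audio. In the visual modality, the YOLOv8 model detects whether a ladder is present. If so, the position coordinates of the ladder are obtained and the video segment is passed to the OpenPose algorithm for pose estimation, yielding the features of each skeletal joint point of the human body. The sequences of skeletal joint points are fed into an improved spatial-temporal attention graph convolutional network (SAT-GCN) to obtain human action labels and their corresponding probabilities. In the audio modality, the PaddlePaddle automatic speech recognition system converts speech to text, and the bidirectional encoder representations from transformers (BERT) model analyzes and extracts features from the text to obtain text labels and their corresponding probabilities. Finally, the information from the visual and audio modalities is fused at the decision level to determine whether a dedicated person is holding the ladder during the underground climbing operation. Experimental results show that, in skeleton-based action recognition, the optimized SAT-GCN model improves recognition precision for the three actions of holding a ladder, climbing a ladder, and standing by 3.36, 2.83, and 10.71 percentage points, respectively. The multi-modal detection method achieves higher recognition accuracy than single-modality methods, reaching 98.29%.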
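    The abstract does not spell out the fusion rule, so the following is only a minimal sketch of one common decision-level scheme: a weighted average of the two per-modality probabilities followed by a threshold. The function name `fuse_decisions`, the default weight, and the default threshold are illustrative assumptions, not values taken from the paper.

```python
from typing import Tuple


def fuse_decisions(
    visual_prob_held: float,
    audio_prob_held: float,
    visual_weight: float = 0.5,   # assumed weight, not from the paper
    threshold: float = 0.5,       # assumed decision threshold, not from the paper
) -> Tuple[bool, float]:
    """Fuse two modality-level probabilities that someone is holding the ladder.

    visual_prob_held: probability from the skeleton-based action classifier.
    audio_prob_held:  probability from the text classifier run on the transcript.
    Returns (ladder_is_held, fused_probability).
    """
    for name, value in (
        ("visual_prob_held", visual_prob_held),
        ("audio_prob_held", audio_prob_held),
        ("visual_weight", visual_weight),
        ("threshold", threshold),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")

    # Weighted average of the two independent decisions (late fusion).
    fused = visual_weight * visual_prob_held + (1.0 - visual_weight) * audio_prob_held
    return fused >= threshold, fused


if __name__ == "__main__":
    held, score = fuse_decisions(visual_prob_held=0.9, audio_prob_held=0.7)
    print(f"ladder held: {held}, fused probability: {score:.2f}")
```

    A weighted average keeps each modality's classifier independent, so a modality degraded by occlusion or noise can be down-weighted without retraining the other. Other decision-level rules, such as voting or taking the maximum, would fit the same interface.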

     

  • Figure 1. Technical route of the multi-modal detection method for holding ladders in underground climbing operations
    Figure 2. Network architecture of YOLOv8
    Figure 3. Network architecture of OpenPose
    Figure 4. Skeletal joint points of underground personnel
    Figure 5. Architecture of ST-GCN
    Figure 6. Composition of a single spatial-temporal graph convolutional unit in SAT-GCN
    Figure 7. Structure of the spatial attention module (SAM)
    Figure 8. Model architecture of BERT
    Figure 9. Flow for identifying unsafe behavior in climbing operations
    Figure 10. Visualization of training results
    Figure 11. Ladder recognition results in different environments
    Figure 12. Recognition results for whether a dedicated person is holding the ladder in underground climbing operations
    Figure 13. Loss curves of the SAT-GCN and ST-GCN models

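    Table 1. Partial text training data

    Text | Label
    "Just be careful going up on your own. No need to find someone to hold the ladder, it's not necessary." "I went up alone last time and it was fine." | 0
    "Let me hold the ladder for you." "No need, I'm very experienced, it's fine." | 0
    "I'm going to carry a ladder up to repair the roof support. Come and hold it for me." | 1
    "The ground there is uneven and the ladder won't stand steady. I'll hold it for you." | 1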

    Table 2. Comparison of recognition precision of models on different actions (%)

    Action | ST-GCN | SAT-GCN
    Holding ladder | 94.31 | 97.67
    Climbing ladder | 72.89 | 75.72
    Standing | 75.00 | 85.71
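    As a quick arithmetic check, the per-action gains quoted in the abstract can be recomputed from the Table 2 values. The snippet below is a small sketch; the dictionary keys are English labels chosen here for illustration, and the gains are differences in percentage points, not relative improvements.

```python
# Precision values (%) copied from Table 2: (ST-GCN, SAT-GCN) per action.
precision = {
    "holding ladder": (94.31, 97.67),
    "climbing ladder": (72.89, 75.72),
    "standing": (75.00, 85.71),
}

# Gain of SAT-GCN over ST-GCN in percentage points, rounded to two decimals
# to suppress floating-point noise in the subtraction.
gains = {
    action: round(sat_gcn - st_gcn, 2)
    for action, (st_gcn, sat_gcn) in precision.items()
}

for action, gain in gains.items():
    print(f"{action}: +{gain:.2f} percentage points")
```

    The recomputed differences agree with the 3.36, 2.83, and 10.71 reported in the abstract.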

    Table 3. Experimental comparison of models on the self-built dataset (%)

    Model | Accuracy
    ST-GCN | 80.39
    SAT-GCN | 82.35
    VA-RNN | 77.25
    2s-AGCN | 84.97
    Multi-modal fusion-based detection model for holding ladders in underground climbing operations | 98.29
Publication history
  • Received: 2024-01-22
  • Revised: 2024-05-20
  • Published online: 2024-06-13
