Inspection behavior recognition in underground power distribution rooms based on an improved two-stream CNN method

  • Abstract: Surveillance videos of underground power distribution rooms are long and contain complex behavior types, and the traditional two-stream convolutional neural network (CNN) method recognizes such behaviors poorly. To address this problem, the two-stream CNN method is improved, and an inspection behavior recognition method for underground power distribution rooms based on the improved two-stream method is proposed. Through scene analysis, inspection behaviors are divided into five types: standing inspection, squatting inspection, walking, standing recording, and seated recording, and the inspection behavior dataset IBDS5 is built. Each inspection behavior video is divided into three equal parts, corresponding to the start, middle, and end of the inspection. Each part is randomly sampled to obtain RGB images representing spatial features and consecutive optical flow images representing motion features, which are fed into the spatial-stream network and the temporal-stream network, respectively, for feature extraction. The predicted features of the two networks are then combined by weighted fusion to obtain the inspection behavior recognition result. Experimental results show that a two-stream fusion network based on the ResNet152 architecture, with a spatial-to-temporal weight ratio of 1:2, achieves high recognition accuracy, with a Top-1 accuracy of 98.92%. On both the IBDS5 dataset and the public UCF101 dataset, the recognition accuracy of the proposed method is higher than that of existing methods such as 3D-CNN and the traditional two-stream CNN.
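The sampling and fusion steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the per-class score lists, and the choice of a normalized weighted average are assumptions made for illustration, and the abstract does not say how per-segment predictions are aggregated within each stream. Only the three-part split, the random sampling, the five class names, and the 1:2 spatial-to-temporal weight ratio come from the abstract.

```python
import random

# The five IBDS5 behavior classes named in the abstract.
CLASSES = [
    "standing inspection",
    "squatting inspection",
    "walking",
    "standing recording",
    "seated recording",
]


def sample_segment_indices(num_frames, num_segments=3, rng=None):
    """Pick one random frame index from each of `num_segments` equal parts.

    With num_segments=3 the parts correspond to the start, middle, and end
    of an inspection video.
    """
    if num_frames < num_segments:
        raise ValueError("video has fewer frames than segments")
    rng = rng or random.Random()
    # Integer boundaries that split [0, num_frames) into equal parts.
    bounds = [i * num_frames // num_segments for i in range(num_segments + 1)]
    return [rng.randrange(bounds[i], bounds[i + 1]) for i in range(num_segments)]


def fuse_scores(spatial, temporal, w_spatial=1.0, w_temporal=2.0):
    """Weighted average of per-class scores from the two streams.

    The default weights follow the 1:2 spatial-to-temporal ratio reported in
    the abstract; normalizing by the weight sum is an assumption.
    """
    if len(spatial) != len(temporal):
        raise ValueError("score vectors must have the same length")
    total = w_spatial + w_temporal
    return [(w_spatial * s + w_temporal * t) / total
            for s, t in zip(spatial, temporal)]


def predict(spatial, temporal):
    """Return the class name with the highest fused score."""
    fused = fuse_scores(spatial, temporal)
    best = max(range(len(fused)), key=fused.__getitem__)
    return CLASSES[best]
```

For a 90-frame video, `sample_segment_indices(90)` returns one index from each of the ranges 0-29, 30-59, and 60-89. In `predict`, the temporal stream counts twice as much as the spatial stream, so a confident motion-based prediction can override a spatial one.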

     
