Visible and infrared image fusion algorithm for underground personnel detection
-
Abstract: Mine-used intelligent vehicles operate under complex lighting conditions. For underground personnel detection, fusing visible and infrared images merges infrared reflection information and fine texture detail into the visible image, improving detection performance. Traditional visible-infrared fusion methods blur image edges and textures as the number of decomposition layers increases, and their fusion time grows accordingly. Existing deep-learning fusion methods struggle to balance the features drawn from the visible and infrared inputs, leaving detail in the fused image blurred. To address these problems, an image fusion algorithm based on multiple attention modules (IFAM) is proposed. First, convolutional neural networks extract features from the visible and infrared images. Then, spatial-attention and channel-attention modules cross-fuse the extracted features; the gradient information within the features is used to compute fusion weights for the outputs of the two attention modules, and the two outputs are combined according to these weights. Finally, a deconvolution transform restores the image features to produce the fused image. Fusion results on the RoadScene and TNO datasets show that IFAM-fused images retain both the background texture of the visible image and the personnel contours of the infrared image. Results on an underground dataset show that, in low-light environments, infrared imagery compensates for the weaknesses of visible light and is unaffected by other light sources in the scene, so personnel contours remain distinct in the fused image even under weak illumination. Comparative analysis shows that images fused by IFAM achieve an information entropy (EN) of 4.901 3, a standard deviation (SD) of 88.521 4, a gradient-based fusion metric (QAB/F) of 0.169 3, a visual information fidelity for fusion (VIFF) of 1.413 5, and a union structural similarity measure (SSIMu) of 0.806 2, outperforming comparable algorithms such as LLF-IOI and NDM overall.
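The gradient-weighted combination of the two attention branches described above can be made concrete with a short sketch. The following PyTorch fragment is a minimal illustration, assuming the fusion weights are derived from the normalised mean absolute spatial gradient of each branch's output; the paper's exact attention internals and weighting formula are not reproduced here.

```python
# Minimal sketch of the gradient-weighted fusion of the spatial- and
# channel-attention outputs described in the abstract (PyTorch).
# The weighting rule below (normalised mean absolute gradient) is an
# illustrative assumption, not the paper's exact formula.
import torch


def gradient_energy(feat: torch.Tensor) -> torch.Tensor:
    """Mean absolute spatial gradient of a (B, C, H, W) feature map."""
    dx = feat[..., :, 1:] - feat[..., :, :-1]   # horizontal differences
    dy = feat[..., 1:, :] - feat[..., :-1, :]   # vertical differences
    g = dx.abs().mean(dim=(1, 2, 3)) + dy.abs().mean(dim=(1, 2, 3))
    return g.view(-1, 1, 1, 1)


def fuse_attention_outputs(f_spatial: torch.Tensor,
                           f_channel: torch.Tensor) -> torch.Tensor:
    """Blend the two attention branches according to their gradient energy."""
    g_s = gradient_energy(f_spatial)
    g_c = gradient_energy(f_channel)
    w_s = g_s / (g_s + g_c + 1e-8)              # normalised fusion weight
    return w_s * f_spatial + (1.0 - w_s) * f_channel
```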
-
Table 1. Configuration of the encoder-decoder network

| Module | Layer | Kernel size | Stride | Input channels | Output channels | Activation |
|---|---|---|---|---|---|---|
| Pre-processing layer | Conv | 3 | 1 | 1 | 16 | ReLU |
| Encoder | Encoder block 1 | − | − | 16 | 64 | − |
| | Encoder block 2 | − | − | 64 | 112 | − |
| | Encoder block 3 | − | − | 112 | 160 | − |
| | Encoder block 4 | − | − | 160 | 208 | − |
| Decoder | Decoder block 1 | − | − | 368 | 160 | − |
| | Decoder block 2 | − | − | 272 | 112 | − |
| | Decoder block 3 | − | − | 384 | 112 | − |
| | Decoder block 4 | − | − | 176 | 64 | − |
| | Decoder block 5 | − | − | 240 | 64 | − |
| | Decoder block 6 | − | − | 304 | 64 | − |
| Post-processing layer | Conv | 1 | 1 | 64 | 1 | ReLU |
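Read as code, the channel configuration in Table 1 might look like the sketch below. The internal structure of each encoder/decoder block is not specified in this excerpt, so a single 3×3 convolution stands in for each block; the decoder input widths (368 = 208 + 160, 272 = 160 + 112, and so on) suggest concatenated multi-scale skip connections, noted only in a comment.

```python
# Sketch of the encoder side of Table 1 (PyTorch). Block internals are
# placeholders; only the channel widths come from the table.
import torch
import torch.nn as nn


class Block(nn.Module):
    """Stand-in for an encoder block; Table 1 does not fix its internals."""
    def __init__(self, c_in: int, c_out: int):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(self.conv(x))


class Encoder(nn.Module):
    """Pre-processing layer (1 -> 16) plus encoder blocks 1-4 of Table 1."""
    def __init__(self):
        super().__init__()
        self.pre = nn.Sequential(nn.Conv2d(1, 16, 3, stride=1, padding=1),
                                 nn.ReLU())
        widths = [16, 64, 112, 160, 208]
        self.blocks = nn.ModuleList(Block(widths[i], widths[i + 1])
                                    for i in range(4))

    def forward(self, x: torch.Tensor) -> list[torch.Tensor]:
        x = self.pre(x)
        feats = []
        for blk in self.blocks:
            x = blk(x)
            feats.append(x)
        # The decoder input widths in Table 1 (e.g. 368 = 208 + 160) imply
        # these multi-scale features are later concatenated as skip inputs.
        return feats
```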
Table 2. Metrics of the image fusion algorithms on the underground dataset
| Algorithm | SD | EN | QAB/F | VIFF | SSIMu |
|---|---|---|---|---|---|
| LLF-IOI | 83.396 2 | 5.147 2 | 0.164 7 | 1.154 5 | 0.647 5 |
| NDM | 69.575 9 | 5.212 6 | 0.168 7 | 1.216 9 | 0.753 6 |
| PA-PCNN | 85.478 9 | 5.363 4 | 0.165 4 | 1.496 3 | 0.758 6 |
| TA-cGAN | 85.446 8 | 5.442 5 | 0.203 6 | 1.425 4 | 0.723 0 |
| U2fuse | 76.093 6 | 5.502 3 | 0.123 9 | 0.832 9 | 0.473 2 |
| IFAM | 88.521 4 | 4.901 3 | 0.169 3 | 1.413 5 | 0.806 2 |
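For reference, EN and SD in Table 2 follow the standard definitions from the fusion literature: the Shannon entropy of the grey-level histogram and the pixel standard deviation. A minimal NumPy sketch, assuming an 8-bit grayscale fused image:

```python
# EN (Shannon entropy of the grey-level histogram) and SD (pixel standard
# deviation) for an 8-bit grayscale fused image, per the usual definitions.
import numpy as np


def en_and_sd(img: np.ndarray) -> tuple[float, float]:
    hist, _ = np.histogram(img, bins=256, range=(0, 256))
    p = hist / hist.sum()                  # grey-level probabilities
    p = p[p > 0]                           # drop empty bins (log2(0) undefined)
    en = float(-(p * np.log2(p)).sum())
    sd = float(img.astype(np.float64).std())
    return en, sd
```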
Table 3. Ablation results for each module of the multi-attention feature fusion strategy
| Channel attention | Spatial attention | Information-retention weight | EN | SD | QAB/F | VIFF | SSIMu |
|---|---|---|---|---|---|---|---|
| √ | | | 5.003 1 | 75.254 8 | 0.089 3 | 0.152 0 | 0.450 1 |
| | √ | | 5.089 6 | 70.521 7 | 0.082 7 | 0.178 2 | 0.447 1 |
| √ | √ | | 4.836 9 | 82.862 7 | 0.112 9 | 0.110 4 | 0.462 4 |
| √ | √ | √ | 5.112 3 | 83.521 3 | 0.089 3 | 0.183 6 | 0.470 3 |
Table 4. Experimental results of IFAM under different combinations of α and β
| α | β | EN | SD | QAB/F | VIFF | SSIMu |
|---|---|---|---|---|---|---|
| 0.1 | 1 | 3.309 7 | 58.659 3 | 0.063 9 | 0.814 7 | 0.438 7 |
| 0.1 | 10 | 3.600 7 | 60.078 0 | 0.058 0 | 0.909 1 | 0.476 5 |
| 0.1 | 100 | 3.452 6 | 63.853 6 | 0.106 5 | 1.065 6 | 0.556 3 |
| 0.1 | 1 000 | 4.325 5 | 64.241 7 | 0.106 4 | 1.149 8 | 0.545 6 |
| 0.5 | 1 | 4.063 4 | 60.933 6 | 0.013 2 | 1.234 1 | 0.596 7 |
| 0.5 | 10 | 4.383 7 | 63.111 1 | 0.104 5 | 1.133 1 | 0.565 7 |
| 0.5 | 100 | 4.115 9 | 73.417 1 | 0.109 9 | 1.334 1 | 0.676 7 |
| 0.5 | 1 000 | 4.296 1 | 74.569 5 | 0.116 4 | 1.421 5 | 0.643 2 |
| 1 | 1 | 4.147 5 | 68.942 3 | 0.165 8 | 1.165 7 | 0.563 4 |
| 1 | 10 | 4.308 9 | 73.922 1 | 0.145 6 | 1.134 4 | 0.624 5 |
| 1 | 100 | 4.391 5 | 76.345 5 | 0.121 3 | 1.480 2 | 0.705 0 |
| 1 | 1 000 | 4.051 7 | 76.876 9 | 0.125 6 | 1.265 2 | 0.578 5 |
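Table 4 amounts to a grid search over α and β. A sketch of that sweep follows, where `train_and_evaluate` is a hypothetical stand-in for training IFAM with one (α, β) pair and returning the five metrics; it is not part of the paper.

```python
# Hypothetical sweep reproducing the structure of Table 4. The function
# `train_and_evaluate` is an assumed placeholder, not part of the paper.
from itertools import product

ALPHAS = [0.1, 0.5, 1.0]
BETAS = [1, 10, 100, 1000]


def sweep(train_and_evaluate):
    """Run every (alpha, beta) pair; return all metrics and the best by SSIMu."""
    results = {}
    for alpha, beta in product(ALPHAS, BETAS):
        # Expected to return e.g. {"EN": ..., "SD": ..., "QAB/F": ...,
        # "VIFF": ..., "SSIMu": ...}
        results[(alpha, beta)] = train_and_evaluate(alpha=alpha, beta=beta)
    best = max(results, key=lambda k: results[k]["SSIMu"])
    return results, best
```
-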
[1] ZHOU Libing. Research on unmanned driving system of underground trackless rubber-tyred vehicle in coal mine[J]. Journal of Mine Automation, 2022, 48(6): 36-48.
[2] MA Jiayi, MA Yong, LI Chang. Infrared and visible image fusion methods and applications: a survey[J]. Information Fusion, 2019, 45: 153-178. DOI: 10.1016/j.inffus.2018.02.004.
[3] WANG Zhishe, XU Jiawei, JIANG Xiaolin, et al. Infrared and visible image fusion via hybrid decomposition of NSCT and morphological sequential toggle operator[J]. Optik, 2020, 201. DOI: 10.1016/j.ijleo.2019.163497.
[4] MALLAT S G. A theory for multiresolution signal decomposition: the wavelet representation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1989, 11(7): 674-693. DOI: 10.1109/34.192463.
[5] XIAO Yang, ZHANG Lingxue, FU Ling. Problems related to Contourlet transformation and directional filter bank design[J]. Technology Wind, 2009(23): 221. DOI: 10.3969/j.issn.1671-7341.2009.23.199.
[6] DA CUNHA A L, ZHOU Jianping, DO M N. The nonsubsampled contourlet transform: theory, design, and applications[J]. IEEE Transactions on Image Processing, 2006, 15(10): 3089-3101. DOI: 10.1109/TIP.2006.877507.
[7] LUO Juan, WANG Liping. Infrared and visible image fusion algorithm based on nonsubsampled contourlet transform coupled with feature selection mechanism[J]. Journal of Electronic Measurement and Instrumentation, 2021, 35(7): 163-169.
[8] GAO Ce, QI Donghao, ZHANG Yanchao, et al. Infrared and visible image fusion method based on ResNet in a nonsubsampled contourlet transform domain[J]. IEEE Access, 2021, 9: 91883-91895. DOI: 10.1109/ACCESS.2021.3086096.
[9] ZHAN Lingchao, LIU Jin. Infrared and visible image fusion method based on nonsubsampled contourlet transform[J]. Digital Technology and Application, 2016(10): 45-46.
[10] FARBMAN Z, FATTAL R, LISCHINSKI D, et al. Edge-preserving decompositions for multi-scale tone and detail manipulation[J]. ACM Transactions on Graphics, 2008, 27(3): 1-10.
[11] ZHANG Yu, LIU Yu, SUN Peng, et al. IFCNN: a general image fusion framework based on convolutional neural network[J]. Information Fusion, 2020, 54: 99-118. DOI: 10.1016/j.inffus.2019.07.011.
[12] LI Hui, WU Xiaojun, KITTLER J. RFN-Nest: an end-to-end residual fusion network for infrared and visible images[J]. Information Fusion, 2021, 73: 72-86. DOI: 10.1016/j.inffus.2021.02.023.
[13] WANG Zhishe, WANG Junyao, WU Yuanyuan, et al. UNFusion: a unified multi-scale densely connected network for infrared and visible image fusion[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2022, 32(6): 3360-3374. DOI: 10.1109/TCSVT.2021.3109895.
[14] LIU Yu, CHEN Xun, CHENG Juan, et al. Infrared and visible image fusion with convolutional neural networks[J]. International Journal of Wavelets, Multiresolution and Information Processing, 2018, 16(3). DOI: 10.1142/S0219691318500182.
[15] LI Hui, WU Xiaojun. DenseFuse: a fusion approach to infrared and visible images[J]. IEEE Transactions on Image Processing, 2018, 28(5): 2614-2623.
[16] LUO Di, WANG Congqing, ZHOU Yongjun. A visible and infrared image fusion method based on generative adversarial networks and attention mechanism[J]. Infrared Technology, 2021, 43(6): 566-574.
[17] WANG Zhishe, SHAO Wenyu, YANG Fengbao, et al. Infrared and visible image fusion method via interactive attention based generative adversarial network[J]. Acta Photonica Sinica, 2022, 51(4): 318-328.
[18] MA Jiayi, ZHANG Hao, SHAO Zhenfeng, et al. GANMcC: a generative adversarial network with multiclassification constraints for infrared and visible image fusion[J]. IEEE Transactions on Instrumentation and Measurement, 2021, 70: 1-14.
[19] RAO Dongyu, WU Xiaojun, XU Tianyang. TGFuse: an infrared and visible image fusion approach based on transformer and generative adversarial network[EB/OL]. [2023-06-20]. https://arxiv.org/abs/2201.10147.
[20] LI Jing, ZHU Jianming, LI Chang, et al. CGTF: convolution-guided transformer for infrared and visible image fusion[J]. IEEE Transactions on Instrumentation and Measurement, 2022, 71: 1-14.
[21] JIANG Yifan, GONG Xinyu, LIU Ding, et al. EnlightenGAN: deep light enhancement without paired supervision[J]. IEEE Transactions on Image Processing, 2021, 30: 2340-2349. DOI: 10.1109/TIP.2021.3051462.
[22] QIN Peilin, ZHANG Chuanwei, ZHOU Libing, et al. Research on 3D target detection of unmanned trackless rubber-tyred vehicle in coal mine[J]. Industry and Mine Automation, 2022, 48(2): 35-41.
[23] WOO S, PARK J, LEE J-Y, et al. CBAM: convolutional block attention module[C]. 15th European Conference on Computer Vision, Munich, 2018: 3-19.
[24] LIANG Meiyan, ZHANG Qiannan, REN Zhuyun, et al. Research on identification of colon pathology image based on attention mechanism[J]. Journal of Test and Measurement Technology, 2022, 36(2): 93-100.
[25] NIU Yue, WANG Annan, WU Shengxi. Pose estimation based on attention module and CPN[J/OL]. Journal of East China University of Science and Technology (Natural Science Edition): 1-11 [2023-06-20]. DOI: 10.14135/j.cnki.1006-3080.20220715003.
[26] ZAGORUYKO S, KOMODAKIS N. Paying more attention to attention: improving the performance of convolutional neural networks via attention transfer[EB/OL]. [2023-06-20]. https://arxiv.org/abs/1612.03928v2.
[27] CHEN Wu, SUN Junmei, LI Xiumei. Multi-scale residual and attention mechanism fusion based prediction for the progression of idiopathic pulmonary fibrosis[J]. Journal of Image and Graphics, 2022, 27(3): 812-826.
[28] LI Guoliang, XIANG Wenhao, ZHANG Shunli, et al. Infrared and visible image fusion algorithm based on residual network and attention mechanism[J]. Unmanned Systems Technology, 2022, 5(2): 9-21.
[29] XU Han, MA Jiayi, JIANG Junjun, et al. U2Fusion: a unified unsupervised image fusion network[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 44(1): 502-518.
[30] LIN T-Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: common objects in context[C]. 13th European Conference on Computer Vision, Zurich, 2014: 740-755.
[31] ROBERTS J W, VAN AARDT J A, AHMED F B. Assessment of image fusion procedures using entropy, image quality, and multispectral classification[J]. Journal of Applied Remote Sensing, 2008, 2(1). DOI: 10.1117/1.2945910.
[32] XYDEAS C, PETROVIC V. Objective image fusion performance measure[J]. Electronics Letters, 2000, 36(4): 308-309. DOI: 10.1049/el:20000267.
[33] HAN Yu, CAI Yunze, CAO Yin, et al. A new image fusion performance metric based on visual information fidelity[J]. Information Fusion, 2013, 14(2): 127-135. DOI: 10.1016/j.inffus.2011.08.002.
[34] WU Minghui. Research on image fusion based on joint feature extraction[D]. Wuhan: Wuhan University, 2021.
[35] XU Han. RoadScene: a new dataset of aligned infrared and visible images[DB/OL]. [2023-06-07]. https://github.com/hanna-xu/RoadScene.
[36] TOET A. TNO image fusion dataset[DB/OL]. [2023-06-26]. https://figshare.com/articles/dataset/TNO_Image_Fusion_Dataset/1008029.
[37] DU Jiao, LI Weisheng, XIAO Bin. Anatomical-functional image fusion by information of interest in local Laplacian filtering domain[J]. IEEE Transactions on Image Processing, 2017, 26(12): 5855-5866. DOI: 10.1109/TIP.2017.2745202.
[38] LIU Zhe, SONG Yuqing, SHENG V S, et al. MRI and PET image fusion using the nonparametric density model and the theory of variable-weight[J]. Computer Methods and Programs in Biomedicine, 2019, 175: 73-82. DOI: 10.1016/j.cmpb.2019.04.010.
[39] YIN Ming, LIU Xiaoning, LIU Yu, et al. Medical image fusion with parameter-adaptive pulse coupled neural network in nonsubsampled shearlet transform domain[J]. IEEE Transactions on Instrumentation and Measurement, 2018, 68(1): 49-64.
[40] KANG Jiayin, LU Wu, ZHANG Wenjuan. Fusion of brain PET and MRI images using tissue-aware conditional generative adversarial network with joint loss[J]. IEEE Access, 2020, 8: 6368-6378. DOI: 10.1109/ACCESS.2019.2963741.