2023 Vol. 49, No. 11

Overview of key technologies for mine-wide intelligent video analysis
CHENG Deqiang, KOU Qiqi, JIANG He, XU Feixiang, SONG Tianshu, WANG Xiaoyi, QIAN Jiansheng
2023, 49(11): 1-21. doi: 10.13272/j.issn.1671-251x.18165
Abstract:
Intelligence is the direction of coal mine development, and intelligent video analysis is an effective way to promote coal mine intelligence. The mine-wide intelligent video analysis technology has real-time monitoring, early warning, and decision support capabilities. It helps to improve the safety, production efficiency, resource utilization efficiency, and environmental sustainability of mining enterprises. The key technologies of mine-wide intelligent video analysis are introduced in detail, including video acquisition and processing technologies such as video acquisition equipment, video pre-processing, video compression and coding, basic video analysis technologies such as object detection and tracking, motion detection and analysis, object recognition and classification, and advanced video analysis technologies such as behaviour recognition and analysis, event detection and alarm, video monitoring and arming. A mining intelligent AI visual intelligence service platform that integrates video recognition analysis and industrial linkage control functions is developed. The paper introduces the application of intelligent video analysis technology in mining production scenarios such as intelligent water and gas exploration and discharge systems, coal rock recognition and cutting systems, heading working faces, fully mechanized working faces, coal flow transportation systems, mine hoist systems, auxiliary transportation systems, coal preparation plants, and intelligent loading and coal blending systems. The analysis points out that the current mine-wide intelligent video analysis technology still faces challenges in terms of video quality, complex backgrounds, real-time requirements, data privacy and security, system reliability and stability, etc. 
It is suggested that future research strengthen algorithm improvement and optimization, multimodal data fusion, real-time analysis and edge computing, reinforcement learning and autonomous decision-making, data privacy and security protection, and hardware equipment and sensor technology, so as to comprehensively promote the development of mine-wide intelligent video analysis technology and advance the process of mine intelligence.
Research status and development trend of visual processing technology for fully mechanized excavation systems
DU Yuxin, ZHANG He, WANG Shuchen, ZHANG Jianhua
2023, 49(11): 22-38, 75. doi: 10.13272/j.issn.1671-251x.2023090042
Abstract:
Machine vision technology has the advantages of non-contact measurement, large amount of information acquisition, and strong data processing capability. Applying it to fully mechanized excavation faces is of great significance for improving the efficiency of fully mechanized excavation work, ensuring the safety of personnel and equipment, and reducing accidents. This article summarizes the specific application and development of visual processing technology in coal mine fully mechanized excavation systems in recent years. Based on the task division of fully mechanized excavation working faces and combined with specific practical cases, this paper focuses on the analysis of the application of machine vision technology in visual inspection and positioning, safety monitoring and accident prevention, and equipment automation and intelligence. By analyzing the structures and detection principles of various visual detection systems in different application scenarios, the technical performance, workflow, and advantages and disadvantages of visual processing technology in the application of fully mechanized excavation face engineering are clarified. This study analyzes the challenges of visual technology in the application of fully mechanized excavation face, including environmental adaptability issues, narrow imaging field of view, and the need to improve the robustness and reliability of intelligent algorithms. It is pointed out that multi-sensor information fusion technology, equipment group cooperative control technology and digital twin-driven remote monitoring technology are the new directions that need to be focused on in the future development of the intelligent equipment system of coal mine based on machine vision.
Research on multi object detection in mining face based on FBEC-YOLOv5s
ZHANG Hui, SU Guoyong, ZHAO Dongyang
2023, 49(11): 39-45. doi: 10.13272/j.issn.1671-251x.2023060063
Abstract:
A multi-object detection algorithm based on FBEC-YOLOv5s is proposed to address the reduced detection precision caused by large object scale spans, severe occlusion between multiple objects, and harsh environments in mining faces. Firstly, the FasterNet network is introduced into the backbone network to enhance the model's feature extraction and semantic information capture capabilities through its residual connection and batch normalization module. Secondly, the BiFPN network is fused into the neck of the YOLOv5s model to achieve rapid capture and fusion of multi-scale features through its bidirectional cross-scale connections and fast normalized fusion operation. Finally, the ECIoU loss function is used instead of the CIoU loss function to improve the positioning precision of the detection box and the convergence speed of the model. The experimental results show the following points. ① While meeting the real-time detection requirements of coal mines, the precision of the FBEC-YOLOv5s model is 3.6% higher than that of the YOLOv5s model. ② Compared with the YOLOv5s model, the average detection precision of the FBEC-YOLOv5s model increases by 2.8% to 92.4%, which can meet real-time detection requirements. ③ The FBEC-YOLOv5s model has good comprehensive detection performance, demonstrating good real-time detection capability and robustness under the conditions that normally reduce detection accuracy: harsh environments, severe mutual occlusion between multiple objects, and large object scale spans.
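The fast normalized fusion operation mentioned for the BiFPN neck can be illustrated with a minimal sketch. It assumes the commonly published form, in which each input feature gets a learnable weight that is clipped to be non-negative and divided by the sum of all weights plus a small epsilon. The function name, the list-based features, and the epsilon value are illustrative assumptions, not details taken from the paper.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-length feature vectors with non-negative, normalized weights."""
    # ReLU keeps every learnable weight non-negative (assumed form of the operation).
    clipped = [max(0.0, w) for w in weights]
    total = sum(clipped) + eps
    length = len(features[0])
    return [
        sum(w * feat[j] for w, feat in zip(clipped, features)) / total
        for j in range(length)
    ]


# Hypothetical features arriving at one neck node from two paths.
top_down = [1.0, 2.0, 3.0]
lateral = [3.0, 2.0, 1.0]
fused = fast_normalized_fusion([top_down, lateral], weights=[1.0, 1.0])
print(fused)
```

Because the normalization is a plain division rather than a softmax, the fusion stays cheap, which is consistent with the real-time goal stated in the abstract.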
Real time detection of foreign objects in belt conveyors based on Faster-YOLOv7
TANG Jun, LI Jingzhao, SHI Qing, YANG Ping, WANG Rui
2023, 49(11): 46-52, 66. doi: 10.13272/j.issn.1671-251x.2023020037
Abstract:
Object detection algorithms based on deep learning have good recognition performance in foreign object detection, but their model memory requirement is large and their detection speed is slow. Lightweight deep learning networks can significantly reduce model memory requirements and improve detection speed, but their detection precision is low in weak-light environments underground. In order to solve the above problems, a real-time foreign object detection algorithm for belt conveyors based on Faster-YOLOv7 is proposed. Contrast limited adaptive histogram equalization (CLAHE) is used for image enhancement to improve the contrast of foreign objects in low-light environments. A lightweight YOLOv7 backbone network is designed based on MobileNetV3 to reduce the computation and parameter count of the YOLOv7 model. An effective channel attention mechanism is added to alleviate the loss of high-level feature information caused by the decrease in the number of feature channels. Alpha-IoU is used as the loss function to improve the precision of foreign object detection. The experimental results show the following points. ① The initial loss of Faster-YOLOv7 is 0.143, and it finally stabilizes at around 0.039. ② The detection speed of Faster-YOLOv7 can reach 42 frames/s, which is 17 and 20 frames/s higher than YOLOv5 and YOLOv7, respectively. The Faster-YOLOv7 model occupies 14 MiB of memory, which is 29 and 57 MiB lower than YOLOv5 and YOLOv7, respectively. The detection accuracy reaches 91.3%, which is 8.8% higher than YOLOv5. ③ When the SSD, YOLOv5, lightweight YOLOv7, and Faster-YOLOv7 object detection algorithms are applied to coal conveying images and videos of underground belt conveyors in coal mines, SSD misses detections during video detection. The YOLO series models effectively recognize the foreign objects to be detected, and the Faster-YOLOv7 recognition results have a higher confidence level.
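As an illustration of the Alpha-IoU loss named above, the following minimal sketch uses the basic power form, loss = 1 - IoU^alpha, for axis-aligned boxes given as (x1, y1, x2, y2). The value alpha = 3 and the helper names are assumptions for illustration; the abstract does not state which Alpha-IoU variant or exponent the authors use.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def alpha_iou_loss(pred, target, alpha=3.0):
    """Basic Alpha-IoU loss; alpha = 1 reduces to the ordinary IoU loss."""
    return 1.0 - iou(pred, target) ** alpha


pred_box = (0.0, 0.0, 2.0, 2.0)
true_box = (1.0, 0.0, 3.0, 2.0)
print(alpha_iou_loss(pred_box, true_box, alpha=1.0))
print(alpha_iou_loss(pred_box, true_box, alpha=3.0))
```

Raising the IoU to a power greater than one increases the relative weight of high-IoU boxes in the loss, which is the usual motivation given for this family of losses.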
Foreign object detection method for belt conveyor based on generative adversarial nets
ZHANG Liya
2023, 49(11): 53-59. doi: 10.13272/j.issn.1671-251x.2023080046
Abstract:
Images of underground belt transportation in coal mines are characterized by low illumination, unclear details, and background interference. The existing foreign object detection models for belt conveyors have problems such as low precision, poor flexibility, high computational complexity, and differences in optimization space. In order to solve the above problems, a foreign object detection method for belt conveyors based on generative adversarial nets (GAN) is proposed. The method preprocesses the video files of the belt transportation process and classifies the frames into normal and abnormal images. The method creates an experimental dataset to train the improved GANomaly model, and then uses the trained model to detect foreign objects on the belt conveyor. During the training phase, images of the belt conveyor without foreign objects are used as input. In the testing phase, images of the belt conveyor containing foreign objects are used as input. The reconstructed image is subtracted from the original input image to obtain the specific position of the foreign object. The lightweight improvement of the GANomaly model adds a depthwise separable convolution residual module to the basic GANomaly network and uses depthwise separable convolution to replace the convolution operations in the original backbone network. This greatly reduces the computational complexity of the model and the redundant calculation of parameters, which can significantly improve the speed of foreign object detection. By merging multiple batch normalization (BN) layers, the convergence speed of the model is accelerated, the generalization capability of the model is improved, and gradient vanishing is effectively avoided. The experimental results show that the improved GANomaly model runs 6.27% faster than the traditional GANomaly model. 
The evaluation indicators F1 score, AUC, Recall, and mean average precision (mAP) have increased by 19.05%, 22.22%, 15.00%, and 17.14%, respectively.
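The localization step described above, subtracting the reconstructed image from the input image, can be sketched as follows. This is a simplified illustration on small grayscale grids with values in [0, 1]; the threshold and the function name are assumptions, and the reconstruction is supplied by hand rather than produced by a trained GANomaly model.

```python
def locate_anomaly(original, reconstructed, threshold=0.3):
    """Return (top, left, bottom, right) of pixels whose residual exceeds the threshold."""
    rows, cols = [], []
    for r, (orig_row, rec_row) in enumerate(zip(original, reconstructed)):
        for c, (o, p) in enumerate(zip(orig_row, rec_row)):
            # A model trained only on normal images reconstructs foreign objects poorly,
            # so large residuals mark candidate foreign-object pixels.
            if abs(o - p) > threshold:
                rows.append(r)
                cols.append(c)
    if not rows:
        return None
    return (min(rows), min(cols), max(rows), max(cols))


belt_image = [
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.9, 0.1],
    [0.1, 0.1, 0.8, 0.1],
    [0.1, 0.1, 0.1, 0.1],
]
reconstruction = [[0.1] * 4 for _ in range(4)]  # stand-in for the generator output
print(locate_anomaly(belt_image, reconstruction))
```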
Method for recognizing coal flow status of scraper conveyor in working face
WU Jiangwei, NAN Bingfei
2023, 49(11): 60-66. doi: 10.13272/j.issn.1671-251x.2023080101
Abstract:
In the scraper conveyor scenes of underground coal mines, the varied poses of the conveyors, irregular coal shapes, limited equipment installation positions, heavy dust, and occlusion by foreign objects mean that existing coal flow status recognition methods designed for belt conveyor scenarios cannot be applied in engineering practice. In order to solve the above problems, a method for recognizing the coal flow status of a scraper conveyor in a working face based on temporal visual features is proposed. This method first uses the DeepLabV3+ semantic segmentation model to obtain rough coal flow regions in the coal flow video images of the working face. It then uses linear fitting to locate and segment fine coal flow regions, achieving coal flow image extraction. Next, it arranges the coal flow images in video order to form a coal flow image sequence. Finally, a convolutional 3D (C3D) action recognition model is used to model the features of the coal flow image sequences and achieve automatic recognition of coal flow status. The experimental results show that this method can accurately obtain coal flow images and automatically recognize coal flow status in real time, with an average recognition accuracy of 92.73%. For engineering deployment, TensorRT is used to accelerate model processing. For coal flow video images with a resolution of 1 280×720, the overall processing speed is 42.7 frames/s, which meets the actual demand for intelligent monitoring of coal flow status at the working face.
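The linear fitting step can be pictured with a small sketch: take the left edge of a rough segmentation mask row by row and fit a straight line x = a*y + b through those edge points by least squares. How the authors actually select points and use the fitted lines is not stated in the abstract, so the helper names and the toy mask below are assumptions.

```python
def left_boundary(mask):
    """Collect (row, column) of the first foreground pixel in each row of a binary mask."""
    points = []
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                points.append((y, x))
                break
    return points


def fit_line(points):
    """Least-squares fit of x = a*y + b to (y, x) points."""
    n = len(points)
    mean_y = sum(y for y, _ in points) / n
    mean_x = sum(x for _, x in points) / n
    var_y = sum((y - mean_y) ** 2 for y, _ in points)
    cov = sum((y - mean_y) * (x - mean_x) for y, x in points)
    a = cov / var_y
    return a, mean_x - a * mean_y


rough_mask = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 0, 0],
]
slope, intercept = fit_line(left_boundary(rough_mask))
print(slope, intercept)
```

A fitted edge like this smooths out ragged mask borders, so the cropped coal flow region has straight, stable boundaries from frame to frame.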
Bimodal environment perception technology for underground coal mine based on radar and visual fusion
YANG Zhifang
2023, 49(11): 67-75. doi: 10.13272/j.issn.1671-251x.2023080073
Abstract:
Environmental perception is a key technology for applications such as coal mine inspection robots and visual measurement systems. Single-modal environmental perception technology has poor perception capability in the complex environments of underground coal mines. A bimodal spatial fusion method for radar and vision is proposed. The method fuses the information collected by LiDAR and camera through coordinate conversion, thereby improving environmental perception capability. In order to better extract object feature information, a technical route for a bimodal fusion environment perception network architecture is proposed. The environmental information collected by the camera and radar is fused by the radar and visual bimodal spatial fusion method. The multimodal feature fusion network module extracts object features from the fused information. The multitask processing network module uses different task heads to process the object feature information, completing environmental perception tasks such as object detection, image segmentation, and object classification. In the experiment, the YOLOv5s object detection algorithm is used to build the bimodal feature extraction network module. The results show that the success rate of personnel detection in underground roadway environments using the bimodal environment perception technology based on radar and visual fusion is improved by 15% and 10% compared to visual and radar perception, respectively. The mean average precision of segmentation for various types of objects such as lane lines and signs is improved by more than 10% compared to visual perception. The technology effectively improves the perception capability for underground coal mine environments, providing technical support for application scenarios such as coal mine road environment perception, visual measurement systems, unmanned mining vehicle navigation systems, and mine search and rescue robots.
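The coordinate conversion that brings LiDAR points into the camera image can be sketched with the standard pinhole model: a rigid transform (rotation R, translation t) moves a point from the LiDAR frame to the camera frame, and the intrinsics (fx, fy, cx, cy) project it to pixel coordinates. The calibration values below are made up for illustration, and the abstract does not give the authors' exact formulation.

```python
def project_lidar_point(point, rotation, translation, fx, fy, cx, cy):
    """Project a 3D LiDAR point to pixel coordinates (u, v), or None if behind the camera."""
    # Rigid transform into the camera frame: p_cam = R * p_lidar + t
    xc, yc, zc = (
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
    if zc <= 0:
        return None
    # Pinhole projection with focal lengths and principal point.
    return (fx * xc / zc + cx, fy * yc / zc + cy)


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # hypothetical extrinsics
pixel = project_lidar_point(
    (1.0, 0.5, 5.0), identity, (0.0, 0.0, 0.0), fx=500.0, fy=500.0, cx=320.0, cy=240.0
)
print(pixel)
```

Once every LiDAR point has a pixel location, its range value can be attached to the image as an extra channel, which is one common way to build a fused input for a detection network.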
Research on super-resolution reconstruction of mine images
WANG Yuanbin, LIU Jia, GUO Yaru, WU Bingchao
2023, 49(11): 76-83, 120. doi: 10.13272/j.issn.1671-251x.2023080081
Abstract:
Due to the impact of high dust and low illumination in underground environments, mine images have problems such as low resolution and blurry details. When existing image super-resolution reconstruction algorithms are applied to mine images, it is difficult to obtain image information at different scales. The networks have too many parameters, which affects the reconstruction speed. The reconstructed images are prone to problems such as detail loss, blurry edge contours, and artifacts. A mine image super-resolution reconstruction algorithm based on a multi-scale dense channel attention super-resolution generative adversarial network (SRGAN) is proposed. A multi-scale dense channel attention residual block is designed to replace the original residual block of SRGAN. Two parallel densely connected blocks with different convolutional kernel sizes are used to fully obtain image features. Efficient channel attention modules are integrated to enhance attention to high-frequency information. Depthwise separable convolution is used to lighten the network and suppress the growth of network parameters. A texture loss is used to constrain network training and avoid artifacts as the network deepens. Experiments are conducted with the proposed mine image super-resolution reconstruction algorithm and the classic super-resolution reconstruction algorithms BICUBIC, SRCNN, SRRESNET, and SRGAN on both underground and public datasets. The results show that the proposed algorithm outperforms the compared algorithms in both subjective and objective evaluations. Compared to SRGAN, the proposed algorithm reduces network parameters by 2.54%. Compared to the average index values of the classic algorithms, the peak signal-to-noise ratio and structural similarity of the proposed algorithm increase by 0.764 dB and 0.05358, respectively. It can better focus on the texture, contour, and other details of the image, and the reconstructed image is more in line with human vision.
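Why depthwise separable convolution keeps the parameter count down can be seen from a short calculation. A standard k×k convolution needs k·k·C_in·C_out weights, while the separable version needs k·k·C_in (depthwise) plus C_in·C_out (pointwise). The layer sizes below are illustrative and are not taken from the paper, and biases are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    return k * k * c_in + c_in * c_out


standard = standard_conv_params(64, 64, 3)
separable = depthwise_separable_params(64, 64, 3)
print(standard, separable, round(separable / standard, 3))
```

The per-layer saving is large, but the overall change for a whole network depends on how many layers are replaced and on what other modules are added, which is why a network-level figure such as the one reported in the abstract can be much smaller.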
Super-resolution reconstruction of rock CT images based on Real-ESRGAN
LI Gang, ZHANG Yabing, YANG Qinghe, ZOU Junpeng, CAI Tian, LIU Hang, ZHAO Yiming
2023, 49(11): 84-91. doi: 10.13272/j.issn.1671-251x.2023080093
Abstract:
Due to factors such as image acquisition equipment and geological environment, rock CT images have low resolution and unclear details. However, existing image super-resolution reconstruction methods are prone to losing details when characterizing high-density mineral particles and pores and cracks inside. To solve the above problems, an improved enhanced super-resolution generative adversarial network (Real-ESRGAN) is used for super-resolution reconstruction of rock CT images. The sandstone of the 15th coal seam floor in Zhaozhuang Coal Mine, Shanxi Jincheng Anthracite Mining Group Co., Ltd. is selected as the research object to study the reconstruction performance of Real-ESRGAN under different image magnifications. It is compared with algorithms such as super-resolution convolutional neural network (SRCNN), super-resolution generative adversarial network (SRGAN), enhanced super-resolution generative adversarial network (ESRGAN), and enhanced deep super-resolution network (EDSR). The experimental results show the following points. ① The high-resolution images reconstructed using Real-ESRGAN have clearer visual effects than the original CT images. The contours of cracks and high-density mineral particles in the reconstructed images are more prominent, greatly improving the visibility of the images. ② In terms of objective evaluation, the Real-ESRGAN algorithm achieves a peak signal-to-noise ratio (PSNR) of 36.880 dB and a structural similarity (SSIM) of 0.933 in the image after 2x super-resolution reconstruction. But as the magnification increases, the pores on the 6x super-resolution reconstructed image become blurry, with PSNR decreasing to 32.781 dB and SSIM reaching 0.896. ③ The porosity and throat length distribution ratio of the Real-ESRGAN reconstructed super-resolution image are very close to the original CT image, preserving important microstructural information of the rock.
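The PSNR values quoted above follow the standard definition, PSNR = 10·log10(MAX² / MSE), where MSE is the mean squared error between the reference and reconstructed images and MAX is the largest possible pixel value. A minimal sketch, assuming 8-bit grayscale images stored as nested lists:

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images."""
    squared_errors = [
        (a - b) ** 2
        for ref_row, rec_row in zip(reference, reconstructed)
        for a, b in zip(ref_row, rec_row)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


reference_patch = [[100, 100], [100, 100]]
reconstructed_patch = [[101, 99], [100, 100]]
print(psnr(reference_patch, reconstructed_patch))
```

Because the scale is logarithmic, a drop of about 4 dB, such as the one reported between 2x and 6x magnification, corresponds to the mean squared error growing by a factor of roughly 2.5.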
Image enhancement algorithm for non-uniform illumination in underground mines
MIAO Zuohua, ZHAO Chengcheng, ZHU Liangjian, LIU Daiwen, CHEN Aoguang
2023, 49(11): 92-99. doi: 10.13272/j.issn.1671-251x.2023060032
Abstract:
Due to the non-uniform distribution of lighting systems and the presence of a large amount of dust and mist in the environment during the underground video collection process, there are problems with local light overexposure, insufficient brightness, low contrast, and weak edge information in the monitoring image. In order to solve the above problems, an image enhancement algorithm for non-uniform illumination in underground mines is proposed. This algorithm is based on the improvement of Retinex-Net network structure, which includes three parts: non-uniform illumination suppression module (NLSM), illumination decomposition module (LDM), and image enhancement module (IEM). Among them, NLSM suppresses local non-uniform illumination of artificial light sources in the image. LDM decomposes the image into light and reflection layers. IEM enhances the illumination layer of the image, undergoes gamma correction, and ultimately obtains the enhanced image. Resnet is adopted as the infrastructure of the network in both NLSM and LDM. The channel attention module and spatial attention module in the convolutional attention mechanism are sequentially introduced to enhance the attention to image lighting features and the efficiency of feature selection. The experimental results show the following points. ① MBLLEN, RUAS, zeroDCE, zeroDCE++, Retinex−Net, KinD++, and non-uniform illumination image enhancement algorithms are selected to enhance and qualitatively analyze images in various scenarios (underground transportation environment, single light source roadway, multi light source roadway, ore scenario). The analysis results indicate that non-uniform illumination image enhancement algorithms can avoid excessive enhancement of artificial light source areas. There is no halo or blurring phenomenon in the light source area, and colors are not prone to color deviation. The contrast is moderate, and the visual effect of the image is more realistic. 
② The information entropy (IE), average gradient (AG), standard deviation (SD), naturalness image quality evaluator (NIQE), structural similarity (SSIM), and peak signal-to-noise ratio (PSNR) are selected as evaluation indicators to quantitatively compare the quality of the enhanced images. The non-uniform illumination image enhancement algorithm also ranks among the leading algorithms in the various scenarios. ③ The ablation experimental results show that the non-uniform illumination image enhancement algorithm achieves optimal results on three evaluation indicators: NIQE, SSIM, and PSNR.
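The decomposition and enhancement steps rest on the Retinex model, in which an image is the element-wise product of a reflectance layer and an illumination layer. The sketch below replaces the learned enhancement module with a plain gamma correction of the illumination, which is a deliberate simplification: the gamma value and function name are assumptions, and pixel values are taken to lie in [0, 1].

```python
def enhance_pixel(reflectance, illumination, gamma=2.2):
    """Recombine reflectance with gamma-corrected illumination (values in [0, 1])."""
    # Gamma correction with exponent 1/gamma brightens dark illumination
    # far more than illumination that is already bright.
    adjusted = illumination ** (1.0 / gamma)
    return min(1.0, reflectance * adjusted)


dark_region = enhance_pixel(reflectance=0.8, illumination=0.1)
lit_region = enhance_pixel(reflectance=0.8, illumination=1.0)
print(dark_region, lit_region)
```

Only the illumination is altered, so the reflectance layer, which carries surface color and texture, is left unchanged.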
Coal mine underground image enhancement method based on dust removal estimation and multiple exposure fusion
HAO Bonan
2023, 49(11): 100-106. doi: 10.13272/j.issn.1671-251x.2023080105
Abstract:
Factors such as dust and dim light in coal mines lead to low quality of collected images. The existing image enhancement methods have problems such as loss of image details, unclear local features, inability to eliminate noise, and unsatisfactory dust removal effects. In order to solve the above problems, a coal mine underground image enhancement method based on dust removal estimation and multiple exposure fusion is proposed. This method uses a simplified dust image model and the dark channel prior theory, and introduces an adaptive attenuation coefficient to estimate the image transmittance. Based on the transmittance distribution, the original image of the object is restored using the simplified dust image model to remove dust from the coal mine underground image. The method uses a multiple exposure fusion algorithm to generate a set of images with different exposure ratios for underexposed original images, and introduces a weight matrix to fuse these images with the original image, effectively improving the quality of dim light images. The experimental results show that compared to the histogram equalization method, the multi-scale Retinex with color restoration (MSRCR) method, and the improved Retinex method, this method has better results in dust removal and dim light enhancement, with higher color restoration and suppressed white edges and overexposure. The average contrast of the enhanced images increases by 62.78%, 29.82%, and 9.8%, respectively, and the average image entropy increases by 34.13%, 14.12%, and 8.25%, respectively. The average lightness order error (LOE) is reduced by 40.9%, 20.39%, and 8.5%, respectively. This method also has the shortest computation time.
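The transmittance estimate and image restoration can be sketched with the widely used dark channel prior formulation, assumed here as a stand-in for the authors' simplified dust model: I = J·t + A·(1 - t), t = 1 - ω·min_c(I_c / A_c), and J = (I - A) / max(t, t_min) + A. The sketch works on a single RGB pixel instead of a local patch and uses a fixed coefficient ω, whereas the paper chooses its attenuation coefficient adaptively. All names and values are illustrative.

```python
def estimate_transmission(pixel, airlight, omega=0.95):
    """Per-pixel transmittance from the dark channel (minimum over color channels)."""
    dark_channel = min(p / a for p, a in zip(pixel, airlight))
    return 1.0 - omega * dark_channel


def restore_pixel(pixel, airlight, omega=0.95, t_min=0.1):
    """Invert the scattering model I = J*t + A*(1 - t) for one RGB pixel."""
    # The lower bound t_min avoids amplifying noise where almost no light is transmitted.
    t = max(estimate_transmission(pixel, airlight, omega), t_min)
    return tuple((p - a) / t + a for p, a in zip(pixel, airlight))


airlight = (1.0, 1.0, 1.0)
dusty = (0.5, 0.7, 0.8)  # synthetic pixel: scene (0.0, 0.4, 0.6) seen through t = 0.5
print(restore_pixel(dusty, airlight, omega=1.0))
```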
Real time segmentation method for underground track area based on improved STDC
MA Tian, LI Fanhui, YANG Jiayi, ZHANG Jiehui, DING Xuhan
2023, 49(11): 107-114. doi: 10.13272/j.issn.1671-251x.2023080076
Abstract:
Currently, most underground rail transportation scenarios in China are relatively open. Operators, scattered materials, or coal slag may intrude onto the track, which poses a threat to locomotive operation. The underground track area of coal mines often appears as a linear or arc-shaped irregular area, and the track gradually converges. It is difficult to accurately obtain the track range by using object recognition boxes or detected track lines to delimit the track area. Track area segmentation can achieve pixel-level accurate track area detection. Aiming at the problems of poor edge segmentation and low real-time performance in current underground track area segmentation methods, a real-time track area segmentation method based on an improved short-term dense concatenate (STDC) network is proposed. STDC is adopted as the backbone architecture to reduce the number of network parameters and the computational complexity. A feature attention module (FAM) based on the channel attention mechanism is designed to capture the dependency relationships between channels and effectively refine and combine features. The feature fusion module (FFM) is used to fuse high-level semantic features with shallow features. Channel and spatial attention are used to enrich the fused feature expression, effectively obtaining features, reducing feature information loss, and improving model performance. Binary cross entropy loss, dice loss, and image quality loss are used to optimize the extraction of detailed information, and redundant structures are eliminated to improve segmentation efficiency. Verification on a self-built dataset shows the following points. The mean intersection over union (MIoU) of the improved STDC based real-time segmentation method for track area is 95.88%, which is 3% higher than STDC. The number of parameters is 6.74 MiB, which is 18.3% lower than STDC. 
As the number of iterations increases, the optimized loss function value continues to decrease, and the decrease is more significant than that of the original model. The MIoU of the improved STDC based real-time segmentation method for track area reaches 95.88%, the speed is 37.8 frames/s, the number of parameters is 6.74 MiB, and the accuracy rate is 99.46%. This method can fully recognize the track area, accurately segment the track, and provide complete and accurate edge contours.
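Two of the three loss terms named above, binary cross entropy and dice loss, have standard definitions that can be sketched on a flattened mask. The image quality loss is omitted because the abstract does not define it, and the smoothing constant and the equal weighting in the last line are assumptions.

```python
import math


def bce_loss(probs, labels, eps=1e-7):
    """Mean binary cross entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probs)


def dice_loss(probs, labels, smooth=1.0):
    """1 - Dice coefficient; measures overlap and is less sensitive to class imbalance."""
    intersection = sum(p * y for p, y in zip(probs, labels))
    return 1.0 - (2.0 * intersection + smooth) / (sum(probs) + sum(labels) + smooth)


track_mask = [1, 1, 0, 0]
prediction = [0.9, 0.8, 0.2, 0.1]
combined = bce_loss(prediction, track_mask) + dice_loss(prediction, track_mask)
print(combined)
```

Cross entropy scores each pixel independently, while the dice term scores the overlap of the whole region, so the two are often combined when thin structures such as rail edges matter.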
Coal mine image instance segmentation method based on improved SOLOv2
JI Liang
2023, 49(11): 115-120. doi: 10.13272/j.issn.1671-251x.2023030017
Abstract:
The existing image segmentation methods have good results when used for coal mine underground images with good clarity. But when the methods are applied to coal mine underground images with complex environments, the obtained images are mostly blurry and the contour of the target object is not clear. The result affects the segmentation precision of the target object. In order to solve the above problems, a coal mine image instance segmentation method based on improved SOLOv2 is proposed. The method replaces the ResNet-50 network of the SOLOv2 model with the ResNeXt-18 network to simplify the network layers and improve the inference speed of the model. The method introduces the coordinate attention (CA) module to enhance the model's feature extraction capability, retain precise positional information, and improve the model's image segmentation precision. The method replaces the ReLU activation function with the ACON-C activation function. The features between neurons can be fully combined, enhancing the model's feature expression capability, and further improving the image segmentation precision of the model. The improved SOLOv2 model is deployed on an embedded platform for coal mine image segmentation experiments. Compared to the SOLOv2 model, the Mask AP (mask average precision) of the improved SOLOv2 model increases by 1.1%, the weight file of the model decreases by 83.2 MiB. The inference speed increases by 5.30 frames/s, reaching 26.10 frames/s. Both the precision and inference speed of coal mine image segmentation are improved to a certain extent.
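The ACON-C activation that replaces ReLU can be written, in its commonly published form, as f(x) = (p1 - p2)·x·σ(β·(p1 - p2)·x) + p2·x, where p1, p2, and β are learnable. The sketch below evaluates it for scalar inputs; the default parameter values are illustrative assumptions rather than trained values from the paper.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def acon_c(x, p1=1.0, p2=0.0, beta=1.0):
    """ACON-C activation: a smooth, learnable switch between the branches p1*x and p2*x."""
    d = (p1 - p2) * x
    return d * sigmoid(beta * d) + p2 * x


print(acon_c(2.0))             # p1=1, p2=0, beta=1 behaves like Swish
print(acon_c(2.0, beta=50.0))  # a large beta approaches max(p1*x, p2*x), i.e. ReLU here
print(acon_c(-2.0, beta=50.0))
```

With β near zero the function becomes almost linear, so the network can learn per layer how strongly nonlinear the activation should be.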
Multi object detection of underground unmanned electric locomotives in coal mines based on SD-YOLOv5s-4L
ZHAO Wei, WANG Shuang, ZHAO Dongyang
2023, 49(11): 121-128. doi: 10.13272/j.issn.1671-251x.2023070100
Abstract:
Due to complex environmental factors such as uneven illumination and high noise, unmanned electric locomotives in coal mines have low accuracy in multi object detection and difficulty in recognizing small objects. In order to solve the above problems, a multi object detection model for underground unmanned electric locomotives in coal mines based on SD-YOLOv5s-4L is proposed. On the basis of YOLOv5s, the following improvements are made to construct the SD-YOLOv5s-4L network model. The model introduces the SIoU loss function to solve the problem of mismatch between the direction of the real box and the predicted box, so that the model can better learn the position information of the object. The model introduces decoupled heads at the head of YOLOv5s to enhance the feature fusion and positioning accuracy of the network model. It enables the model to quickly capture multi-scale features of the object. The model introduces a small object detection layer to increase the original three scale detection layer to four scale. It enhances the model's feature extraction capability and detection precision for small objects. The experiment is conducted on a multi object detection dataset of the mine electric locomotives. The results show the following points. The mean average precision (mAP) of the SD-YOLOv5s-4L network model for various types of objects is 97.9%, and the average precision (AP) for small objects is 98.9%. Compared with the YOLOv5s network model, it improves by 5.2% and 9.8%, respectively. Compared with other network models such as YOLOv7 and YOLOv8, the SD-YOLOv5s-4L network model has the best comprehensive detection performance and can provide technical support for achieving unmanned driving of the mine electric locomotives.
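The effect of moving from three to four detection scales can be illustrated with simple arithmetic. The sketch assumes a 640×640 input and the strides typical of YOLOv5-style heads (8, 16, 32, plus an added stride-4 layer for small objects); the abstract does not state the actual input size or strides, so these numbers are assumptions.

```python
def grid_sizes(input_size, strides):
    """Map each detection stride to the side length of its prediction grid."""
    return {stride: input_size // stride for stride in strides}


three_scale = grid_sizes(640, (8, 16, 32))
four_scale = grid_sizes(640, (4, 8, 16, 32))
print(three_scale)
print(four_scale)
```

Under these assumptions the extra layer predicts on a 160×160 grid, so each cell covers a 4×4 pixel area instead of 8×8, which is why such a layer tends to help with small objects at the cost of extra computation.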
Multi object personnel detection and dynamic tracking method based on improved KCF
LIU Yi, PANG Dawei, TIAN Yu
2023, 49(11): 129-137. doi: 10.13272/j.issn.1671-251x.2023080015
Abstract:
Factors such as insufficient illumination in coal mine roadways, drastic changes in object scale, easy occlusion of objects, and interference from mining lights lead to low success rate and accuracy in underground object detection and tracking. In order to solve the above problems, a multi-object personnel detection and dynamic tracking method based on an improved kernel correlation filter (KCF) algorithm is proposed. The method can avoid detection failure due to uneven lighting in complex underground environments. The SSD detection algorithm is introduced into the improved KCF algorithm to enhance the capability to detect multiple personnel. ① The method reads the video sequence to be tracked and uses the SSD algorithm trained on the underground dataset to detect objects in the image. If no object is found, it continues with the next frame. ② The method places the detected objects into the tracker, preprocesses the image, scores all detection boxes against the set threshold, and arranges them in descending order of score. The high-score detection results are output directly, while the low-score detection results are used to filter out bad information to improve detection speed. ③ After KCF has tracked and predicted the object for M frames, the method clears the tracker and performs object detection again. By combining detection and tracking algorithms, continuous tracking of the object is ensured. The experimental results show the following points. ① The final loss value of this method is stable around 1.675, and the detection results are relatively stable. ② The SSD recognition precision after training is 52.7% higher than before training. 
③ The detection success rate and tracking accuracy of this method for mine personnel are 87.9% and 88.9%, respectively, which are higher than the detection success rate and tracking accuracy of the other four algorithms (KCF, CSRT, TLD, MIL). ④ This method has a high success rate when the overlap threshold is low, and until the overlap threshold is greater than 0.8, the success rate significantly decreases. This is because the environment in the mine is diverse, and it is difficult to fully match the labeled boxes. The practical application results show that this method has high applicability in complex environments such as insufficient lighting in underground coal mine roadways, drastic changes in object scale, easy obstruction, and interference from mining lights.
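The detect-then-track loop described in this abstract (detect, track for M frames, clear the tracker, detect again) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `detect_then_track`, the callables `detect` and `make_tracker`, and the stub detector and tracker are all hypothetical stand-ins for an SSD detector and KCF trackers, and the score-based filtering of step ② is omitted.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height)


def detect_then_track(
    frames: Sequence,
    detect: Callable[[object], List[Box]],
    make_tracker: Callable[[object, Box], Callable[[object], Box]],
    redetect_every: int,
) -> List[Tuple[str, List[Box]]]:
    """Alternate detection and tracking over a frame sequence.

    Detection runs when no tracker is active (first frame, or nothing was
    found) and again after `redetect_every` tracked frames, at which point
    the old trackers are discarded.
    """
    results = []
    trackers: List[Callable[[object], Box]] = []
    tracked_frames = 0
    for frame in frames:
        if not trackers or tracked_frames >= redetect_every:
            boxes = list(detect(frame))
            # Clearing the tracker list and re-initialising from detections.
            trackers = [make_tracker(frame, box) for box in boxes]
            tracked_frames = 0
            results.append(("detect", boxes))
        else:
            boxes = [update(frame) for update in trackers]
            tracked_frames += 1
            results.append(("track", boxes))
    return results


# Hypothetical stubs: frames are integers, and the "object" drifts with them.
def stub_detect(frame) -> List[Box]:
    return [(float(frame), 0.0, 10.0, 20.0)]


def stub_tracker(frame, box: Box) -> Callable[[object], Box]:
    def update(next_frame) -> Box:
        return (float(next_frame), box[1], box[2], box[3])
    return update


modes = [mode for mode, _ in
         detect_then_track(range(7), stub_detect, stub_tracker, redetect_every=2)]
print(modes)
```

With M = 2 the loop detects on one frame, tracks on the next two, and then detects again. In a real pipeline the two callables would wrap the trained detector and the tracker objects.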
Recognition of unsafe behaviors of underground personnel based on multi modal feature fusion
WANG Yu, YU Chunhua, CHEN Xiaoqing, SONG Jiawei
2023, 49(11): 138-144. doi: 10.13272/j.issn.1671-251x.2023070055
Abstract:
Using artificial intelligence technology for real-time recognition of underground personnel's behavior is of great significance for ensuring safe production in mines. Behavior recognition methods based on the RGB modality are susceptible to background noise in video images, while methods based on the skeleton modality lack visual feature information about humans and objects. To solve these problems, a method for recognizing unsafe behaviors of underground personnel based on multimodal feature fusion is proposed by combining the two approaches. The SlowOnly network is used to extract RGB modality features. The YOLOX and Lite-HRNet networks are used to obtain skeleton modality data, and the PoseC3D network is used to extract skeleton modality features. Early and late fusion of the RGB and skeleton modality features is performed to obtain the final recognition results for unsafe behaviors of underground personnel. The experimental results on the NTU60 RGB+D public dataset under the X-Sub standard show the following points. Among behavior recognition models based on the skeleton modality alone, PoseC3D achieves a higher recognition accuracy than GCN (graph convolutional network) methods, reaching 93.1%. The behavior recognition model based on multimodal feature fusion achieves a higher recognition accuracy than the model based on the skeleton modality alone, reaching 95.4%. The experimental results on a self-made underground unsafe behavior dataset show that the behavior recognition model based on multimodal feature fusion still has the highest recognition accuracy in complex underground environments, reaching 93.3%. It can accurately recognize similar unsafe behaviors and multiple unsafe behaviors.
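As an illustration of the late-fusion step mentioned in this abstract, the sketch below averages the class probabilities produced by two branches. It is only an assumed form of late fusion: the abstract does not give the fusion rule, so the weighted average, the function names, the equal default weight, and the example logits are all hypothetical.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of class logits."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def late_fusion(
    rgb_logits: Sequence[float],
    skeleton_logits: Sequence[float],
    rgb_weight: float = 0.5,  # assumed weight, not taken from the paper
) -> List[float]:
    """Fuse per-class scores from the RGB and skeleton (bone) branches."""
    p_rgb = softmax(rgb_logits)
    p_skeleton = softmax(skeleton_logits)
    return [
        rgb_weight * a + (1.0 - rgb_weight) * b
        for a, b in zip(p_rgb, p_skeleton)
    ]


# Made-up logits for three behavior classes.
fused = late_fusion([2.0, 1.0, 0.1], [0.5, 2.5, 0.3])
predicted_class = max(range(len(fused)), key=fused.__getitem__)
print(fused, predicted_class)
```

Because each branch is normalised before averaging, the fused scores still sum to one, and a confident branch can outvote an uncertain one.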
Deep learning-based face detection method under low illumination conditions in coal mines
WANG Junli, LI Jiayue, LI Bingtian, WEN Qi, WANG Manli
2023, 49(11): 145-150. doi: 10.13272/j.issn.1671-251x.2023080103
Abstract:
Dim illumination and interference from artificial light sources in coal mines cause low contrast and blurry facial features in the facial images collected by the monitoring system. Traditional face detection algorithms may produce false or missed detections when applied in coal mines. To solve these problems, a deep learning-based face detection method under low illumination conditions in coal mines is proposed. A generative adversarial network (GAN) based on unsupervised learning is used to enhance the contrast of low illumination images in coal mines. A self-adjusting attention-guided U-Net is used as the generator, and dual discriminators are used to guide global and local information. The self-feature retention loss function is used to guide the training process, maintain the texture structure of the face in the image, and strengthen facial features. This avoids phenomena such as overexposure and loss of facial detail, and yields clearer facial images. The RetinaFace face detection framework is used to detect the enhanced facial features. It uses a feature pyramid structure and a single-stage detection mode to detect facial images, improving the capability to detect small-scale faces without increasing computational complexity. The experimental results on the public low illumination face dataset DARK FACE and the self-built coal mine underground face dataset show that this method improves image contrast, clearly restores facial features in the image, and performs well in accuracy, recall, and average accuracy, effectively improving the accuracy of coal mine underground face detection.
Lightweight safety helmet wearing detection fusing coordinate attention and multiscale feature
LI Zhongfei, FENG Shiyong, GUO Jun, ZHANG Yunhe, XU Feixiang
2023, 49(11): 151-159. doi: 10.13272/j.issn.1671-251x.2023080123
Abstract:
Existing algorithms for detecting helmet wearing by coal miners struggle to balance detection accuracy and speed. To solve this problem, a lightweight model (M-YOLO) that integrates coordinate attention and multi-scale features is proposed on the basis of the YOLOv4 model and applied to safety helmet wearing detection. The model replaces YOLOv4's feature extraction network CSPDarknet53 with S-MobileNetV2, a lightweight feature extraction network composed of mixed coordinate attention modules, which strengthens the connection between features while reducing the number of parameters. The model changes the parallel connections in the original spatial pyramid pooling structure to serial connections, which improves computational efficiency. The feature fusion network is improved by introducing shallow features with high resolution and rich detail texture information, which enhances the extraction of object features. Some convolutions in the original Neck structure are replaced with depthwise separable convolutions, further reducing the model's parameter count and computational complexity while maintaining detection precision. The experimental results show that, compared with the YOLOv4 model, the mean average precision of the M-YOLO model is reduced by only 0.84%, while the computational complexity, parameter count, and model size are reduced by 74.5%, 72.8%, and 81.6%, respectively, and the detection speed is improved by 53.4%. Compared with other models, the M-YOLO model achieves a good balance between accuracy and real-time performance, meeting the requirements of embedded loading and deployment on intelligent video surveillance terminals.
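To see why replacing standard convolutions with depthwise separable ones reduces the parameter count, the following sketch compares the two for a single layer. The layer size (3×3 kernel, 256 input channels, 512 output channels) is an illustrative assumption and is not taken from M-YOLO; biases are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, kernel: int) -> int:
    """Weights in a standard convolution: one k x k x c_in filter per output channel."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, kernel: int) -> int:
    """Weights in a depthwise (k x k per input channel) plus pointwise (1 x 1) pair."""
    depthwise = kernel * kernel * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer size chosen only for illustration.
standard = standard_conv_params(256, 512, 3)
separable = depthwise_separable_params(256, 512, 3)
ratio = separable / standard
print(standard, separable, round(ratio, 4))
```

For this layer the separable form needs roughly 11% of the weights. In general the ratio is 1/c_out + 1/k², so the saving grows with the number of output channels and the kernel size.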
A miner queue detection method based on improved YOLOv5s
HAO Mingyue, MIN Bingbing, ZHANG Xinjian, ZHAO Zuopeng, WU Chen, WANG Xin
2023, 49(11): 160-166. doi: 10.13272/j.issn.1671-251x.2023030058
Abstract:
Traditional object detection algorithms require manual feature extraction when recognizing abnormal queuing behavior of miners, resulting in long detection times and low detection precision. Object detection algorithms based on convolutional neural networks have improved detection speed and precision, but their performance is difficult to guarantee in scenes with occlusion, dim lighting, and uneven illumination. To solve these problems, an improved YOLOv5s model (HPI-YOLOv5s) is proposed for miner queue detection. The HPI-YOLOv5s model improves the path aggregation network (PANet) of the YOLOv5s model. By deleting single-input edge nodes and adding bidirectional crossing paths, a bidirectional cross feature pyramid network (BCrFPN) is constructed for multi-scale feature fusion. Because label allocation strategies with manually set thresholds have low robustness, a dynamic label allocation strategy (ATSS-PLUS) is proposed based on adaptive training sample selection (ATSS). It evaluates the quality of candidate samples and dynamically sets a threshold for each real object, resulting in higher detection precision and robustness. The method calculates the intersection area between the face frame and the designated queue area using the half-plane intersection method, and then compares the ratio of the intersection area to the face frame area with a set threshold to determine whether the miners are queuing in an orderly manner. The experimental results show that, compared with the YOLOv5s model, the HPI-YOLOv5s model improves accuracy by 1.9%, reduces the weight size by 32%, reduces the parameter count by 6.9%, and improves detection speed by 7.8%. Moreover, it recognizes the queuing situation of miners more accurately in mine images with occlusion, dim lighting, and uneven illumination.
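The queue check described in this abstract (intersect the face frame with the queue area, then compare the area ratio with a threshold) can be sketched with half-plane clipping. The sketch assumes a convex queue region and an axis-aligned face box; the function names, the 0.5 threshold, and the coordinates are hypothetical and not taken from the paper.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def signed_area(poly: Sequence[Point]) -> float:
    """Shoelace formula; the sign encodes the vertex orientation."""
    n = len(poly)
    return 0.5 * sum(
        poly[i][0] * poly[(i + 1) % n][1] - poly[(i + 1) % n][0] * poly[i][1]
        for i in range(n)
    )


def clip_half_plane(poly: Sequence[Point], a: Point, b: Point) -> List[Point]:
    """Keep the part of `poly` on the left of the directed line a -> b."""
    def side(p: Point) -> float:
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

    kept: List[Point] = []
    n = len(poly)
    for i in range(n):
        p, q = poly[i], poly[(i + 1) % n]
        sp, sq = side(p), side(q)
        if sp >= 0:
            kept.append(p)
        if (sp > 0 > sq) or (sp < 0 < sq):
            t = sp / (sp - sq)  # where segment p-q crosses the boundary line
            kept.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return kept


def overlap_ratio(face_box: Tuple[float, float, float, float],
                  region: Sequence[Point]) -> float:
    """Intersection area of box and convex region, divided by the box area."""
    x1, y1, x2, y2 = face_box
    poly: List[Point] = [(x1, y1), (x2, y1), (x2, y2), (x1, y2)]
    box_area = abs(signed_area(poly))
    # Orient the region so that its interior lies to the left of every edge.
    edges = list(region) if signed_area(region) > 0 else list(reversed(region))
    for i in range(len(edges)):
        poly = clip_half_plane(poly, edges[i], edges[(i + 1) % len(edges)])
        if len(poly) < 3:
            return 0.0
    return abs(signed_area(poly)) / box_area


def in_queue(face_box, region, threshold: float = 0.5) -> bool:
    """Assumed decision rule: the face counts as queued if enough of it is inside."""
    return overlap_ratio(face_box, region) >= threshold


queue_region = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
print(overlap_ratio((5.0, 5.0, 15.0, 15.0), queue_region))
```

Each edge of the queue region defines one half-plane, and clipping the face box against all of them leaves exactly the intersection polygon. This works for any convex region, not only rectangles.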
Remote supervision and management method for coal mine gas extraction drilling site based on AI video analysis
HU Jincheng, ZHANG Libin, JIANG Ze, YAO Chaoxiu, JIANG Zhilong, WANG Zhengyi
2023, 49(11): 167-172. doi: 10.13272/j.issn.1671-251x.2023080031
Abstract:
The traditional video monitoring system for coal mine gas extraction drilling sites only provides monitoring and storage functions during drilling construction and drill pipe withdrawal. Important process parameters and information can only be obtained by monitoring personnel reviewing video recordings, so construction information is prone to errors and it is difficult for drilling site management personnel to monitor on-site video continuously. To solve these problems, a remote supervision and management method for coal mine gas extraction drilling sites based on AI video analysis is proposed. This method includes three algorithms: information board detection, OCR recognition, and drill pipe withdrawal analysis. Information board detection is used to detect the current construction phase. PaddleOCR recognition is used to recognize the drilling process and construction information on the information board. The drill pipe withdrawal analysis is used to count the drill pipes withdrawn during the hole closing phase, thereby achieving full-process analysis and control of drilling operations. After receiving and starting a drilling task, the method uses the information board detection and PaddleOCR recognition services, and automatically saves construction information based on the identified drilling, closing, and sealing processes and construction parameters. When the start of hole closing is identified, the method enables the drill pipe withdrawal analysis service. When the end of hole closing is identified, the method stops the service. The experimental results show that the recognition accuracy of the information board detection algorithm is 96%. The average recognition time of the PaddleOCR recognition algorithm is 17.51 ms, which is 25.25 ms and 4.34 ms lower than that of the EasyOCR and Chinese OCR recognition algorithms, respectively. The accuracy of the PaddleOCR recognition algorithm is 5.75% and 2.29% higher than that of the other two recognition algorithms, respectively. The recall rate of the PaddleOCR recognition algorithm is 9.77% and 2.36% higher than that of the other two recognition algorithms, respectively. The pipe withdrawal analysis algorithm can effectively identify the number of pipes withdrawn on site, with an accuracy rate of approximately 95%.
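The service switching described in this abstract (enable drill pipe withdrawal analysis when hole closing starts, stop it when hole closing ends) amounts to a small state machine. The sketch below is one possible reading: the event names and the function `withdrawal_service_states` are hypothetical, and in the described system the events would come from information board detection and OCR rather than from a hand-written list.

```python
from typing import Iterable, List


def withdrawal_service_states(events: Iterable[str]) -> List[bool]:
    """Return, for each event, whether pipe withdrawal analysis is running afterwards."""
    active = False
    states: List[bool] = []
    for event in events:
        if event == "closing_start":   # start of hole closing identified
            active = True
        elif event == "closing_end":   # end of hole closing identified
            active = False
        # Any other event (e.g. drilling or sealing) leaves the service unchanged.
        states.append(active)
    return states


# Hypothetical event stream for one drilling task.
timeline = ["drilling", "closing_start", "pipe_seen", "pipe_seen", "closing_end", "sealing"]
states = withdrawal_service_states(timeline)
print(states)
```

Keeping the counting service off outside the hole closing phase prevents pipes handled during drilling from being counted as withdrawn.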