CatBoost mine pressure appearance prediction based on Bayesian algorithm optimization
-
Abstract: Acquiring mine pressure data with traditional monitoring methods and predicting mine pressure with statistical or machine learning algorithms can no longer meet the requirements of intelligent mining. New methods are needed to improve the accuracy and real-time performance of both mine pressure monitoring and mine pressure prediction. Based on a three-dimensional physical similarity model test, a distributed optical fiber monitoring system is built. Distributed optical fibers are pre-embedded along two directions of the model, strike and height, and weighting data are collected while mining of the working face is simulated. The mean variation degree of the optical fiber Brillouin frequency shift is introduced as the indicator for judging whether weighting occurs. The optical fiber monitoring data are preprocessed by noise removal, normalization, and phase space reconstruction, which converts the one-dimensional raw monitoring data into three-dimensional data. A Bayesian algorithm then iteratively searches for the optimal parameters of the CatBoost algorithm. Once the maximum number of iterations is reached, the optimal parameter combination is loaded into CatBoost, and the mine pressure appearance prediction model is obtained by training. The results show that the Bayesian algorithm needs fewer iterations and yields smaller errors than the traditional grid search method. Compared with random forest (RF), gradient boosting decision tree (GBDT), and extreme gradient boosting (XGBoost), CatBoost has higher prediction accuracy and stronger generalization ability. The Bayesian-optimized CatBoost model accurately predicts the three weighting events in the test set, and the overall predicted trend agrees well with the measured values, with a root-mean-square error of 0.0091, a mean absolute error of 0.0077, and a coefficient of determination of 0.9339.
-
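Table 1. Basic parameters of the 3D physical similarity model

| Parameter | Value |
|---|---|
| Length/mm | 3600 |
| Width/mm | 2000 |
| Height/mm | 2000 |
| Number of excavation steps | 54 |
| Coal seam thickness/mm | 100 |
| Excavation step distance/mm | 50 |
| Geometric similarity ratio | 1:200 |
| Excavation time interval/h | 0.5 |
| Bulk density similarity ratio | 1.56:1 |
| Stress similarity ratio | 380:1 |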
Table 2. Phase space reconstruction of optical fiber Brillouin frequency shift mean variation degree
| No. | Excavation distance/mm | $x_1$ | $x_2$ | $x_3$ | $Y$ |
|---|---|---|---|---|---|
| 1 | 200 | 0 | 0.000026 | 0.001579 | 0.002474 |
| 2 | 250 | 0.000026 | 0.001579 | 0.002474 | 0.002153 |
| 3 | 300 | 0.001579 | 0.002474 | 0.002153 | 0.001354 |
| 4 | 350 | 0.002474 | 0.002153 | 0.001354 | 0.002186 |
| 5 | 400 | 0.002153 | 0.001354 | 0.002186 | 0.019577 |
| 6 | 450 | 0.001354 | 0.002186 | 0.019577 | 0.020404 |
| 7 | 500 | 0.002186 | 0.019577 | 0.020404 | 0.015886 |
| 8 | 550 | 0.019577 | 0.020404 | 0.015886 | 0.014098 |
| 9 | 600 | 0.020404 | 0.015886 | 0.014098 | 0.028925 |
| 10 | 650 | 0.015886 | 0.014098 | 0.028925 | 0.046407 |
| 11 | 700 | 0.014098 | 0.028925 | 0.046407 | 0.027881 |
| 12 | 750 | 0.028925 | 0.046407 | 0.027881 | 0.016708 |
| 13 | 800 | 0.046407 | 0.027881 | 0.016708 | 0.452190 |
| 14 | 850 | 0.027881 | 0.016708 | 0.452190 | 0.491797 |
| 15 | 900 | 0.016708 | 0.452190 | 0.491797 | 0.689201 |
| 16 | 950 | 0.452190 | 0.491797 | 0.689201 | 0.570864 |
| 17 | 1000 | 0.491797 | 0.689201 | 0.570864 | 0.599183 |
| 18 | 1050 | 0.689201 | 0.570864 | 0.599183 | 0.654041 |
| 19 | 1100 | 0.570864 | 0.599183 | 0.654041 | 0.447253 |
| 20 | 1150 | 0.599183 | 0.654041 | 0.447253 | 0.721625 |
| 21 | 1200 | 0.654041 | 0.447253 | 0.721625 | 0.698059 |
| 22 | 1250 | 0.447253 | 0.721625 | 0.698059 | 0.772544 |
| 23 | 1300 | 0.721625 | 0.698059 | 0.772544 | 0.640361 |
| 24 | 1350 | 0.698059 | 0.772544 | 0.640361 | 0.864075 |
| 25 | 1400 | 0.772544 | 0.640361 | 0.864075 | 0.838863 |
| 26 | 1450 | 0.640361 | 0.864075 | 0.838863 | 0.721304 |
| 27 | 1500 | 0.864075 | 0.838863 | 0.721304 | 1.000000 |
| 28 | 1550 | 0.838863 | 0.721304 | 1.000000 | 0.869531 |
| 29 | 1600 | 0.721304 | 1.000000 | 0.869531 | 0.413876 |
| 30 | 1650 | 1.000000 | 0.869531 | 0.413876 | 0.574574 |
| 31 | 1700 | 0.869531 | 0.413876 | 0.574574 | 0.336719 |
| 32 | 1750 | 0.413876 | 0.574574 | 0.336719 | 0.422573 |
| 33 | 1800 | 0.574574 | 0.336719 | 0.422573 | 0.656925 |
| 34 | 1850 | 0.336719 | 0.422573 | 0.656925 | 0.544454 |
| 35 | 1900 | 0.422573 | 0.656925 | 0.544454 | 0.547804 |
| 36 | 1950 | 0.656925 | 0.544454 | 0.547804 | 0.572263 |
| 37 | 2000 | 0.544454 | 0.547804 | 0.572263 | 0.731946 |
| 38 | 2050 | 0.547804 | 0.572263 | 0.731946 | 0.453583 |
| 39 | 2100 | 0.572263 | 0.731946 | 0.453583 | 0.564985 |
| 40 | 2150 | 0.731946 | 0.453583 | 0.564985 | 0.815280 |
| 41 | 2200 | 0.453583 | 0.564985 | 0.815280 | 0.751002 |
| 42 | 2250 | 0.564985 | 0.815280 | 0.751002 | 0.745354 |
| 43 | 2300 | 0.815280 | 0.751002 | 0.745354 | 0.692883 |
| 44 | 2350 | 0.751002 | 0.745354 | 0.692883 | 0.796233 |
| 45 | 2400 | 0.745354 | 0.692883 | 0.796233 | 0.940375 |
| 46 | 2450 | 0.692883 | 0.796233 | 0.940375 | 0.840692 |
| 47 | 2500 | 0.796233 | 0.940375 | 0.840692 | 0.722012 |
| 48 | 2550 | 0.940375 | 0.840692 | 0.722012 | 0.673414 |
| 49 | 2600 | 0.840692 | 0.722012 | 0.673414 | 0.593709 |
| 50 | 2650 | 0.722012 | 0.673414 | 0.593709 | 0.473591 |
| 51 | 2700 | 0.673414 | 0.593709 | 0.473591 | 0.394052 |
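The rows of Table 2 follow a sliding-window pattern: each row takes three consecutive normalized values as inputs and the next value as the target. A minimal sketch of that reconstruction is given here, assuming an embedding dimension of 3 and a delay of 1 (both inferred from the table layout, not stated in the source). The function name `phase_space_reconstruct` is hypothetical.

```python
def phase_space_reconstruct(series, dim=3, delay=1):
    """Embed a 1-D series into (inputs, target) rows with a sliding window.

    Each row holds `dim` samples spaced `delay` apart as inputs (x1..x_dim)
    and the sample one delay after the last input as the target Y.
    """
    span = dim * delay  # offset from the first input to the target
    rows = []
    for start in range(len(series) - span):
        inputs = [series[start + k * delay] for k in range(dim)]
        target = series[start + span]
        rows.append((inputs, target))
    return rows


# First seven normalized values read off Table 2 (enough for rows 1-4).
series = [0.0, 0.000026, 0.001579, 0.002474, 0.002153, 0.001354, 0.002186]
rows = phase_space_reconstruct(series, dim=3, delay=1)
for inputs, target in rows:
    print(inputs, "->", target)
```

With these seven values the sketch reproduces rows 1 to 4 of Table 2. A longer input series would yield the remaining rows in the same way.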
Table 3. Parameters of CatBoost algorithm
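| Parameter | Name | Role | Default | Search range |
|---|---|---|---|---|
| iterations | Maximum number of trees | Improves accuracy | 1000 | [40, 130] |
| learning_rate | Learning rate | Improves accuracy | 0.03 | [0.01, 0.30] |
| depth | Maximum tree depth | Improves accuracy | 6 | [3, 10] |
| l2_leaf_reg | L2 regularization | Regularization, reduces overfitting | 3 | [1, 10] |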
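The search ranges of Table 3 can be written down as a search space that an optimizer draws candidates from. The sketch below only defines the space, draws one candidate uniformly, and checks its bounds. A Bayesian optimizer would replace the uniform draw with proposals guided by a surrogate model, and the source does not say which optimizer implementation was used. The names `SEARCH_SPACE`, `sample_candidate`, and `in_bounds` are hypothetical, and treating `l2_leaf_reg` as a float is an assumption.

```python
import random

# Search ranges copied from Table 3; the int/float split is an assumption.
SEARCH_SPACE = {
    "iterations": ("int", 40, 130),
    "learning_rate": ("float", 0.01, 0.30),
    "depth": ("int", 3, 10),
    "l2_leaf_reg": ("float", 1.0, 10.0),
}


def sample_candidate(rng):
    """Draw one parameter set uniformly from SEARCH_SPACE."""
    candidate = {}
    for name, (kind, low, high) in SEARCH_SPACE.items():
        if kind == "int":
            candidate[name] = rng.randint(low, high)  # both ends inclusive
        else:
            candidate[name] = rng.uniform(low, high)
    return candidate


def in_bounds(candidate):
    """Check that every parameter lies inside its Table 3 range."""
    return all(
        low <= candidate[name] <= high
        for name, (_, low, high) in SEARCH_SPACE.items()
    )


rng = random.Random(0)  # fixed seed so the draw is repeatable
candidate = sample_candidate(rng)
print(candidate, in_bounds(candidate))
```

The dictionary keys match the parameter names in Table 3, so a candidate could be passed as keyword arguments to a model constructor that accepts those names.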
Table 4. Comparison of the parameter optimization results
| Optimization method | Number of iterations | MAE | RMSE |
|---|---|---|---|
| Grid search | 55 | 0.073 | 0.089 |
| Bayesian algorithm | 30 | 0.065 | 0.079 |
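To put the Table 4 comparison in relative terms, the short calculation below computes how much lower each Bayesian figure is than the grid search figure. It uses only the values from Table 4. The helper name `relative_reduction` is hypothetical.

```python
# Values copied from Table 4.
grid = {"iterations": 55, "MAE": 0.073, "RMSE": 0.089}
bayes = {"iterations": 30, "MAE": 0.065, "RMSE": 0.079}


def relative_reduction(baseline, value):
    """Fractional reduction of `value` relative to `baseline`."""
    return (baseline - value) / baseline


reductions = {key: relative_reduction(grid[key], bayes[key]) for key in grid}
for key, frac in reductions.items():
    print(f"{key}: {frac:.1%} lower with the Bayesian search")
```

This gives roughly 45.5% fewer iterations, with MAE about 11.0% lower and RMSE about 11.2% lower.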
Table 5. Comparison of performance indicators of different algorithms
| Algorithm | RMSE | MAE | R² |
|---|---|---|---|
| RF | 0.0179 | 0.0135 | 0.8746 |
| GBDT | 0.0325 | 0.0371 | 0.8599 |
| XGBoost | 0.0115 | 0.0092 | 0.9078 |
| CatBoost | 0.0091 | 0.0077 | 0.9339 |
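For reference, the three indicators in Table 5 can be computed as in the sketch below, which uses the standard definitions of MAE, RMSE, and the coefficient of determination. The sample values are illustrative only and are not data from the experiment. For any set of errors, RMSE is at least as large as MAE, which gives a quick sanity check on reported values.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root-mean-square error."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative numbers only, not measurements from the model test.
y_true = [0.2, 0.4, 0.6, 0.8]
y_pred = [0.25, 0.35, 0.65, 0.70]
print(mae(y_true, y_pred), rmse(y_true, y_pred), r2(y_true, y_pred))
```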