Research on the application of improved Adam training optimizer in gas emission prediction
Abstract: Current research on neural-network-based gas emission prediction models focuses mainly on prediction performance, while the properties of the optimizer used in training have received little attention or improvement. Such models are usually trained with the Adam algorithm, but Adam's non-convergence can cause the best hyperparameters of the prediction model to be missed, which leads to poor prediction performance. To address this problem, the Adam optimizer is improved by introducing a moment estimation parameter that is updated with the iterations, giving stronger convergence while preserving the convergence rate. Taking a working face of Malan Mine in Xishan Coal and Power Group of Shanxi Coking Coal as an example, the training efficiency, model convergence, and prediction accuracy of the improved Adam optimizer in gas emission prediction are tested under the same recurrent neural network (RNN) prediction model. The test results show the following. ① With 2 and 3 hidden layers, the improved Adam algorithm shortens the running time by 18.83 s and 13.72 s, respectively, compared with the Adam algorithm. With 2 hidden layers, the Adam algorithm reaches the maximum number of iterations without converging, whereas the improved Adam algorithm converges. ② For every number of hidden-layer nodes tested, the Adam algorithm fails to converge within the maximum number of iterations, whereas the improved Adam algorithm converges in all cases, and its CPU running time is shorter than that of the Adam algorithm by 16.17 s, 18.83 s, and 22.15 s, respectively. The improved Adam algorithm also predicts trends more correctly. ③ With the tanh function, the improved Adam algorithm shortens the running time by 22.15 s and 41.03 s, respectively, compared with the Adam algorithm. With the ReLU function, the running times of the two algorithms differ little. ④ An exhaustive grid search with the improved Adam algorithm gives the optimal model hyperparameters {3, 20, tanh}, with a mean square error, normalized mean square error, and running time of 0.0785, 0.000101, and 32.59 s, respectively. The optimal model given by the improved Adam algorithm correctly judges the trends of the several valleys and peaks that occur within the prediction range. Its fit on the training set is appropriate, with no obvious overfitting.
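The abstract does not state the exact rule by which the moment estimation parameter changes with the iterations. As a rough illustration only, the sketch below assumes the schedule $\beta_{2,t} = 1 - 1/t$ together with a step size $\alpha/\sqrt{t}$, as in the AdamNC variant discussed by Reddi et al. in "On the convergence of Adam and beyond"; it is not necessarily the authors' method. The function names, the toy objective, and all default values are hypothetical.

```python
import math


def variable_beta2_step(theta, grad, m, v, t, lr=0.5, beta1=0.9, eps=1e-8):
    """One update of an Adam-style rule whose second-moment decay depends on t."""
    beta2_t = 1.0 - 1.0 / t  # assumed schedule; plain Adam keeps beta2 fixed (e.g. 0.999)
    m = beta1 * m + (1.0 - beta1) * grad             # first-moment estimate
    v = beta2_t * v + (1.0 - beta2_t) * grad * grad  # running mean of squared gradients
    theta -= (lr / math.sqrt(t)) * m / (math.sqrt(v) + eps)  # decaying step size
    return theta, m, v


def minimize_quadratic(steps=2000, start=0.0):
    """Minimize the toy objective f(x) = (x - 3)^2 with the rule above."""
    theta, m, v = start, 0.0, 0.0
    for t in range(1, steps + 1):
        grad = 2.0 * (theta - 3.0)  # derivative of (x - 3)^2
        theta, m, v = variable_beta2_step(theta, grad, m, v, t)
    return theta


print(minimize_quadratic())
```

Under this assumed schedule, `v` is the arithmetic mean of all squared gradients seen so far, so the effective step `lr / sqrt(t * v)` cannot grow from one iteration to the next. That monotonicity is the kind of property convergence arguments for such variants rely on.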
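Table 1. LASSO regression coefficients at $\lambda = 10^{-6}$

| Influencing factor | Regression coefficient | Influencing factor | Regression coefficient |
| --- | --- | --- | --- |
| Return air volume | 0.43 | Temperature | 0.034 |
| Average burial depth | 0.43 | Daily advance | 0.031 |
| Initial gas content | -0.42 | Intake air volume | -0.009 |
| Gas content of adjacent coal seam | 0.30 | Coal hardness | 0 |
| Dip angle of the mined coal seam | 0.22 | Elevation of adjacent coal seam | 0 |
| Total pre-drained gas volume | 0.21 | Thickness of adjacent coal seam | 0 |
| Total pressure-relief gas volume | 0.19 | Dip angle of adjacent coal seam | 0 |
| Thickness of the mined coal seam | -0.060 | Coal seam spacing | 0 |
| Working face elevation | -0.060 | Distance from mining point to collapse column | 0 |
| Initial gas pressure | 0.039 | | |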
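LASSO regression sets some coefficients exactly to zero, so the coefficients in Table 1 can be read as a feature screen. The snippet below is a minimal sketch of that reading: it copies the Table 1 values, keeps the factors with non-zero coefficients, and ranks them by magnitude. The English factor names are translations, and treating every non-zero coefficient as "retained" is an assumption; the page does not say which factors were finally fed to the RNN.

```python
# LASSO coefficients copied from Table 1 (lambda = 1e-6); names are translated.
coefficients = {
    "return air volume": 0.43,
    "average burial depth": 0.43,
    "initial gas content": -0.42,
    "gas content of adjacent coal seam": 0.30,
    "dip angle of the mined coal seam": 0.22,
    "total pre-drained gas volume": 0.21,
    "total pressure-relief gas volume": 0.19,
    "thickness of the mined coal seam": -0.060,
    "working face elevation": -0.060,
    "initial gas pressure": 0.039,
    "temperature": 0.034,
    "daily advance": 0.031,
    "intake air volume": -0.009,
    "coal hardness": 0.0,
    "elevation of adjacent coal seam": 0.0,
    "thickness of adjacent coal seam": 0.0,
    "dip angle of adjacent coal seam": 0.0,
    "coal seam spacing": 0.0,
    "distance from mining point to collapse column": 0.0,
}

# Assumption: a factor counts as retained when its coefficient is non-zero.
retained = {name: c for name, c in coefficients.items() if c != 0.0}
dropped = [name for name, c in coefficients.items() if c == 0.0]

# Rank retained factors by absolute coefficient, largest first.
ranked = sorted(retained, key=lambda name: abs(retained[name]), reverse=True)

print(len(retained), "retained;", len(dropped), "dropped")
for name in ranked:
    print(f"{name}: {retained[name]:+.3f}")
```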
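Table 2. Comparison of results under different numbers of hidden layers

| Algorithm | Hidden layers | MSE | NMSE | Trend correctness | CPU running time/s |
| --- | --- | --- | --- | --- | --- |
| Adam | 1 | 0.2247 | 0.000282 | Correct | 29.11 |
| Adam | 2 | 0.1741 | 0.000236 | Incorrect | 52.69 |
| Adam | 3 | 0.2873 | 0.000414 | Basically correct | 51.31 |
| Improved Adam | 1 | 0.2186 | 0.000257 | Correct | 29.72 |
| Improved Adam | 2 | 0.0804 | 0.000094 | Correct | 33.86 |
| Improved Adam | 3 | 0.2852 | 0.000101 | Basically correct | 37.59 |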
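Table 3. Comparison of results under different numbers of hidden-layer nodes

| Algorithm | Hidden-layer nodes | MSE | NMSE | Trend correctness | CPU running time/s |
| --- | --- | --- | --- | --- | --- |
| Adam | 15 | 0.1560 | 0.000201 | Basically correct | 55.51 |
| Adam | 20 | 0.1741 | 0.000236 | Incorrect | 52.69 |
| Adam | 25 | 1.2330 | 0.002250 | Basically correct | 53.53 |
| Improved Adam | 15 | 0.5076 | 0.000339 | Correct | 39.34 |
| Improved Adam | 20 | 0.0804 | 0.000094 | Correct | 33.86 |
| Improved Adam | 25 | 0.0920 | 0.000105 | Basically correct | 31.38 |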
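Table 4. Comparison of results under different activation functions

| Algorithm | Activation function | Layers, nodes | MSE | NMSE | Trend correctness | CPU running time/s |
| --- | --- | --- | --- | --- | --- | --- |
| Adam | tanh | 2, 25 | 1.2330 | 0.00230 | Basically correct | 53.53 |
| Adam | tanh | 3, 15 | 0.3883 | 0.00042 | Correct | 79.19 |
| Adam | ReLU | 2, 25 | 0.1647 | 0.00022 | Incorrect | 54.23 |
| Adam | ReLU | 3, 15 | 0.1319 | 0.00018 | Basically correct | 86.89 |
| Improved Adam | tanh | 2, 25 | 0.0920 | 0.00011 | Correct | 31.38 |
| Improved Adam | tanh | 3, 15 | 0.4308 | 0.00046 | Basically correct | 38.16 |
| Improved Adam | ReLU | 2, 25 | 0.1749 | 0.00020 | Basically correct | 58.18 |
| Improved Adam | ReLU | 3, 15 | 0.1422 | 0.00019 | Incorrect | 92.02 |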
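The running-time reductions quoted in the abstract can be recomputed from the CPU times in Tables 2 to 4. The snippet below is a small arithmetic check, not part of the original study: it subtracts the improved Adam time from the Adam time for each configuration. The dictionary keys are labels chosen here for readability.

```python
# CPU running times in seconds as (Adam, improved Adam), copied from Tables 2-4.
times = {
    "Table 2, 2 hidden layers": (52.69, 33.86),
    "Table 2, 3 hidden layers": (51.31, 37.59),
    "Table 3, 15 nodes": (55.51, 39.34),
    "Table 3, 20 nodes": (52.69, 33.86),
    "Table 3, 25 nodes": (53.53, 31.38),
    "Table 4, tanh (2, 25)": (53.53, 31.38),
    "Table 4, tanh (3, 15)": (79.19, 38.16),
    "Table 4, ReLU (2, 25)": (54.23, 58.18),
    "Table 4, ReLU (3, 15)": (86.89, 92.02),
}

# Positive values mean improved Adam was faster; negative values mean it was slower.
reductions = {label: round(adam - improved, 2)
              for label, (adam, improved) in times.items()}

for label, saved in reductions.items():
    print(f"{label}: {saved:+.2f} s")
```

By these tables, the reduction for 20 hidden-layer nodes is 18.83 s, and with ReLU the improved Adam algorithm is slightly slower (by 3.95 s and 5.13 s) rather than faster, which agrees with the statement that the two running times differ little in that case.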