Research on the application of an improved Adam optimizer in gas emission prediction

Abstract: Research on neural-network-based gas emission prediction models has so far focused mainly on predictive performance for the gas emission problem itself, with little attention paid to the properties of the optimizer used in model training. Such models are commonly trained with the Adam algorithm, but Adam's lack of convergence guarantees can cause the best hyperparameters of the prediction model to be missed, resulting in poor prediction performance. To address this problem, the Adam optimizer is improved by introducing a moment-estimation parameter that is updated with the iteration count, obtaining stronger convergence while preserving the convergence rate. Taking a working face of the Malan Mine of Xishan Coal and Power Group, Shanxi Coking Coal, as an example, the training efficiency, model convergence, and prediction accuracy of the improved Adam optimizer for gas emission prediction are tested on the same recurrent neural network (RNN) prediction model. The test results show the following. ① With 2 and 3 hidden layers, the improved Adam algorithm shortens the running time by 18.83 s and 13.72 s, respectively, compared with the Adam algorithm. With 2 hidden layers, the Adam algorithm reaches the maximum number of iterations without converging, whereas the improved Adam algorithm converges. ② Across the different numbers of hidden-layer nodes tested, the Adam algorithm never converges within the maximum number of iterations, whereas the improved Adam algorithm converges in every case, with CPU running time shortened by 16.17, 188.83, and 22.15 s, respectively, compared with the Adam algorithm; the improved Adam algorithm also judges the prediction trend more correctly. ③ With the tanh activation function, the improved Adam algorithm shortens the running time by 22.15 s and 41.03 s, respectively, compared with the Adam algorithm; with the ReLU activation function, the running times of the two algorithms differ little. ④ An exhaustive grid search using the improved Adam algorithm yields the best model hyperparameters of 3 hidden layers, 20 hidden-layer nodes, and the tanh activation function, with a mean squared error of 0.0785, a normalized mean squared error of 0.000101, and a running time of 32.59 s. The optimal model given by the improved Adam algorithm correctly judges the trends of the several troughs and peaks that occur within the prediction range, fits the training set appropriately, and shows no obvious overfitting.
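The abstract characterizes the improvement only as a moment-estimation parameter that is updated with the iteration count. The sketch below illustrates one common way such a scheme can be realized, with the decay rate of the second-moment estimate driven toward 1 as training proceeds; the schedule, the function name `adam_variant_step`, and all default values are assumptions made for illustration, not the paper's actual formulation.

```python
import numpy as np

def adam_variant_step(param, grad, m, v, t,
                      lr=1e-3, beta1=0.9, beta2_base=0.999, eps=1e-8):
    """One Adam-style update in which the second-moment decay rate
    depends on the iteration t (assumed schedule, for illustration)."""
    # Iteration-dependent moment-estimation parameter: moves toward 1
    # as t grows, one standard way to strengthen Adam's convergence.
    beta2_t = 1.0 - (1.0 - beta2_base) / t

    # First- and second-moment estimates, as in plain Adam.
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2_t * v + (1.0 - beta2_t) * grad ** 2

    # Bias correction, as in plain Adam.
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2_t ** t)

    # Parameter update.
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Toy usage: minimize f(x) = (x - 3)^2 with the variant update.
x, m, v = 0.0, 0.0, 0.0
for t in range(1, 2001):
    grad = 2.0 * (x - 3.0)
    x, m, v = adam_variant_step(x, grad, m, v, t, lr=0.05)
print(round(x, 3))  # approaches 3.0
```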


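For point ④, the abstract reports an exhaustive grid search over the RNN hyperparameters (number of hidden layers, number of hidden-layer nodes, activation function), selected by mean squared error. The sketch below shows the structure of such a search; the candidate ranges and the helper `train_rnn_and_score` are placeholders assumed for illustration, since the paper's search space and training code are not given in the abstract.

```python
from itertools import product

def train_rnn_and_score(n_layers, n_nodes, activation):
    """Hypothetical helper: train the RNN prediction model with the given
    hyperparameters (using the improved Adam optimizer) and return the
    validation mean squared error. Replace with the actual training routine."""
    raise NotImplementedError

def grid_search():
    # Candidate values are illustrative; the abstract only reports the
    # selected configuration (3 hidden layers, 20 nodes, tanh).
    layers = [1, 2, 3]
    nodes = [10, 20, 30]
    activations = ["tanh", "relu"]

    best_cfg, best_mse = None, float("inf")
    for n_layers, n_nodes, act in product(layers, nodes, activations):
        mse = train_rnn_and_score(n_layers, n_nodes, act)
        if mse < best_mse:
            best_cfg, best_mse = (n_layers, n_nodes, act), mse
    return best_cfg, best_mse
```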