Intelligent decision-making method for coal caving based on fuzzy deep Q-network
-
Abstract: In fully mechanized caving faces, coal dust and dust-suppression water mist obscure the operators' line of sight, so manually controlled coal caving suffers from over-caving and under-caving. To address this problem, the tail beam of the hydraulic support is treated as an agent, the coal caving process is abstracted as a Markov optimal decision problem, and a deep Q-network (DQN) is used to decide the actions of the coal drawing port. Because the DQN algorithm suffers from overestimation, a fuzzy deep Q-network (FDQN) algorithm is proposed and applied to intelligent decision-making for coal caving. A fuzzy control system is built from the fuzzy features of the coal seam state during caving, with the coal quantity and the coal-gangue ratio in the coal seam state as its inputs. The action output by the fuzzy control system replaces the action that the DQN algorithm selects by applying the max operation to the Q-values output by the target network, which improves the agent's online learning rate and increases the reward of the coal caving actions. A coal caving model of a fully mechanized caving face is built, and three-dimensional numerical simulations of the caving process are run with the DQN, double deep Q-network (DDQN), and FDQN algorithms. The results show that the FDQN algorithm converges fastest, 31.6% faster than the DQN algorithm, which raises the agent's online learning rate. Judged on three aspects (the straightness of the coal-gangue boundary, the residual coal above the tail beam, and the amount of gangue in the drawn body), FDQN-based caving performs best. FDQN-based caving also gives the highest recovery rate and the lowest gangue content: compared with the DQN and DDQN algorithms, the recovery rate increases by 2.8% and 0.7%, respectively, and the gangue content decreases by 2.1% and 13.2%, respectively. The FDQN-based intelligent decision-making method adjusts the action of the hydraulic support tail beam according to the occurrence state of the coal seam and effectively alleviates over-caving and under-caving during coal caving.
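The difference between the three algorithms compared in the abstract lies in which next-state action is used when forming the learning target. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the discount factor `gamma`, and the two-action example values are hypothetical, and the FDQN target is written as one plausible reading of the abstract, in which the fuzzy controller's action indexes the target network's Q-values in place of the max operation.

```python
import numpy as np


def dqn_target(reward, q_target_next, gamma=0.9, done=False):
    """Standard DQN target: max over the target network's next-state Q-values."""
    bootstrap = 0.0 if done else gamma * float(np.max(q_target_next))
    return reward + bootstrap


def ddqn_target(reward, q_online_next, q_target_next, gamma=0.9, done=False):
    """Double DQN target: the online network picks the action,
    the target network evaluates it."""
    action = int(np.argmax(q_online_next))
    bootstrap = 0.0 if done else gamma * float(q_target_next[action])
    return reward + bootstrap


def fdqn_target(reward, q_target_next, fuzzy_action, gamma=0.9, done=False):
    """Assumed FDQN target: the action index supplied by the fuzzy control
    system (hypothetical argument) replaces the max operation."""
    bootstrap = 0.0 if done else gamma * float(q_target_next[fuzzy_action])
    return reward + bootstrap


if __name__ == "__main__":
    # Hypothetical next-state Q-values for two tail-beam actions.
    q_target_next = np.array([1.0, 3.0])
    q_online_next = np.array([2.0, 0.5])
    print("DQN :", dqn_target(1.0, q_target_next))
    print("DDQN:", ddqn_target(1.0, q_online_next, q_target_next))
    print("FDQN:", fdqn_target(1.0, q_target_next, fuzzy_action=0))
```

In this toy case the DQN target bootstraps from the largest Q-value, while the DDQN and FDQN targets bootstrap from a value chosen by a separate selector, which is the usual way of limiting overestimation.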
-
Table 1. Fuzzy inference rules

| $m_t$ \ $\omega_t$ | NB | ZO | PB |
|---|---|---|---|
| NB | NB | NB | NB |
| ZO | NB | PB | PB |
| PB | NB | PB | PB |

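The rule base in Table 1 is small enough to encode as a lookup table. The sketch below assumes that the rows are indexed by the fuzzy label of $m_t$ and the columns by the fuzzy label of $\omega_t$ (the table is symmetric, so the opposite assignment gives the same result). The names `RULES`, `infer`, and `compact_rule` are illustrative, and the membership functions and the mapping from output labels to tail-beam actions are not given in this section, so they are left out.

```python
# Fuzzy labels used in Table 1.
LABELS = ("NB", "ZO", "PB")

# Rule base transcribed from Table 1: (label of m_t, label of omega_t) -> output label.
RULES = {
    ("NB", "NB"): "NB", ("NB", "ZO"): "NB", ("NB", "PB"): "NB",
    ("ZO", "NB"): "NB", ("ZO", "ZO"): "PB", ("ZO", "PB"): "PB",
    ("PB", "NB"): "NB", ("PB", "ZO"): "PB", ("PB", "PB"): "PB",
}


def infer(m_label, omega_label):
    """Look up the output label for one pair of input labels."""
    return RULES[(m_label, omega_label)]


def compact_rule(m_label, omega_label):
    """Equivalent closed form: the output is NB whenever either input is NB."""
    return "NB" if "NB" in (m_label, omega_label) else "PB"


if __name__ == "__main__":
    for m in LABELS:
        for w in LABELS:
            # The closed form agrees with every entry of the table.
            assert infer(m, w) == compact_rule(m, w)
            print(m, w, "->", infer(m, w))
```

Read this way, the nine rules reduce to a single condition: the output is NB as soon as either input is NB, and PB otherwise.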
Table 2. Single round group coal caving data based on different algorithms
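| No. | Coal count: DQN | Coal count: DDQN | Coal count: FDQN | Gangue count: DQN | Gangue count: DDQN | Gangue count: FDQN | Recovery rate/%: DQN | Recovery rate/%: DDQN | Recovery rate/%: FDQN | Gangue content/%: DQN | Gangue content/%: DDQN | Gangue content/%: FDQN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 154 | 1 162 | 1 233 | 58 | 59 | 67 | 93.5 | 94.1 | 99.0 | 4.8 | 4.8 | 5.2 |
| 2 | 1 130 | 1 182 | 1 144 | 56 | 74 | 35 | 91.5 | 95.7 | 92.7 | 4.7 | 5.9 | 3.0 |
| 3 | 1 173 | 1 173 | 1 186 | 59 | 65 | 53 | 95.0 | 95.0 | 96.1 | 4.8 | 5.3 | 4.3 |
| 4 | 1 158 | 1 218 | 1 218 | 55 | 87 | 78 | 93.8 | 98.7 | 98.7 | 4.5 | 6.7 | 6.0 |
| 5 | 1 110 | 1 192 | 1 209 | 35 | 76 | 72 | 89.9 | 96.5 | 97.9 | 3.1 | 6.0 | 5.6 |
| 6 | 1 137 | 1 163 | 1 171 | 60 | 51 | 51 | 92.1 | 94.2 | 94.8 | 5.0 | 4.2 | 4.2 |
| 7 | 1 159 | 1 166 | 1 155 | 62 | 62 | 51 | 93.9 | 94.4 | 93.5 | 5.1 | 5.0 | 4.2 |
| 8 | 1 166 | 1 189 | 1 181 | 60 | 67 | 60 | 94.4 | 96.3 | 95.7 | 5.1 | 5.3 | 4.8 |
| 9 | 1 165 | 1 146 | 1 174 | 60 | 56 | 44 | 94.4 | 92.8 | 95.1 | 4.9 | 4.7 | 3.6 |
| 10 | 1 181 | 1 183 | 1 205 | 61 | 60 | 63 | 95.7 | 95.8 | 97.6 | 4.9 | 4.8 | 5.0 |
| Average | 1 153.3 | 1 177.4 | 1 187.5 | 56.6 | 65.7 | 56.7 | 93.4 | 95.4 | 96.1 | 4.7 | 5.3 | 4.6 |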
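The percentage improvements quoted in the abstract can be checked against the averages in Table 2. The sketch below assumes they are relative changes with respect to the baseline algorithm, which is an interpretation rather than something this section states. The dictionary `AVERAGES` holds the rounded averages from the table, and `relative_change` is an illustrative helper. Because the inputs are already rounded, the results may differ from the abstract in the last digit.

```python
# Rounded averages transcribed from the last row of Table 2 (percent).
AVERAGES = {
    "recovery_rate": {"DQN": 93.4, "DDQN": 95.4, "FDQN": 96.1},
    "gangue_content": {"DQN": 4.7, "DDQN": 5.3, "FDQN": 4.6},
}


def relative_change(new, old):
    """Relative change of `new` with respect to `old`, in percent."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    for metric, values in AVERAGES.items():
        for baseline in ("DQN", "DDQN"):
            change = relative_change(values["FDQN"], values[baseline])
            print(f"{metric}: FDQN vs {baseline}: {change:+.1f}%")
```

Under this assumption the gangue-content reductions come out at about 2.1% and 13.2% and the recovery-rate gain over DDQN at about 0.7%, in line with the abstract.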