Intelligent decision-making method for coal caving based on fuzzy deep Q-network

杨艺; 王圣文; 崔科飞; 费树岷

doi:10.13272/j.issn.1671-251x.2022090068

Abstract: In top-coal caving at a fully mechanized caving face, coal dust and dust-suppression water mist obscure the operators' line of sight, so manually controlled caving suffers from over-caving and under-caving. To address this problem, the tail beam of the hydraulic support is treated as an agent, the coal caving process is abstracted as a Markov optimal decision problem, and a deep Q-network (DQN) is used to decide the actions of the caving opening. However, the DQN algorithm suffers from overestimation. A fuzzy deep Q-network (FDQN) algorithm is therefore proposed and applied to intelligent decision-making for coal caving. A fuzzy control system is built from the fuzzy features of the coal seam state during caving, with the coal quantity and the coal-gangue ratio in the coal seam state as its inputs. The action output by the fuzzy control system replaces the action that the DQN algorithm selects by applying the max operation to the Q-values output by the target network, which speeds up the agent's online learning and increases the reward obtained by the caving actions. A coal caving model of a fully mechanized caving face is built, and three-dimensional numerical simulations of the caving process are run with the DQN, double deep Q-network (DDQN), and FDQN algorithms. The results show the following. The FDQN algorithm converges fastest, 31.6% faster than the DQN algorithm, which speeds up the agent's online learning. Judged by the straightness of the coal-gangue boundary, the residual coal above the tail beam, and the amount of gangue in the drawn body, the FDQN-based caving performs best. The FDQN algorithm gives the highest recovery rate and the lowest gangue content: compared with the DQN and DDQN algorithms, the recovery rate is 2.8% and 0.7% higher, respectively, and the gangue content is 2.1% and 13.2% lower, respectively. The FDQN-based intelligent decision-making method adjusts the action of the hydraulic support tail beam according to the occurrence state of the coal seam and effectively addresses over-caving and under-caving during coal caving.
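The central idea of the abstract, replacing the max-selected action in the target with the action chosen by a fuzzy controller, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the two-action space (0 = close, 1 = open), the normalized inputs `coal_quantity` and `coal_fraction` (a stand-in for the coal-gangue ratio), the linear memberships, and the two-rule base in the hypothetical helper `fuzzy_action`.

```python
import numpy as np


def fuzzy_action(coal_quantity, coal_fraction):
    """Hypothetical two-rule fuzzy controller (not the paper's rule base).

    Both inputs are assumed to be normalized to [0, 1].
    Rule 1: IF quantity is high AND coal fraction is high THEN open (1).
    Rule 2: IF quantity is low OR coal fraction is low THEN close (0).
    Memberships: high(x) = x, low(x) = 1 - x; AND = min, OR = max.
    """
    q = float(np.clip(coal_quantity, 0.0, 1.0))
    f = float(np.clip(coal_fraction, 0.0, 1.0))
    open_strength = min(q, f)
    close_strength = max(1.0 - q, 1.0 - f)
    return 1 if open_strength > close_strength else 0


def dqn_target(reward, next_q, gamma, done):
    """Standard DQN target: bootstrap from the largest target-network Q-value."""
    return reward + gamma * (1.0 - float(done)) * float(np.max(next_q))


def fdqn_target(reward, next_q, next_state, gamma, done):
    """FDQN-style target as described in the abstract: the action is chosen
    by the fuzzy controller, and its target-network Q-value is used instead
    of the maximum."""
    action = fuzzy_action(*next_state)
    return reward + gamma * (1.0 - float(done)) * float(next_q[action])


# Example: target-network Q-values for (close, open) in the next state.
next_q = np.array([0.5, 2.0])
print(dqn_target(1.0, next_q, 0.9, False))
print(fdqn_target(1.0, next_q, (0.2, 0.3), 0.9, False))
```

Because the fuzzy-selected Q-value can never exceed the maximum, the target in this sketch is always less than or equal to the DQN target, which is one plausible reading of how the approach counters overestimation.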
