Abstract:
Euclidean distance is commonly used as the heuristic in the planning step of Dyna_Q-learning for reinforcement learning tasks with a goal position. However, it is unsuitable for tasks whose state space is not continuous in Euclidean space, such as path planning for a disaster rescue robot in an underground coal mine. To address this problem, this paper introduces the Laplacian Eigenmap, a manifold learning method with comparatively low computational complexity, and proposes an improved Dyna_Q-learning algorithm based on a manifold distance metric. The proposed algorithm is simulated in a grid world that resembles an underground environment, and the simulation results verify its validity.
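
As an illustration of why a manifold distance can be preferable to Euclidean distance in such environments, the following is a minimal sketch and not the implementation used in the paper. It assumes a small hypothetical grid world (`GRID`) with a 4-neighbour adjacency graph and unit edge weights, and it computes the Laplacian Eigenmap through the symmetric normalized Laplacian using NumPy only. The function names `laplacian_eigenmap` and `manifold_distance` are chosen here for illustration.

```python
import numpy as np

# Hypothetical 3x3 grid world: '#' is a wall, every other character is a free cell.
# The wall forces a U-shaped corridor between S at (0, 0) and G at (2, 0).
GRID = ["S..",
        "##.",
        "G.."]

def laplacian_eigenmap(grid, dim=2):
    """Embed the free cells of a grid world with Laplacian Eigenmaps.

    Assumes the free cells form one connected component (no isolated cells).
    Returns a dict mapping (row, col) to a `dim`-dimensional coordinate vector.
    """
    cells = [(r, c) for r, row in enumerate(grid)
             for c, ch in enumerate(row) if ch != "#"]
    index = {cell: i for i, cell in enumerate(cells)}
    n = len(cells)

    # Adjacency matrix of the 4-neighbour graph with unit weights.
    W = np.zeros((n, n))
    for (r, c), i in index.items():
        for dr, dc in ((1, 0), (0, 1)):
            j = index.get((r + dr, c + dc))
            if j is not None:
                W[i, j] = W[j, i] = 1.0

    # Solve L y = lambda D y via the symmetric normalized Laplacian
    # L_sym = I - D^{-1/2} W D^{-1/2}, then map back with y = D^{-1/2} u.
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L_sym = np.eye(n) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order

    # Skip the first (trivial, constant) eigenvector and keep the next `dim`.
    Y = d_inv_sqrt[:, None] * eigvecs[:, 1:dim + 1]
    return {cell: Y[i] for cell, i in index.items()}

def manifold_distance(embedding, a, b):
    """Distance between two cells measured in the embedded coordinates."""
    return float(np.linalg.norm(embedding[a] - embedding[b]))

if __name__ == "__main__":
    emb = laplacian_eigenmap(GRID)
    start, goal, corner = (0, 0), (2, 0), (0, 2)
    print("manifold distance start-goal:  ", manifold_distance(emb, start, goal))
    print("manifold distance start-corner:", manifold_distance(emb, start, corner))
```

In this example the goal `(2, 0)` and the corner `(0, 2)` are equally far from the start in Euclidean terms, yet the goal lies at the far end of the corridor. The embedded distance places the goal farther away than the corner, which is the property a heuristic for the planning step would need in a walled environment.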