ZHANG Yongchao, YU Zhiwei, DING Lili. Research on intelligent control algorithm of coal gangue sorting robot arm based on reinforcement learning[J]. Industry and Mine Automation, 2021, 47(1): 36-42. doi: 10.13272/j.issn.1671-251x.2020080047

Research on intelligent control algorithm of coal gangue sorting robot arm based on reinforcement learning

doi: 10.13272/j.issn.1671-251x.2020080047
  • Publish Date: 2021-01-20
  • Traditional control algorithms for gangue sorting robot arms, such as the grasping function method and the dynamic target grasping algorithm based on the Ferrari method, rely on an accurate environment model and lack adaptivity during control. Traditional intelligent control algorithms such as deep deterministic policy gradient (DDPG), for their part, produce excessively large output actions and use sparse rewards that are easily drowned out. To solve these problems, this study modifies the neural network structure and the reward function of the traditional DDPG algorithm and proposes an improved DDPG algorithm, based on reinforcement learning, that is suited to six-degree-of-freedom gangue sorting robot arms. Once gangue enters the working space of the robot arm, the improved DDPG algorithm makes a decision from the gangue position and the robot arm state returned by the corresponding sensors, and outputs a set of joint angle state control quantities to the corresponding motion controller. The algorithm controls the movement of the robot arm according to the gangue position and the joint angle state control quantities, so that the arm moves close to the gangue and sorts it. Simulation results show that, compared with the traditional DDPG algorithm, the improved DDPG algorithm has the advantages of model-free versatility and adaptive learning of the grasping pose through interaction with the environment. It also converges first to the maximum reward value encountered during exploration. A robot arm controlled by the improved DDPG algorithm shows better policy generalization, smaller joint angle state control outputs, and higher gangue sorting efficiency.
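
The abstract names two weaknesses of standard DDPG (excessive output actions and sparse rewards) but does not give the modified network or reward function. The sketch below is only an illustration of how such issues are commonly handled, not the authors' method: a dense distance term and an action-magnitude penalty are added to a sparse success bonus, and the actor's raw output is squashed with tanh so that each joint increment stays bounded. All function names, weights, and thresholds (`shaped_reward`, `bounded_action`, `reach_tol`, `max_delta`, and so on) are hypothetical.

```python
import numpy as np

N_JOINTS = 6  # six-degree-of-freedom arm, as stated in the abstract


def shaped_reward(gripper_pos, gangue_pos, action,
                  reach_tol=0.02, reach_bonus=10.0, action_cost=0.1):
    """Hypothetical shaped reward; every weight here is an assumed value.

    Combines a dense distance term, a penalty on large joint commands,
    and a sparse bonus when the gripper is within reach_tol of the gangue.
    """
    gripper_pos = np.asarray(gripper_pos, dtype=float)
    gangue_pos = np.asarray(gangue_pos, dtype=float)
    action = np.asarray(action, dtype=float)

    distance = float(np.linalg.norm(gripper_pos - gangue_pos))
    reward = -distance                                      # dense guidance toward the gangue
    reward -= action_cost * float(np.sum(action ** 2))      # discourage large outputs
    if distance < reach_tol:
        reward += reach_bonus                               # sparse success bonus
    return reward


def bounded_action(raw_output, max_delta=0.05):
    """Squash raw actor output so each joint increment lies in [-max_delta, max_delta] rad."""
    return max_delta * np.tanh(np.asarray(raw_output, dtype=float))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    raw = rng.normal(scale=5.0, size=N_JOINTS)   # stand-in for an actor network output
    action = bounded_action(raw)
    r = shaped_reward([0.30, 0.10, 0.20], [0.35, 0.10, 0.15], action)
    print("action:", action)
    print("reward:", r)
```

With a reward of this shape, the agent receives a learning signal on every step rather than only on a successful grasp, and the quadratic action term gives it a reason to prefer small joint commands; whether the paper uses either device is not stated in the abstract.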
