Multi-personnel underground trajectory prediction method based on Social Transformer
-
Abstract: Among current trajectory prediction methods for underground personnel in coal mines, the Transformer not only requires less computation than the recurrent neural network (RNN) and the long short-term memory (LSTM) network, but also effectively mitigates the long-term dependency problem caused by vanishing gradients. However, when multiple people move in the scene at the same time, the Transformer's predictions of their future trajectories show large deviations. Moreover, no existing model for underground multi-personnel trajectory prediction both adopts a Transformer and accounts for the mutual influence between individuals. To address these problems, a multi-personnel underground trajectory prediction method based on Social Transformer is proposed. First, each underground person is modeled independently and their historical trajectory is obtained; a Transformer encoder extracts features, which a fully connected layer then represents. Next, the individual networks are connected through a graph-convolution-based interaction layer that allows spatially close networks to share information with one another: it computes the attention that a predicted person allocates to surrounding neighbors when influenced by them, extracts the neighbors' motion patterns, and updates the feature matrix accordingly. Finally, the updated feature matrix is decoded by a Transformer decoder, which outputs the predicted positions for future time steps. Experimental results show that the average displacement error of Social Transformer is 45.8% lower than that of Transformer, and 67.1%, 35.9%, 30.1%, and 10.9% lower than those of the mainstream trajectory prediction methods LSTM, S-GAN, Trajectron++, and Social-STGCNN, respectively. The method effectively overcomes the inaccurate trajectory prediction caused by mutual influence among personnel in multi-person underground coal mine scenes and improves prediction accuracy.
-
Keywords:
- electronic fence /
- underground multi-personnel trajectory prediction /
- Transformer /
- interaction encoding /
- Social Transformer
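The pipeline described in the abstract, namely a per-person Transformer encoder, a fully connected feature layer, a graph-based interaction layer that weights spatially close neighbors by attention, and a Transformer decoder, can be outlined in code. The following is a minimal PyTorch-style sketch rather than the authors' implementation: the class name `SocialTransformerSketch`, the layer sizes, the distance-threshold adjacency rule, and the use of `torch.nn.TransformerEncoder`/`TransformerDecoder` are assumptions made for illustration.

```python
import torch
import torch.nn as nn


class SocialTransformerSketch(nn.Module):
    """Illustrative sketch of the Social Transformer pipeline (not the paper's code)."""

    def __init__(self, d_model=64, nhead=4, num_layers=2, horizon=12):
        super().__init__()
        self.horizon = horizon
        self.input_fc = nn.Linear(2, d_model)                        # embed (x, y) coordinates
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers)  # per-person motion encoder
        self.feature_fc = nn.Linear(d_model, d_model)                # fully connected feature representation
        self.att_q = nn.Linear(d_model, d_model)                     # interaction layer: query projection
        self.att_k = nn.Linear(d_model, d_model)                     # interaction layer: key projection
        dec_layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers)
        self.output_fc = nn.Linear(d_model, 2)                       # back to (x, y) coordinates

    def forward(self, history, neighbour_radius=2.0):
        # history: (num_people, obs_len, 2) observed positions of everyone in the scene
        h = self.encoder(self.input_fc(history))                     # (n, obs_len, d_model)
        feat = self.feature_fc(h[:, -1])                             # (n, d_model) one feature per person

        # adjacency from spatial proximity at the last observed frame (assumed rule)
        last_pos = history[:, -1]                                    # (n, 2)
        adj = (torch.cdist(last_pos, last_pos) < neighbour_radius).float()
        adj.fill_diagonal_(0.0)                                      # neighbours only, not the person itself

        # attention each person allocates to its neighbours, used to mix their motion features
        scores = self.att_q(feat) @ self.att_k(feat).T / feat.shape[-1] ** 0.5
        scores = scores.masked_fill(adj == 0, float("-inf"))
        att = torch.nan_to_num(torch.softmax(scores, dim=-1))        # isolated people get zero social context
        updated = feat + att @ feat                                  # updated feature matrix

        # decode the socially updated features against each person's own encoded history
        tgt = updated.unsqueeze(1).repeat(1, self.horizon, 1)        # (n, horizon, d_model) decoder queries
        dec = self.decoder(tgt, h)
        return self.output_fc(dec)                                   # (n, horizon, 2) predicted positions


if __name__ == "__main__":
    model = SocialTransformerSketch()
    observed = torch.randn(5, 8, 2)         # 5 people, 8 observed frames of (x, y)
    print(model(observed).shape)            # torch.Size([5, 12, 2])
```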
Table 1. Multi-personnel trajectory prediction results (ADE)

| Method | BIWI Hotel | Crowds UCY | MOT PETS | SDD | Self-built dataset | Average |
| --- | --- | --- | --- | --- | --- | --- |
| LSTM | 0.798 | 0.743 | 0.899 | 0.862 | 0.803 | 0.821 |
| Transformer | 0.470 | 0.422 | 0.534 | 0.542 | 0.523 | 0.498 |
| S-GAN | 0.561 | 0.492 | 0.681 | 0.588 | 0.562 | 0.577 |
| Trajectron++ | 0.415 | 0.331 | 0.366 | 0.422 | 0.397 | 0.386 |
| Social-STGCNN | 0.280 | 0.223 | 0.297 | 0.361 | 0.355 | 0.303 |
| Social Transformer | 0.240 | 0.194 | 0.265 | 0.355 | 0.295 | 0.270 |
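All three tables report the average displacement error (ADE), for which lower values are better. The paper's exact formula is not reproduced on this page; under the definition that is standard in pedestrian trajectory prediction, ADE is the Euclidean distance between predicted and ground-truth positions, averaged over all predicted frames and all people:

$$
\mathrm{ADE} = \frac{1}{N\,T_{\mathrm{pred}}} \sum_{i=1}^{N} \sum_{t=1}^{T_{\mathrm{pred}}} \bigl\lVert \hat{\mathbf{p}}_i^{\,t} - \mathbf{p}_i^{\,t} \bigr\rVert_2
$$

where $N$ is the number of people, $T_{\mathrm{pred}}$ the number of predicted frames, and $\hat{\mathbf{p}}_i^{\,t}$, $\mathbf{p}_i^{\,t}$ the predicted and ground-truth positions of person $i$ at frame $t$.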
Table 2. Multi-personnel trajectory prediction results (ADE) under different prediction sequence lengths

| Method | 12 predicted frames | 20 predicted frames | 28 predicted frames |
| --- | --- | --- | --- |
| LSTM | 0.821 | 1.478 | 2.238 |
| Transformer | 0.487 | 0.682 | 0.940 |
| Social Transformer | 0.274 | 0.337 | 0.455 |
Table 3. Multi-personnel trajectory prediction results (ADE) under different historical data conditions

| Method | No missing frames | 3 frames missing | 6 frames missing |
| --- | --- | --- | --- |
| LSTM | 0.821 | 1.112 | 1.535 |
| Transformer | 0.498 | 0.573 | 0.662 |
| Social Transformer | 0.266 | 0.302 | 0.343 |
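Assuming the percentage reductions quoted in the abstract are computed from the Average column of Table 1 as (baseline - ours) / baseline, a quick check for two of the baselines reproduces the quoted figures:

```python
# Relative ADE reduction of Social Transformer, using the "Average" column of Table 1.
# Treating the abstract's percentages as (baseline - ours) / baseline is an assumption.
average_ade = {"LSTM": 0.821, "Transformer": 0.498, "Social Transformer": 0.270}

ours = average_ade["Social Transformer"]
for baseline in ("LSTM", "Transformer"):
    reduction = (average_ade[baseline] - ours) / average_ade[baseline]
    print(f"vs {baseline}: {reduction:.1%}")   # vs LSTM: 67.1%, vs Transformer: 45.8%
```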
[1] 刘海忠. 电子围栏中心监控平台的设计与开发[D]. 武汉:华中师范大学,2012.LIU Haizhong. Design and development of center monitoring platform for electronic fence[D]. Wuhan:Central China Normal University,2012. [2] JEONG N Y,LIM S H,LIM E,et al. Pragmatic clinical trials for real-world evidence:concept and implementation[J]. Cardiovascular Pevention and Pharmacotherapy,2020,2(3):85-98. doi: 10.36011/cpp.2020.2.e12 [3] KLENSKE E D,ZEILINGER M N,SCHOLKOPF B,et al. Gaussian process-based predictive control for periodic error correction[J]. IEEE Transactions on Control Systems Technology,2016,24(1):110-121. doi: 10.1109/TCST.2015.2420629 [4] HUNT K J,SBARBARO D,ŻBIKOWSKI R,et al. Neural networks for control systems-a survey[J]. Automatica,1992,28(6):1083-1112. doi: 10.1016/0005-1098(92)90053-I [5] PRESTON D B. Spectral analysis and time series[J]. Technometrics,1983,25(2):213-214. doi: 10.1080/00401706.1983.10487866 [6] AKAIKE H. Fitting autoregreesive models for prediction[M]//PARZEN E,TANABE K,KITAGAWA G. Selected papers of Hirotugu Akaike. New York:Springer-Verlag New York Inc,1998:131-135. [7] ZHANG Jianjing,LIU Hongyi,CHANG Qing,et al. Recurrent neural network for motion trajectory prediction in human-robot collaborative assembly[J]. CIRP Annals,2020,69(1):9-12. doi: 10.1016/j.cirp.2020.04.077 [8] SHERSTINSKY A. Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network[J]. Physica D:Nonlinear Phenomena,2020. DOI: 10.1016/j.physd.2019.132306. [9] SONG Xiao,CHEN Kai,LI Xu,et al. Pedestrian trajectory prediction based on deep convolutional LSTM network[J]. IEEE Transactions on Intelligent Transportation Systems,2020,22(6):3285-3302. [10] SALZMANN T,IVANOVIC B,CHAKRAVARTY P,et al. Trajectron++:dynamically-feasible trajectory forecasting with heterogeneous data[C]. 16th European Conference on Computer Vision,Glasgow,2020:683-700. [11] MOHAMED A,QIAN Kun,ELHOSEINY M,et al. Social-STGCNN:a social spatio-temporal graph convolutional neural network for human trajectory prediction[C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition,Seattle,2020:14424-14432. [12] SHANKAR V,YOUSEFI E,MANASHTY A,et al. Clinical-GAN:trajectory forecasting of clinical events using transformer and generative adversarial networks[J]. Artificial Intelligence in Medicine,2023,138. DOI: 10.1016/j.artmed.2023.102507. [13] HAN Kai,WANG Yunhe,CHEN Hanting,et al. A survey on vision transformer[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence,2023,45(1):87-110. doi: 10.1109/TPAMI.2022.3152247 [14] GRAHAM B,EL-NOUBY A,TOUVRON H,et al. LeViT:a vision transformer in ConvNet’s clothing for faster inference[C]. IEEE/CVF International Conference on Computer Vision,Montreal,2021:12259-12269. [15] ARNAB A,DEHGHANI M,HEIGOLD G,et al. ViViT:a video vision transformer[C]. IEEE/CVF International Conference on Computer Vision,Montreal,2021:6836-6846. [16] VASWANI A,SHAZEER N,PARMAR N,et al. Attention is all you need[C]. 31st Conference on Neural Information Processing Systems,Long Beach,2017:5998-6008. [17] 刘赟. ReLU激活函数下卷积神经网络的不同类型噪声增益研究[D]. 南京:南京邮电大学,2023.LIU Yun. Research on different types of noise gain in convolutional neural networks under ReLU activation function[D]. Nanjing:Nanjing University of Posts and Telecommunications,2023. [18] 靳晶晶,王佩. 基于卷积神经网络的图像识别算法研究[J]. 通信与信息技术,2022(2):76-81.JIN Jingjing,WANG Pei. Research on image recognition algorithm based on convolutional neural network[J]. Communications and Information Technology,2022(2):76-81. [19] ALAHI A,GOEL K,RAMANATHAN V,et al. 
Social LSTM:human trajectory prediction in crowded spaces[C]. IEEE Conference on Computer Vision and Pattern Recognition,Las Vegas,2016:961-971. [20] BERGSTRA J,BREULEUX O,BASTIEN F,et al. Theano:a CPU and GPU math compiler in Python[C]. The 9th Python in Science Conference,2010. DOI: 10.25080/majora-92bf1922-003. [21] PESARANGHADER A,WANG Yiping,HAVAEI M. CT-SGAN:computed tomography synthesis GAN[C]// ENGELHARDT S,OKSUZ I,ZHU Dajiang,et al. Deep generative models,and data augmentation,labelling,and imperfections. Berlin:Springer-Verlag,2021:67-79.