Design of a coal preparation data center platform based on the Hadoop ecosystem

赵鑫; 王然风; 付翔

doi:10.13272/j.issn.1671-251x.2021040004


Abstract

Abstract: Existing coal preparation plant information management systems use nonstandard interfaces, which leads to repeated data collection. The systems are also independent of one another and handle multi-source heterogeneous data poorly. To address these problems, a design scheme for a coal preparation data center platform based on Hadoop ecosystem big data technology is proposed. System integration is achieved by defining data standards through a master data management system and an enterprise service bus. Normalization, correlation coefficient matrix, and noise outlier detection programs are designed for data processing. D-S (Dempster-Shafer) evidence theory, Hadoop, and the Hive data warehouse are combined to design a multi-source heterogeneous data fusion subsystem that performs data fusion. The Highcharts data visualization component is used for interactive visualization of data. Practical application results show that the platform standardizes master data definitions and system integration interfaces, improves coal preparation data processing capability, and achieves fusion and sharing of multi-source heterogeneous coal preparation data together with real-time interactive visualization.
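The abstract names D-S evidence theory as the basis of the fusion subsystem but does not describe how it is applied. As a rough illustration only, the following is a minimal sketch of the standard Dempster rule of combination for two evidence sources. The function name `combine`, the hypotheses ("normal", "fault"), and all mass values are hypothetical and are not taken from the paper.

```python
from itertools import product


def combine(m1, m2):
    """Combine two mass functions with Dempster's rule of combination.

    Each mass function maps a frozenset of hypotheses to a mass,
    and the masses of one function are assumed to sum to 1.
    """
    fused = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            # Agreeing evidence supports the intersection of the two sets.
            fused[common] = fused.get(common, 0.0) + wa * wb
        else:
            # Disjoint sets contradict each other; track the conflicting mass.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    scale = 1.0 - conflict
    return {h: w / scale for h, w in fused.items()}


# Hypothetical example: two sources judging the state of one piece of equipment.
NORMAL = frozenset({"normal"})
FAULT = frozenset({"fault"})
EITHER = NORMAL | FAULT  # mass left on "do not know"

source_a = {NORMAL: 0.6, FAULT: 0.1, EITHER: 0.3}
source_b = {NORMAL: 0.7, FAULT: 0.2, EITHER: 0.1}

fused = combine(source_a, source_b)
for hypothesis, mass in fused.items():
    print(sorted(hypothesis), round(mass, 4))
```

In this made-up case both sources lean toward "normal", so the fused mass on "normal" ends up higher than either source assigned on its own, while the conflicting mass is discarded and the remainder renormalized.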
