Implementation of Power Big Data Storage Based on a Distributed File System

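Abstract: An unstructured data management platform for power grids is designed on the Hadoop storage architecture. It comprises two main modules, one for storage and analysis and one for search and retrieval, and integrates storage components such as HDFS and HBase. HDFS provides fast reads and writes of massive data, while SolrCloud, an open-source distributed search engine built on ZooKeeper and Solr, handles data retrieval, offering an efficient and convenient means of intelligent management.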

Keywords: power grid management; Hadoop storage; distributed; data retrieval

CLC number: TP311.13

Document code: B    Article ID: 1001-5922(2022)06-0172-04

Realization of power big data storage based on distributed file system

CHEN Xingbin, WANG Zhou, ZHENG Piaopiao, LIN Dewei, LIU Qing

(1. State Grid Fujian Electric Power Co., Ltd., Fuzhou 350000, China; 2. Information and Telecommunication Branch, State Grid Fujian Electric Power Co., Ltd., Fuzhou 350000, China; 3. State Grid Xintong Yili Technology Co., Ltd., Fuzhou 350000, China)

Abstract: This paper designs an unstructured data management platform for power grids based on the Hadoop storage architecture. The platform consists of two main modules, storage and analysis, and search and retrieval, and integrates storage components such as HDFS and HBase. It uses HDFS for fast reading and writing of massive data and adopts SolrCloud, an open-source distributed search engine built on ZooKeeper and Solr, for data retrieval, providing an efficient and convenient means of intelligent management.

Key words: power grid management; Hadoop storage; distributed; data retrieval

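Unstructured data stored in the power industry spans many formats, including images, video, reports, and web pages. More than 70% of it originates from collaboration between people, so it can be regarded as data generated around human activity.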
