Stream-batch unified optimization method for data warehouses based on dynamic time windows

CLC number: TP391  Document code: A  Article ID: 1001-3695(2025)08-028-2460-07

doi:10.19734/j.issn.1001-3695.2024.11.0490

Dynamic time window-based optimization method for unified stream and batch processing in data warehouses

Chen Binlin, Tang Xiaoyong† (School of Computer & Communication Engineering, Changsha University of Science & Technology, Changsha 410114, China)

Abstract: Data warehouse is the core of enterprise data management, and batch processing and stream processing are the two core data processing paradigms for big data analytics. To address the high latency and resource consumption of traditional batch processing technologies, as well as the data quality challenges faced by stream processing technologies when handling multi-stream data association and historical data computation, this study proposed a unified stream and batch processing method. The proposed method analyzed the changes of the dataset in different time windows, and integrated the dynamic time window division based on scheduling time with the simplest dataset search based on the DFS (depth-first search) algorithm. Experimental results demonstrate that, compared with the mainstream micro-batch processing method, this method reduces overall computation time by 57.2% and memory consumption by 24.2%, while ensuring strong data consistency. This method holds important reference value for enterprises in building data warehouses with high processing efficiency and low resource consumption, integrating stream and batch processing.

Key words: data warehouse; data stream processing; dynamic time windows; minimal dataset; stream-batch integration

0 Introduction

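The data warehouse is the core of enterprise data management. It helps enterprises extract valuable information from large volumes of complex data, improving their competitiveness and market responsiveness [1].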
