Dynamic time window-based optimization method for unified stream and batch processing in data warehouses

CLC number: TP391  Document code: A  Article ID: 1001-3695(2025)08-028-2460-07
doi: 10.19734/j.issn.1001-3695.2024.11.0490
Dynamic time window-based optimization method for unified stream and batch processing in data warehouses
Chen Binlin, Tang Xiaoyong† (School of Computer & Communication Engineering, Changsha University of Science & Technology, Changsha 410114, China)
Abstract: The data warehouse is the core of enterprise data management, and batch processing and stream processing are the two core data processing paradigms for big data analytics. To address the high latency and resource consumption of traditional batch processing technologies, as well as the data quality challenges that stream processing technologies face when handling multi-stream data association and historical data computation, this study proposed a unified stream and batch processing method. The proposed method analyzed the changes of the dataset in different time windows, and integrated dynamic time window division based on scheduling time with the simplest dataset search based on the DFS (depth-first search) algorithm. Experimental results demonstrate that, compared with the mainstream micro-batch processing method, this method reduces overall computation time by 57.2% and memory consumption by 24.2%, while ensuring strong data consistency. This method holds important reference value for enterprises building data warehouses that integrate stream and batch processing with high processing efficiency and low resource consumption.
Key words: data warehouse; data stream processing; dynamic time windows; minimal dataset; stream-batch integration
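0 Introduction
The data warehouse is the core of enterprise data management. It helps enterprises extract valuable information from large volumes of complex data, improving their competitiveness and their speed of response to the market [1].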