Research on an anomaly double-layered window detection method for small-sample unbalanced datasets

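Keywords: small sample; unbalanced dataset; anomaly detection; double-layered window; oversampling; time series; clustering algorithm; equalization
CLC number: TN919-34; TP391  Document code: A  Article ID: 1004-373X(2026)01-0137-04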

Research on anomaly double-layered window detection method for small sample unbalanced dataset

FANG Yetong, ZHANG Lunchuan (School of Mathematics, Renmin University of China, Beijing 100872, China)

Abstract: When the positive and negative samples in a database differ by an order of magnitude, class overlap in the unbalanced data causes the decision boundaries of the data to overlap. A single window pays more attention to the similarity structure of the data than to the stratification of time scales, resulting in a small G-mean value for data at different time scales. Therefore, a double-layered window detection method for small sample unbalanced datasets is studied. An improved synthetic minority class sample oversampling technique is adopted to create new non-repetitive minority class samples, so as to equalize the small sample unbalanced dataset. Taking the time scale of the data into account, a double-layered window is used to divide the equalized time series into multiple sub-time series, the slope confidence interval distance radius feature is calculated, the abnormal sub-sequences are identified, and the abnormal data are identified from the abnormal sub-sequences in combination with the K-means clustering algorithm. The experimental results show that the proposed method can effectively equalize unbalanced datasets, that it accurately detects abnormal data in small sample datasets with different unbalance rates, and that the G-mean value is higher than 0.7. In short, it provides an effective solution for abnormal data detection.

Keywords: small sample; unbalanced dataset; anomaly detection; double-layered window; oversampling; time series; clustering algorithm; equilibrium

0 Introduction

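A small-sample unbalanced dataset is one in which the samples of one class far outnumber those of the other classes while the dataset as a whole contains relatively few samples [1-2]. This situation is common in many practical applications, such as fraud detection and image classification [3].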
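
Accuracy alone says little on a dataset this skewed, which is why the abstract reports G-mean. The sketch below assumes the standard definition, the geometric mean of sensitivity and specificity; the text available here does not spell out its formula, and the function name g_mean and the toy labels are hypothetical, chosen only for illustration.

```python
import math


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity (assumed standard definition)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # Guard against a class that is absent from y_true.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Hypothetical toy labels: 2 anomalies (1) among 20 samples, a 9:1 imbalance.
y_true = [1, 1] + [0] * 18

# Detector A flags nothing; detector B finds one anomaly and raises one false alarm.
pred_a = [0] * 20
pred_b = [1, 0] + [0] * 17 + [1]

for name, pred in (("A", pred_a), ("B", pred_b)):
    accuracy = sum(t == p for t, p in zip(y_true, pred)) / len(y_true)
    print(name, f"accuracy={accuracy:.2f}", f"g_mean={g_mean(y_true, pred):.2f}")
```

On these toy labels both detectors reach 90% accuracy, yet the detector that flags nothing has a G-mean of 0, while the one that finds one of the two anomalies at the cost of one false alarm reaches about 0.69.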
