Density peaks clustering algorithm based on weighted natural nearest neighbor and geodesic distance

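Keywords: cluster analysis; density peaks clustering; geodesic distance; weighted natural nearest neighbor
CLC number: TP301.6 Document code: A Article ID: 1001-3695(2026)01-011-0091-11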
doi:10.19734/j.issn.1001-3695.2025.06.0200
Density peaks clustering algorithm based on weighted natural nearest neighbor and geodesic distance
Wan Fang, Wei Lili†, Liu Guojun (School of Mathematics and Statistics, Ningxia University, Yinchuan 750021, China)
Abstract: Density peaks clustering (DPC) is an effective and simple clustering method that requires only one parameter to identify all cluster centers without iterative processing. However, it still exhibits limitations when handling complex data structures. Firstly, the clustering results are sensitive to the cutoff parameter. Secondly, the Euclidean distance-based metric tends to ignore the geometric structure of manifold datasets, which affects the accuracy of the clustering results. To address these issues, this paper proposes a density peaks clustering algorithm based on weighted natural nearest neighbor and geodesic distance (DPC-WNNN-GD). The algorithm analyzes the local and global information of the samples comprehensively, redefines the local density by combining the weighted natural nearest neighbors to balance density differences between samples, and eliminates the influence of the cutoff parameter. Additionally, it replaces Euclidean distance with geodesic distance to better adapt to the structural characteristics of manifold datasets. In comparisons with the DPC algorithm and related improved algorithms on synthetic and real datasets, experimental results demonstrate that DPC-WNNN-GD exhibits superior clustering performance.
Key words: cluster analysis; density peaks clustering (DPC); geodesic distance; weighted natural nearest neighbor
0 Introduction
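Manifold data [1], a typical kind of data with non-uniform density distribution, poses challenges for data mining because of the complex geometric features hidden in its internal structure.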
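To illustrate why Euclidean distance can misrepresent manifold data, the following minimal sketch approximates geodesic distance by shortest paths over a k-nearest-neighbor graph, a common Isomap-style construction. This is an assumption made for illustration only: the text available here does not state how DPC-WNNN-GD computes geodesic distance, and the names `knn_graph` and `geodesic_distances`, the choice `k=2`, and the semicircle data are all hypothetical.

```python
import heapq
import math


def knn_graph(points, k):
    """Build a symmetric k-nearest-neighbor graph with Euclidean edge weights."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        # Sort all other points by Euclidean distance to point i.
        neighbors = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in neighbors[:k]:
            graph[i][j] = d
            graph[j][i] = d  # keep the graph undirected
    return graph


def geodesic_distances(graph, source):
    """Dijkstra shortest-path lengths from source; unreachable nodes stay at inf."""
    dist = {node: math.inf for node in graph}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Hypothetical toy manifold: 21 points evenly spaced on a unit semicircle.
points = [(math.cos(math.pi * t / 20), math.sin(math.pi * t / 20)) for t in range(21)]
graph = knn_graph(points, k=2)

euclid = math.dist(points[0], points[-1])   # straight chord across the arc
geo = geodesic_distances(graph, 0)[20]      # path length along the arc

print(f"Euclidean: {euclid:.3f}, graph geodesic: {geo:.3f}")
```

In this toy case the two endpoints of the arc are separated by a chord of length 2, while the graph path follows the curve and has a length close to the arc length π. A density or distance measure built on the chord would treat the endpoints as closer than the manifold structure suggests.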