Semantic Aggregation of Book Metadata Based on Hierarchical Clustering

[CLC number] G250.7
[Document code] A [DOI] 10.19764/j.cnki.tsgjs.20240420
[Cite this article] ,. Semantic Aggregation of Book Metadata Based on Hierarchical Clustering [J]. 图书馆建设 (Library Development), 2025(1): 82-93.
[Abstract] Achieving deep fusion of heterogeneous book resources from multiple sources is crucial for expanding the breadth and comprehensiveness of library services and for promoting the development of intelligent library systems. Among these challenges, the semantic aggregation of diverse and differently named book metadata plays a pivotal role in facilitating the deep integration of book information from various sources. [...] A BERT model for Chinese is used as the word embedding model. Semantic similarity and semantic distance between metadata are analyzed from two perspectives: the metadata fields themselves and their attribute values. The study finds that hierarchical clustering can be performed on the resulting distance matrix and that a mapping correspondence between metadata can be constructed automatically, thereby achieving semantic aggregation of book metadata with similar names or attributes. Experimental results show a mapping precision of 93.33%, which significantly reduces the human effort required during metadata aggregation and fusion. Furthermore, the proposed semantic aggregation approach for book metadata is broadly applicable and can be reused iteratively in other information aggregation scenarios while maintaining compatibility and generality.
[Keywords] Book metadata; Hierarchical clustering; BERT model; Semantic similarity; Semantic distance
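The pipeline summarized in the abstract (embed each metadata field, compute pairwise semantic distances, cluster hierarchically, read the clusters as field mappings) can be illustrated with a minimal sketch. It is not the authors' implementation. The field names and the three-dimensional vectors are made up for illustration and stand in for real BERT embeddings; the use of cosine distance, average linkage, and a cut-off threshold of 0.3 are assumptions, since the abstract does not state the distance measure, the linkage criterion, or the cut-off; and `aggregate_fields` is a hypothetical helper name.

```python
import math


def cosine_distance(u, v):
    """Semantic distance taken as 1 - cosine similarity (an assumption)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def distance_matrix(vectors):
    """Symmetric matrix of pairwise distances with a zero diagonal."""
    n = len(vectors)
    return [
        [0.0 if i == j else cosine_distance(vectors[i], vectors[j]) for j in range(n)]
        for i in range(n)
    ]


def average_linkage_clusters(dist, threshold):
    """Agglomerative clustering: repeatedly merge the two closest clusters
    (average linkage) until the closest pair is farther apart than threshold."""
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        if d > threshold:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]


def aggregate_fields(names, vectors, threshold=0.3):
    """Hypothetical helper: group field names whose vectors cluster together."""
    dist = distance_matrix(vectors)
    return [[names[i] for i in c] for c in average_linkage_clusters(dist, threshold)]


# Made-up field names and toy vectors standing in for BERT embeddings.
FIELD_NAMES = ["title", "book_name", "author", "creator", "publisher"]
TOY_VECTORS = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [0.1, 0.9, 0.1],
    [0.2, 0.8, 0.0],
    [0.0, 0.1, 0.9],
]
```

Calling `aggregate_fields(FIELD_NAMES, TOY_VECTORS)` groups `title` with `book_name` and `author` with `creator`, and leaves `publisher` on its own. Each multi-member group can be read as a candidate mapping between differently named fields. In practice the toy vectors would be replaced by embeddings of the field names or attribute values, and the threshold would need tuning on labeled mappings.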
0 Introduction
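As the physical carrier and dissemination medium of knowledge, books are distributed widely across different search and discovery platforms. Although these platforms usually provide access links to one another, the format and naming of book metadata are not unified across platforms. As a result, association and integration at the content level remain shallow, and the resources end up "linked but not integrated".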