Optimization of an Automatic Multi-Category Text Classification System Based on the TF-IDF and GloVe Algorithms

Keywords TF-IDF algorithm; GloVe model; automatic text classification; keyword position; part of speech; semantic expansion
Classification number G251; G258.6
DOI 10.16810/j.cnki.1672-514X.2025.010.006
Abstract Efficiently organizing and automatically categorizing text by retrieving keywords and assigning one or more category labels is an important way to discover implicit relationships in documents, promote knowledge dissemination, and drive innovation. However, two prominent issues hinder efficient and precise automatic classification of documents: keywords lack semantic information, and keyword recognition is inaccurate because of factors such as the location from which keywords are acquired, their part of speech, and the comprehensiveness of the selected keywords. In light of this, this paper constructs an automatic text classification system that combines TF-IDF (Term Frequency-Inverse Document Frequency) and GloVe (Global Vectors for Word Representation). First, the system improves the TF-IDF algorithm by introducing part-of-speech influence factors and position weight coefficients, in order to compensate for the deficiencies of the traditional TF-IDF algorithm in keyword identification and semantic analysis. Second, it further expands the keyword set using the GloVe model, achieving an accuracy and a recall rate of 92.6% and 90.9%, respectively, for text classification. Finally, experimental comparisons validate the superiority and effectiveness of this system in handling multi-category text classification tasks.
Key Words TF-IDF algorithm. GloVe model. Automatic text classification. Keyword position. Part of speech. Semantic expansion.
0 Introduction
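Retrieving keywords to assign one or more category labels, and thereby organizing and automatically classifying text efficiently, is an important way to discover implicit relationships in documents and to promote knowledge dissemination and innovation.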