Optimization of an Automatic Multi-Category Text Classification System Based on the TF-IDF and GloVe Algorithms


Keywords TF-IDF algorithm; GloVe model; automatic text classification; keyword position; part of speech; semantic expansion

Classification number G251; G258.6

DOI 10.16810/j.cnki.1672-514X.2025.010.006

Abstract Efficiently organizing and automatically categorizing text through keyword retrieval and the assignment of one or more category labels is an important way to discover implicit relationships in documents, promote knowledge dissemination, and drive innovation. However, two prominent issues hinder efficient and precise automatic classification of documents: keywords lack semantic information, and keyword recognition is inaccurate because of factors such as the location from which keywords are taken, their part of speech, and the comprehensiveness of the selected keywords. In light of this, this paper constructs an automatic text classification system that combines TF-IDF (Term Frequency-Inverse Document Frequency) and GloVe (Global Vectors for Word Representation). First, the system improves the TF-IDF algorithm by considering part-of-speech influence factors and position weight coefficients, in order to compensate for the deficiencies of the traditional TF-IDF algorithm in keyword identification and semantic analysis. Second, it further expands the keyword set using the GloVe model, achieving an accuracy and recall rate of 92.6% and 90.9%, respectively, for text classification. Finally, experimental comparisons validate the superiority and effectiveness of this system in handling multi-category text classification tasks.

Key Words TF-IDF algorithm; GloVe model; automatic text classification; keyword position; part of speech; semantic expansion

0 Introduction

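Retrieving keywords and assigning one or more category labels, so that text is organized efficiently and classified automatically, is an important way to discover implicit relationships in documents and to promote knowledge dissemination and innovation.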
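
The abstract describes a TF-IDF variant in which each keyword occurrence is weighted by where it appears and by its part of speech. The following is a minimal sketch of that idea under stated assumptions, not the paper's implementation: the weight tables `POSITION_WEIGHT` and `POS_WEIGHT`, the function name `weighted_tf_idf`, the input format, and the smoothed IDF formula are all hypothetical choices made for illustration, since the excerpt does not give the actual coefficients or formulas. The GloVe-based keyword expansion step is not shown.

```python
import math
from collections import Counter

# Hypothetical weights for illustration only; the excerpt does not give
# the paper's actual position coefficients or part-of-speech factors.
POSITION_WEIGHT = {"title": 3.0, "abstract": 2.0, "body": 1.0}
POS_WEIGHT = {"noun": 1.0, "verb": 0.6, "other": 0.3}


def weighted_tf_idf(docs):
    """Score terms with a position- and POS-weighted TF-IDF.

    docs: list of documents, each a list of (term, position, pos) tuples.
    Returns one {term: score} dict per document.
    """
    n_docs = len(docs)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for doc in docs:
        df.update({term for term, _, _ in doc})

    all_scores = []
    for doc in docs:
        # Each occurrence contributes its position weight times its POS
        # weight instead of a plain count of 1.
        weighted_tf = Counter()
        for term, position, pos in doc:
            w_pos = POSITION_WEIGHT.get(position, 1.0)
            w_tag = POS_WEIGHT.get(pos, POS_WEIGHT["other"])
            weighted_tf[term] += w_pos * w_tag

        total = sum(weighted_tf.values()) or 1.0
        scores = {}
        for term, tf in weighted_tf.items():
            # Smoothed IDF, one common variant (an assumption here).
            idf = math.log((1 + n_docs) / (1 + df[term])) + 1.0
            scores[term] = (tf / total) * idf
        all_scores.append(scores)
    return all_scores


# Toy input: two tiny "documents" with hand-labelled positions and tags.
docs = [
    [("glove", "title", "noun"), ("model", "body", "noun"), ("uses", "body", "verb")],
    [("model", "body", "noun"), ("uses", "body", "verb")],
]
scores = weighted_tf_idf(docs)
top_keyword = max(scores[0], key=scores[0].get)
print(top_keyword, scores[0])
```

In this toy input, a noun in the title receives a larger weighted term frequency than a verb in the body, which is the kind of effect the position and part-of-speech factors are meant to produce. In practice the tuples would come from a tokenizer and part-of-speech tagger.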
