Chinese adversarial example generation method based on multi-disturbance strategy
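Wang Chundong 1,2, Zhu Wenying 1,2, Lin Hao 1,2 (1. School of Computer Science & Engineering, Tianjin University of Technology; 2. National Engineering Laboratory for Computer Virus Prevention & Control Technology, Tianjin 300384, China)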
Abstract: To address the vulnerability of deep neural networks to adversarial samples and the lack of high-quality adversarial samples in the Chinese context, this paper proposes a new Chinese adversarial sample generation method named CMDS. In the keyword selection stage, a Score function identifies positions where perturbations can be added effectively, ensuring that the adversarial samples are both readable and difficult to detect. During the adversarial sample generation phase, the method fully exploits characteristics unique to Chinese, considering aspects such as character shape, meaning, and region-specific homophones. Various perturbation strategies, including similar characters, synonyms, homophones, and word order disruption, are employed along with a multi-priority perturbation strategy to generate adversarial samples. Finally, a perturbation rate threshold controls the output, eliminating samples that differ too greatly from the original text. A series of experiments then compares CMDS with baseline methods, explores the impact of the perturbation threshold size, and includes human evaluations and real-world attack tests. These experiments confirm the effectiveness and transferability of CMDS in enhancing model security. Results show that CMDS surpasses baseline methods in attack success rate by up to 36.9 percentage points and improves model security by more than 30 percentage points. The generated adversarial samples are of high quality and demonstrate strong generalizability.
Key words: deep neural network; natural language processing (NLP); Chinese adversarial example; multi-disturbance
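The pipeline summarized in the abstract (score positions, perturb in priority order, filter by perturbation rate) can be illustrated with a minimal sketch. This is not the CMDS implementation: the deletion-based scoring, the two tiny substitution tables, the character-level definition of perturbation rate, the default threshold, and every function name below are assumptions made purely for illustration. Synonym replacement and word order disruption are omitted for brevity.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical toy substitution tables (assumptions, not from the paper).
# A real system would use full similar-shape and homophone resources.
SIMILAR_SHAPE: Dict[str, str] = {"未": "末"}
HOMOPHONE: Dict[str, str] = {"好": "号"}


def toy_confidence(text: str) -> float:
    """Hypothetical stand-in for a victim classifier's confidence score."""
    return 0.9 if "好" in text else 0.1


def perturbation_rate(original: str, candidate: str) -> float:
    """Fraction of character positions that differ (same-length texts only)."""
    if len(original) != len(candidate):
        raise ValueError("this sketch only handles same-length substitutions")
    if not original:
        return 0.0
    changed = sum(a != b for a, b in zip(original, candidate))
    return changed / len(original)


def rank_positions(text: str, confidence: Callable[[str], float]) -> List[int]:
    """Rank positions by the confidence drop caused by deleting each character."""
    base = confidence(text)
    drops = [(base - confidence(text[:i] + text[i + 1:]), i)
             for i in range(len(text))]
    # Largest drop first; ties are broken by position.
    return [i for _, i in sorted(drops, key=lambda d: (-d[0], d[1]))]


def generate(text: str,
             confidence: Callable[[str], float],
             threshold: float = 0.3,
             decision: float = 0.5) -> Optional[str]:
    """Return an adversarial candidate, or None if none fits the threshold."""
    candidate = list(text)
    for i in rank_positions(text, confidence):
        ch = text[i]
        # Assumed priority order: similar-shape character, then homophone.
        replacement = SIMILAR_SHAPE.get(ch) or HOMOPHONE.get(ch)
        if replacement is None:
            continue
        trial = candidate.copy()
        trial[i] = replacement
        # Stop before the text drifts too far from the original.
        if perturbation_rate(text, "".join(trial)) > threshold:
            break
        candidate = trial
        if confidence("".join(candidate)) < decision:
            return "".join(candidate)
    return None
```

With the toy classifier, `generate("今天天气很好", toy_confidence)` replaces only the highest-scoring character and returns `"今天天气很号"`. Lowering `threshold` to 0.1 rejects that candidate, because one changed character out of six exceeds the allowed rate.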
0 Introduction
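In recent years, the rapid development of artificial intelligence technology has led to important research and applications in fields such as computer vision [1], natural language processing [2], data mining [3], and machine translation [4]. However, deep neural networks have relatively poor interpretability [5], which makes their final outputs difficult to explain.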