A collaboration method for fine-tuned code generation models and large language models based on uncertainty estimation

CLC number: TP391   Document code: A   Article ID: 1001-3695(2025)10-007-2947-09
doi:10.19734/j.issn.1001-3695.2025.04.0082
Coordination of fine-tuned code generation models and large language models via uncertainty estimation
Hong Shaodong 1a,1b, Shen Guowei 1a,1b†, Luo Sufen 2, Liu Tao 2
(1. a. National Key Laboratory of Public Big Data, b. Engineering Research Center of Text Computing & Cognitive Intelligence, Ministry of Education, Guizhou University, Guiyang 550025, China; 2. Guizhou Provincial Judicial Police Academy, Guiyang, China)
Abstract: The complementary mechanism between fine-tuned code generation models and large language models (LLMs) remains underexplored, leading to ambiguous decision boundaries in their collaboration. This paper proposed a method named Coral to coordinate fine-tuned models and LLMs based on uncertainty estimation. This method analyzed the complementarity between the two models and quantified their decision boundaries. Coral adopted the concept of expected calibration error to compare uncertainty estimation methods and selected a stable method for the fine-tuned model. This enabled the fine-tuned model to output uncertainty scores reflecting prediction confidence. Coral calculated an uncertainty threshold by maximizing BLEU scores on a validation dataset, which quantified the decision boundary between the models. Based on the threshold and the uncertainty scores, the method identified in-distribution (ID) and out-of-distribution (OOD) data. The LLM handled OOD data to improve the generalization of the fine-tuned model. Evaluation on two benchmark datasets shows that Coral outperforms the use of either model alone in both BLEU and Exact Match metrics. The results indicate that Coral effectively coordinates the fine-tuned model and LLM.
Key words: large language model; fine-tuned model; code generation; uncertainty estimation
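To make the routing logic summarized in the abstract concrete, the following Python sketch illustrates threshold selection by maximizing BLEU on a validation set and uncertainty-based dispatch between the fine-tuned model and the LLM. It is a minimal illustration only: the callables predict_with_uncertainty, llm_generate, and bleu, as well as the corpus-level score being approximated by a sum of per-sample BLEU values, are hypothetical placeholders and not the paper's actual implementation.

    # Hedged sketch of Coral-style routing; all callables are hypothetical stand-ins.
    from typing import Callable, Iterable

    def select_threshold(
        val_pairs: Iterable[tuple[str, str]],                       # (prompt, reference code)
        predict_with_uncertainty: Callable[[str], tuple[str, float]],  # fine-tuned model
        llm_generate: Callable[[str], str],                          # general-purpose LLM
        bleu: Callable[[str, str], float],                           # bleu(reference, hypothesis)
        candidate_thresholds: Iterable[float],
    ) -> float:
        """Pick the uncertainty threshold that maximizes BLEU on the validation set."""
        # Cache both models' outputs so each candidate threshold is evaluated cheaply.
        cached = [
            (ref, *predict_with_uncertainty(src), llm_generate(src))
            for src, ref in val_pairs
        ]
        best_t, best_score = 0.0, float("-inf")
        for t in candidate_thresholds:
            # Route to the LLM whenever the fine-tuned model's uncertainty exceeds t.
            score = sum(
                bleu(ref, llm_out if unc > t else ft_out)
                for ref, ft_out, unc, llm_out in cached
            )
            if score > best_score:
                best_t, best_score = t, score
        return best_t

    def coral_generate(
        prompt: str,
        threshold: float,
        predict_with_uncertainty: Callable[[str], tuple[str, float]],
        llm_generate: Callable[[str], str],
    ) -> str:
        """Serve likely in-distribution prompts with the fine-tuned model, OOD ones with the LLM."""
        code, uncertainty = predict_with_uncertainty(prompt)
        return llm_generate(prompt) if uncertainty > threshold else code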
0 Introduction
Code generation is a long-standing task in the software engineering and artificial intelligence communities.