Automatic Code Summarization Based on Large Model Knowledge Distillation




Citation format: YOU G, LIU W J, et al. Code summarization based on large model knowledge distillation[J]. Command Control & Simulation, 2025, 47(4): 27-33.

CLC number: TP311; TP18    Document code: A    DOI: 10.3969/j.issn.1673-3819.2025.04.005

Code summarization based on large model knowledge distillation
YOU Gang, LIU Wenjie, LI Meipeng, SUN Liqun, WANG Lian, TIAN Tieku
(Unit 96941 of PLA, Beijing, China)

Abstract: Code summarization is a short natural language description of source code. Summaries are usually only one sentence long, but they are the primary way for developers to understand code. Recently, products based on large language models (such as ChatGPT) have demonstrated a strong ability to generate these descriptions. However, to use these tools, programmers must send their code to an untrusted third party for processing (for example, through API calls), which is unacceptable to many organizations. This paper presents an alternative: we use example outputs generated by GPT-3.5 to train an open-source model through a process related to knowledge distillation, enabling a small model (with 350 million parameters) to be comparable to GPT-3.5 on the code summarization task.

Keywords: code summarization; large model; knowledge distillation

Code summarization is a short natural language description of source code.
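The distillation recipe described in the abstract, collecting summaries from GPT-3.5 for a corpus of functions and fine-tuning a small open model on the resulting (code, summary) pairs, can be sketched as follows. This is a minimal illustration rather than the paper's implementation: the student checkpoint Salesforce/codet5-base, the file distill_pairs.jsonl, and the hyperparameters are assumptions, and the paper's 350-million-parameter model and training settings are not shown in this excerpt.

# Minimal sketch of sequence-level knowledge distillation for code
# summarization: fine-tune a small open seq2seq student on summaries that a
# larger teacher (e.g. GPT-3.5) produced for a corpus of functions.
# Assumed (not from the paper): the checkpoint "Salesforce/codet5-base" and a
# JSONL file "distill_pairs.jsonl" with records {"code": ..., "summary": ...}.
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSeq2SeqLM,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

checkpoint = "Salesforce/codet5-base"          # small open student model
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# Each record pairs a piece of source code with the teacher-generated summary.
dataset = load_dataset("json", data_files="distill_pairs.jsonl")["train"]

def preprocess(batch):
    # The source code is the input sequence ...
    model_inputs = tokenizer(batch["code"], max_length=512, truncation=True)
    # ... and the teacher's summary becomes the training target.
    labels = tokenizer(text_target=batch["summary"], max_length=64,
                       truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = dataset.map(preprocess, batched=True,
                        remove_columns=dataset.column_names)

args = Seq2SeqTrainingArguments(
    output_dir="student-code-summarizer",
    per_device_train_batch_size=8,
    num_train_epochs=3,
    learning_rate=5e-5,
    predict_with_generate=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()

Once fine-tuned, the student summarizes code locally, so source code never has to be sent to a third-party API, which is the privacy concern the abstract raises.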
