Research on a Method for Generating Standardizing Documents Based on Large Language Models

Abstract: To promote the standardized development of various industries, corresponding standardizing documents, such as national standards and industry standards, need to be formulated in various fields. These standardizing documents not only provide a unified operating standard for the industry, but also give relevant parties a clear basis for guidance. The Central Committee of the CPC and the State Council clearly pointed out in the "Outlines for the Development of National Standardization" that promoting the digitalization of standards is an important measure for modernizing industry. Research on the automatic generation of standardizing documents is therefore particularly important. With the rapid development of artificial intelligence technology, and especially the outstanding performance of large language models in text generation tasks, it has become possible to use these technologies to generate standardizing documents automatically. Against this background, this paper proposes a two-stage scheme for generating standardizing documents. The scheme first generates the outline of the standardizing document with a large model, and then expands the outline into the complete document content. By combining in-context learning and retrieval-augmented generation techniques, the method can not only generate high-quality text, but also significantly improve the accuracy and professionalism of the generated content. To verify the feasibility of the scheme, we conducted a series of experiments on our self-built dataset. The results show that the method can effectively generate documents that meet industry standards, and that it has good practicability and potential for wider adoption.

Keywords: large language models; retrieval-augmented generation; text generation; in-context learning

0  Introduction

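At present, standardizing documents are drafted mainly by hand. Because they must follow specific formats and draw on domain knowledge, drafting usually takes a great deal of time.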
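
The two-stage scheme summarized in the abstract (outline first, then expansion, with in-context examples and retrieved reference material in the prompts) could be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function names `retrieve`, `build_prompt`, and `generate_document`, the word-overlap retriever, and the prompt wording are all assumptions, and `llm` stands for any text-in, text-out callable supplied by the reader.

```python
from typing import Callable, List


def retrieve(query: str, corpus: List[str], k: int = 2) -> List[str]:
    """Toy retriever (assumption): rank passages by word overlap with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda passage: len(query_words & set(passage.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(task: str, examples: List[str], context: List[str]) -> str:
    """Assemble a prompt from in-context examples, retrieved context, and the task."""
    parts = []
    if examples:
        parts.append("Examples:\n" + "\n".join(examples))
    if context:
        parts.append("Reference material:\n" + "\n".join(context))
    parts.append("Task: " + task)
    return "\n\n".join(parts)


def generate_document(
    title: str,
    llm: Callable[[str], str],
    corpus: List[str],
    outline_examples: List[str],
) -> str:
    """Hypothetical two-stage pipeline: generate an outline, then expand each heading."""
    # Stage 1: ask the model for an outline, one heading per line.
    outline_prompt = build_prompt(
        f"Write an outline, one heading per line, for the standard: {title}",
        outline_examples,
        retrieve(title, corpus),
    )
    headings = [line.strip() for line in llm(outline_prompt).splitlines() if line.strip()]

    # Stage 2: expand every heading, retrieving reference material per heading.
    sections = []
    for heading in headings:
        section_prompt = build_prompt(
            f"Write the section '{heading}' of the standard: {title}",
            [],
            retrieve(heading, corpus),
        )
        sections.append(f"{heading}\n{llm(section_prompt)}")
    return "\n\n".join(sections)
```

In this sketch, retrieval is repeated for each heading so that every section prompt carries the reference material most relevant to it. A real system would presumably replace the word-overlap ranking with a proper retriever and pass an actual model client as `llm`.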
