A Comparative Study of the Exam Performance of Large Language Models and Students

Taking Tongyi Qianwen (Qwen) as an Example


CLC number: TP39; G434  Document code: A  Article ID: 2096-4706(2025)12-0050-09

Comparative Study of Large Language Models and Student Performance in Exams: Taking Qwen as an Example

LING Dalian, FENG Shiying, CHEN Sinan, PAN Weiquan (School of Mathematics and Statistics, Yulin Normal University, Yulin 537000, China)

Abstract: The research focuses on the application potential of Qwen, an AI chatbot driven by LLM, in educational assessment. Based on 2190 final examination questions of "Probability and Mathematical Statistics" in a university from 2019 to 2023, eight teachers double-blind score the Qwen Model, the optimized model and the students' answers. The results show that the performance of Qwen is stable in multiple choice questions, but there is much room for improvement in the answer questions. Especially after Prompt Engineering optimization, the performance of the answer questions is significantly improved. Teachers' scores on AI-generated content are more stringent, and the scores are significantly affected by the question type and the answer subject. This study provides empirical evidence for AI-assisted educational assessment, emphasizing the importance of updating standards and exploring new models.

Keywords: LLM; Qwen; educational assessment; AI-assisted learning

0 Introduction

With the rapid development of information technology, artificial intelligence (AI) chatbots are becoming increasingly common in education.
