A Comparative Study of Large Language Models and Student Performance in Exams

Taking Tongyi Qianwen (Qwen) as an Example

CLC Number: TP39; G434    Document Code: A    Article ID: 2096-4706(2025)12-0050-09

Comparative Study of Large Language Models and Student Performance in Exams: Taking Qwen as an Example

LING Dalian, FENG Shiying, CHEN Sinan, PAN Weiquan (School of Mathematics and Statistics, Yulin Normal University, Yulin 537000, China)

Abstract: This research focuses on the application potential of Qwen, an AI chatbot driven by an LLM, in educational assessment. Based on 2190 final examination questions from the course "Probability and Mathematical Statistics" at a university from 2019 to 2023, eight teachers scored the answers of the Qwen model, the optimized model, and the students under double-blind conditions. The results show that Qwen performs stably on multiple-choice questions, but there is much room for improvement on free-response questions. After Prompt Engineering optimization in particular, performance on free-response questions improves significantly. Teachers score AI-generated content more stringently, and the scores are significantly affected by the question type and by who produced the answer. This study provides empirical evidence for AI-assisted educational assessment, emphasizing the importance of updating standards and exploring new models.

Keywords: LLM; Qwen; educational assessment; AI-assisted learning

0 Introduction

With the rapid development of information technology, artificial intelligence (AI) chatbots are becoming increasingly common in education.

Modern Information Technology (现代信息科技), Issue 12, 2025