PMoE: A Parameter-Efficient Fine-Tuning Framework Introducing Mixture of Experts into P-tuning


CLC number: TP18   Document code: A   Article ID: 1001-3695(2025)07-005-1956-08
doi: 10.19734/j.issn.1001-3695.2024.11.0484

Abstract: Large language models (LLMs) have significantly improved performance on reasoning and generation tasks. However, existing open-source LLMs still lack sufficient domain-specific knowledge and require fine-tuning for specialized tasks. Traditional fine-tuning methods struggle to balance low cost and high efficiency in multi-task learning. To address this issue, this paper proposed a parameter-efficient fine-tuning framework named PMoE. Based on the P-tuning method, this framework introduced a mixture-of-experts mechanism to enhance multi-task processing while maintaining low-cost tuning. In each Transformer module layer, PMoE constructed trainable expert modules to replace the prompt modules in P-tuning and utilized a routing mechanism to dynamically allocate tasks based on input task features. Additionally, it designed the expert modules in the MoE to be detachable, enabling model reuse across different task scenarios and further reducing computational costs. Experimental results demonstrate that PMoE achieves a 6.24% performance improvement over P-tuning on a Chinese medical dataset and exhibits superior capabilities in multi-task processing and transfer learning, verifying its efficiency and broad applicability.

Key words: large language model; parameter-efficient fine-tuning; P-tuning; mixture of experts; multi-task learning
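The abstract describes the core mechanism of PMoE: in each Transformer layer, trainable expert modules replace P-tuning's single prompt module, and a router selects experts from the input's task features; experts are detachable for reuse across tasks. The following is a minimal sketch of what such a layer could look like, assuming a PyTorch implementation; all names (PromptExpert, PMoELayer, prompt_len, top_k) and the top-k soft routing are illustrative assumptions, not the paper's released code.

```python
# Illustrative sketch of a PMoE-style layer (assumed PyTorch design, not the authors' code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class PromptExpert(nn.Module):
    """One detachable expert: a trainable soft-prompt block for a single Transformer layer."""

    def __init__(self, prompt_len: int, hidden_dim: int):
        super().__init__()
        # Trainable prompt vectors standing in for P-tuning's prompt module.
        self.prompt = nn.Parameter(torch.randn(prompt_len, hidden_dim) * 0.02)

    def forward(self, batch_size: int) -> torch.Tensor:
        # Expand to (batch, prompt_len, hidden_dim) so the prompt can be prepended per example.
        return self.prompt.unsqueeze(0).expand(batch_size, -1, -1)


class PMoELayer(nn.Module):
    """Mixture of prompt experts for one Transformer layer, routed on the layer input."""

    def __init__(self, num_experts: int, prompt_len: int, hidden_dim: int, top_k: int = 2):
        super().__init__()
        self.experts = nn.ModuleList(
            [PromptExpert(prompt_len, hidden_dim) for _ in range(num_experts)]
        )
        # Router scores experts from a pooled summary of the input (its task features).
        self.router = nn.Linear(hidden_dim, num_experts)
        self.top_k = top_k

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden_dim) from the frozen backbone.
        batch_size = hidden_states.size(0)
        pooled = hidden_states.mean(dim=1)            # (batch, hidden_dim)
        logits = self.router(pooled)                  # (batch, num_experts)

        # Keep the top-k experts per example and renormalize their weights.
        top_vals, top_idx = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(top_vals, dim=-1)         # (batch, top_k)

        # Weighted combination of the selected experts' prompts.
        all_prompts = torch.stack(
            [expert(batch_size) for expert in self.experts], dim=1
        )                                             # (batch, num_experts, prompt_len, hidden)
        gathered = all_prompts.gather(
            1, top_idx[:, :, None, None].expand(-1, -1, *all_prompts.shape[2:])
        )                                             # (batch, top_k, prompt_len, hidden)
        mixed_prompt = (weights[:, :, None, None] * gathered).sum(dim=1)

        # Prepend the mixed prompt; the backbone's own weights stay frozen.
        return torch.cat([mixed_prompt, hidden_states], dim=1)
```

In this sketch, a frozen backbone would apply such a layer before each Transformer block; because each PromptExpert holds only its own prompt parameters, an expert can be detached and reattached in another task scenario, in the spirit of the detachable design the abstract describes.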

0 Introduction

With the continuous iteration of large language models (LLMs), their capabilities in reasoning and text generation have been significantly enhanced. (remaining 21,086 characters not shown)
