A Survey of Research on Chinese Text Spelling Correction

CLC number: TP391.1; TP301.1  Document code: A  Article ID: 2096-4706(2025)08-0138-08
Abstract: Chinese Spelling Correction (CSC) is a crucial foundational task in Natural Language Processing (NLP) and supports downstream tasks and research. Research on CSC continues to develop, and existing error correction methods fall mainly into three groups: those based on N-Gram language models, those based on Deep Learning, and those based on Large Language Models (LLMs). Firstly, the characteristics of the N-Gram language model and its application in CSC are analyzed, revealing its advantages in capturing contextual information. Secondly, methods based on Deep Learning improve the accuracy of error correction through deep neural networks and are widely used in Chinese text processing. At the same time, the rise of LLMs provides new ideas for spelling correction, demonstrating their enormous potential in dealing with complex language phenomena. This review provides a detailed overview of the current research status in the CSC field, offering a reference for scholars engaged in related research.
Keywords: Chinese text; spelling correction; N-Gram language model; Deep Learning; Large Language Model
0 Introduction
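Chinese Spelling Correction (CSC) is an important foundational research direction in Natural Language Processing (NLP). Its goal is to detect and correct spelling errors in text, providing clean and accurate input data for subsequent tasks such as text analysis, information retrieval, and text generation.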