Noise-Robust Speech Recognition for the Lhasa Dialect with Joint CTC/Attention Decoding

CLC number: TP319    Document code: A

Abstract: Lhasa Dialect is an important branch of the Tibetan language, and conducting research on speech recognition of Lhasa Dialect holds significant theoretical and practical value. Addressing the issues of scarce research on noisy Lhasa Dialect speech and high Word Error Rate (WER), this study integrates speech enhancement algorithms and speech recognition models into a noise-robust speech recognition system for Lhasa Dialect. Meanwhile, it improves the Conformer speech recognition model by introducing the CTC/Attention joint decoding mechanism, forming the Conformer-CTC/Attention speech recognition model to further reduce WER. Experimental results show that the constructed Conformer-CTC/Attention speech recognition model achieves a WER of 25.46% on the public Tibetan language speech dataset, which is generally lower than that of mainstream models. Furthermore, the WER of the constructed noise-robust speech recognition model for Lhasa Dialect on noisy speech is reduced by 8.9% compared with that of non-noise-robust speech recognition models.

Keywords: Lhasa Dialect; Speech Recognition; Conformer; Speech Enhancement
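
As a rough illustration of the joint decoding idea named in the abstract, the sketch below rescores candidate hypotheses with a weighted sum of CTC and attention log-probabilities, which is the commonly used form of hybrid CTC/Attention scoring. The function names `joint_score` and `rescore`, the weight `ctc_weight`, and the sample scores are hypothetical and chosen for illustration only; the abstract does not state the weighting or the decoding procedure actually used.

```python
from typing import List, Tuple

# A hypothesis is (text, ctc_log_prob, attention_log_prob).
Hypothesis = Tuple[str, float, float]


def joint_score(ctc_logp: float, att_logp: float, ctc_weight: float = 0.3) -> float:
    """Interpolate CTC and attention log-probabilities.

    ctc_weight is an assumed hyperparameter in [0, 1]; 0 means
    attention-only scoring and 1 means CTC-only scoring.
    """
    if not 0.0 <= ctc_weight <= 1.0:
        raise ValueError("ctc_weight must lie in [0, 1]")
    return ctc_weight * ctc_logp + (1.0 - ctc_weight) * att_logp


def rescore(hyps: List[Hypothesis], ctc_weight: float = 0.3) -> List[Hypothesis]:
    """Return hypotheses sorted from best to worst joint score."""
    return sorted(
        hyps,
        key=lambda h: joint_score(h[1], h[2], ctc_weight),
        reverse=True,
    )


# Made-up scores: the attention decoder prefers hyp_a, CTC prefers hyp_b.
hyps = [("hyp_a", -4.0, -2.0), ("hyp_b", -1.0, -3.0)]
best_joint = rescore(hyps, ctc_weight=0.5)[0][0]
best_attention_only = rescore(hyps, ctc_weight=0.0)[0][0]
```

With these made-up scores, attention-only ranking and joint ranking pick different hypotheses, which is the situation where the CTC term can correct the attention decoder.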

0 Introduction

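The Lhasa dialect is not only an important branch of Tibetan but also the common language of radio and television broadcasting, and it holds a central place in the dissemination of Tibetan culture and in everyday voice interaction.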
