Optimization of genotype imputation for low-depth sequencing data and performance analysis of regression models

CLC number: S332  Document code: A  Article ID: 0439-8114(2025)07-0203-04  DOI: 10.14088/j.cnki.issn0439-8114.2025.07.035
XIANG Chong, CHEN Can
(School of Data and Information, Changjiang Polytechnic, Wuhan 430070, China)
Abstract: A new method suitable for analyzing low-depth sequencing genomic data was established by optimizing genotype imputation algorithms and screening optimal regression models. The results showed that, compared to the pre-optimization algorithm, the accuracy of the optimized genotype imputation algorithm increased from 95% to 98%. Meanwhile, parameter tuning and efficient algorithm selection reduced the single imputation time from 24 hours to 12 hours, significantly improving processing efficiency. For continuous phenotypic analysis (e.g., quantitative traits in GWAS), the ridge regression model and the linear regression model performed well. At 1.0× sequencing depth, their MSEs were 0.07 and 0.08, and their accuracies were 0.82 and 0.80, respectively. When handling classification problems (e.g., genomic selection), the logistic regression model demonstrated significant advantages due to its probabilistic modeling characteristics. This model showed good classification performance (AUC = 0.90), significantly outperforming the linear regression model (AUC = 0.85).
Keywords: low-depth sequencing data; genotype imputation; ridge regression models; performance analysis; linear regression model; logistic regression model
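
The abstract compares models by mean squared error (MSE) for continuous phenotypes and by area under the ROC curve (AUC) for classification. As a reading aid, the following is a minimal sketch of how these two metrics are conventionally defined. It is not the authors' evaluation code: the function names `mean_squared_error` and `roc_auc` are hypothetical, and the toy values are invented for illustration and are unrelated to the paper's data.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample receives the higher score; ties count as half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy continuous phenotype (illustrative values, not from the paper).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
mse = mean_squared_error(observed, predicted)

# Toy binary labels with predicted probabilities (illustrative values).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)

print(f"MSE = {mse:.3f}, AUC = {auc:.2f}")
```

A lower MSE indicates predictions closer to the observed phenotype. An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported values of 0.90 and 0.85 should be read.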
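As genomics research continues to advance, high-throughput sequencing technology has become an important means of deciphering the genetic information of organisms.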