Efficient pose-driven human motion generation based on diffusion model acceleration and perceptual optimization

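Keywords: video generation; diffusion model; acceleration technique; image super-resolution; video frame interpolation; digital human; perceptual quality
CLC number: TP391.41    Document code: A    Article ID: 1001-3695(2025)10-009-2964-08
doi: 10.19734/j.issn.1001-3695.2025.03.0057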
Efficient pose-driven human motion generation with diffusion model acceleration and perceptual optimization
Wang Jiasong, Zhou Lei†, Zhang Bo (School of Health Science & Engineering, University of Shanghai for Science & Technology, Shanghai 200093, China)
Abstract: Existing pose-conditioned digital human video generation technology focuses on improving the quality of generated results, emphasizing visual realism and motion smoothness. However, this technology often overlooks the issue of slow generation speed, limiting its deployment in real-time applications. To address this, this work proposed an acceleration framework for digital humans (DAF-DH) based on diffusion model acceleration and perceptual optimization. The method tackled the high inference latency and computational cost of diffusion-model-based digital human generation. It adopted a three-stage acceleration strategy to enhance efficiency and optimize generation quality. Firstly, the method optimized the inference efficiency of the diffusion model using TensorRT. Secondly, it utilized the TensorRT-accelerated diffusion model, combined with reduced input resolution and frame sampling, to quickly generate low-resolution, low-frame-rate initial videos. Thirdly, the method included a lightweight post-processing module. This module employed super-resolution and frame interpolation algorithms to improve the video's resolution and smoothness, enhancing the final generation quality. Additionally, the method introduced a semantic feature consistency loss function to improve subjective visual perception. This work also constructed a DH-Motion dataset containing 1705 motion sequences to provide a benchmark for research. Experiments show that this framework achieves a 5× speedup compared to MimicMotion. The generation quality is improved: the LPIPS metric decreases by 0.033, and the FVD score decreases by 82.9. These results demonstrate that DAF-DH effectively reduces inference latency, enhances generation quality, and is suitable for real-time digital human video generation applications.
Key words: video generation; diffusion model; acceleration technique; image super-resolution; video frame interpolation; digital human; perceptual quality
0 Introduction
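In recent years, with the rapid development of virtual reality, augmented reality, and artificial intelligence, diffusion-model-based digital human generation technology has been widely applied in many fields [1,2].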