Binocular vision localization and measurement method based on multi-scale attention mechanism TransUNet

Keywords: binocular vision; TransUNet; key-point detection; attention mechanism
CLC number: TP391.41 Document code: A
doi:10.37188/OPE.20253316.2502 CSTR:32169.14.OPE.20253316.2502
Binocular vision localization and measurement method based on TransUNet with multi-scale attention mechanism
YANG Yu, XU Sixiang*, ZHANG Mengquan, WU Duanzheng (College of Mechanical Engineering, Anhui University of Technology, Ma'anshan 243032, China) * Corresponding author, E-mail: xsxhust@ahut.edu.cn
Abstract: To address the low detection efficiency of traditional binocular-vision feature-detection algorithms, as well as the insufficient attention to globally important features and the excessive parameter counts of most network models, a continuous-casting-billet localization and measurement method based on binocular vision and a TransUNet with a multi-scale attention mechanism is proposed. First, left and right images of continuous-casting billets were collected with a calibrated parallel binocular camera to build a dataset. Then, with TransUNet as the backbone, an improved Transformer layer was introduced to extract global context information; a Global Spatial Group Attention (GSGA) module was appended after every decoder block to strengthen the focus on globally salient features through a grouped multi-scale attention mechanism; and a Convolutional Block Attention Module (CBAM) was inserted after each encoder-decoder skip connection and bilinear interpolation to improve key-point recognition by combining spatial and channel attention. Finally, the key-point coordinates output by the network were used for 3-D coordinate reconstruction and distance measurement based on binocular-vision principles. Experimental results show that, compared with the Transformer model, the root-mean-square error and normalized error are reduced by 33.8% and 36.83%, respectively; the number of parameters and floating-point operations are reduced by 10.58% and 8.21%, respectively; and the single-batch inference time is shortened by 32.30%. In 3-D ranging, the relative measurement error reaches 0.137%, which is significantly better than that of traditional feature-detection algorithms and meets the requirements of binocular-vision localization and measurement.
Key words: binocular vision; TransUNet; key-point detection; attention mechanism
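The final step described in the abstract, reconstructing 3-D coordinates from matched key points and measuring the distance between them, can be illustrated with the standard triangulation relations for a rectified parallel stereo rig. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the function names, the pinhole model with focal length `f` in pixels, the principal point `(cx, cy)`, the baseline in millimetres, and all numeric values are hypothetical and chosen for illustration.

```python
import math


def triangulate(u_left, v_left, u_right, f, baseline, cx, cy):
    """Recover (X, Y, Z) in the left-camera frame for one matched key point.

    Assumes an ideal rectified parallel rig (hypothetical parameters):
    f in pixels, baseline in mm, principal point (cx, cy) in pixels.
    """
    disparity = u_left - u_right  # horizontal pixel shift between the views
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the rig")
    z = f * baseline / disparity   # depth from similar triangles
    x = (u_left - cx) * z / f      # back-project the pixel column
    y = (v_left - cy) * z / f      # back-project the pixel row
    return (x, y, z)


def distance(p, q):
    """Euclidean distance between two reconstructed 3-D points."""
    return math.dist(p, q)


def relative_error(measured, reference):
    """Relative measurement error in percent."""
    return abs(measured - reference) / reference * 100.0


if __name__ == "__main__":
    # Hypothetical key-point pixel coordinates in the left and right images.
    a = triangulate(700, 360, 600, f=1000.0, baseline=100.0, cx=640.0, cy=360.0)
    b = triangulate(800, 360, 700, f=1000.0, baseline=100.0, cx=640.0, cy=360.0)
    print(a, b, distance(a, b))
```

In this sketch the measured length is the distance between two triangulated key points, and `relative_error` compares it with a reference length obtained by some other means. A real system would take `f`, `cx`, `cy`, and the baseline from the camera calibration rather than from fixed constants.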
1 Introduction
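In the modern iron and steel metallurgy industry, when continuous-casting billets are flame-cut, the molten steel on the cut surface forms irregular, very hard burrs along the lower edge of the kerf, which adversely affect both the surface quality of the steel and the transport roller table.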