Improved DeepLabv3+ semantic segmentation incorporating attention mechanisms
YAN He, LEI Qiuxia, WANG Xu
(Liangjiang College of Artificial Intelligence, Chongqing University of Technology, Chongqing 401135, China)
* Corresponding author, E-mail: yanhe@cqut.edu.cn
Abstract: To address the challenges of high computational complexity, limited detail extraction, and fuzzy boundaries in the current DeepLabv3+ semantic segmentation network, this study proposes an enhanced DeepLabv3+ model incorporating attention mechanisms. Specifically, the lightweight MobileNetV2 is employed as the backbone to retain high representational capacity while significantly reducing the number of model parameters. A parameter-free lightweight attention mechanism (SimAM) is applied to the low-level features of the backbone network to prioritize key features and enhance feature extraction. Furthermore, the global average pooling in the ASPP module is replaced with Haar Wavelet Transform Downsampling (HWD) to preserve spatial information. An external attention mechanism (EANet) is also introduced after the ASPP module to leverage contextual information and achieve multi-scale feature fusion, thereby improving semantic understanding and segmentation accuracy. Experimental results demonstrate that the proposed model achieves a 2.82% improvement in mean Intersection over Union (mIoU) on the VOC2012 dataset compared to the original DeepLabv3+ model. This research enhances the precision of semantic segmentation and offers novel insights for advancing applications in computer vision.
Keywords: semantic segmentation; DeepLabv3+; Haar wavelet transform downsampling; External Attention (EANet); multi-scale integration
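To illustrate why a Haar wavelet step can downsample without discarding spatial information, the following is a minimal NumPy sketch of a one-level 2D Haar transform and its inverse. It is not the authors' implementation: the abstract does not describe how HWD is wired into the ASPP module, and HWD as usually described follows the wavelet step with a learned convolution, which is omitted here. The function names `haar_downsample` and `haar_upsample` and the (C, H, W) layout are assumptions made for illustration.

```python
import numpy as np


def haar_downsample(x):
    """One-level 2D Haar transform of a (C, H, W) array with even H and W.

    Returns a (4*C, H/2, W/2) array: the low-pass band followed by the
    horizontal, vertical, and diagonal detail bands, stacked along channels.
    """
    x = np.asarray(x, dtype=float)
    if x.ndim != 3 or x.shape[1] % 2 or x.shape[2] % 2:
        raise ValueError("expected a (C, H, W) array with even H and W")
    a = x[:, 0::2, 0::2]  # top-left pixel of every 2x2 block
    b = x[:, 0::2, 1::2]  # top-right
    c = x[:, 1::2, 0::2]  # bottom-left
    d = x[:, 1::2, 1::2]  # bottom-right
    low = (a + b + c + d) / 2.0         # scaled local average
    horizontal = (a - b + c - d) / 2.0  # left-minus-right differences
    vertical = (a + b - c - d) / 2.0    # top-minus-bottom differences
    diagonal = (a - b - c + d) / 2.0    # diagonal differences
    return np.concatenate([low, horizontal, vertical, diagonal], axis=0)


def haar_upsample(y):
    """Inverse of haar_downsample: rebuild the (C, H, W) array exactly."""
    y = np.asarray(y, dtype=float)
    if y.ndim != 3 or y.shape[0] % 4:
        raise ValueError("expected a (4*C, H, W) array")
    low, horizontal, vertical, diagonal = np.split(y, 4, axis=0)
    channels, half_h, half_w = low.shape
    x = np.empty((channels, 2 * half_h, 2 * half_w))
    x[:, 0::2, 0::2] = (low + horizontal + vertical + diagonal) / 2.0
    x[:, 0::2, 1::2] = (low - horizontal + vertical - diagonal) / 2.0
    x[:, 1::2, 0::2] = (low + horizontal - vertical - diagonal) / 2.0
    x[:, 1::2, 1::2] = (low - horizontal - vertical + diagonal) / 2.0
    return x
```

Each 2x2 block is mapped to four coefficients by an orthonormal transform, so the spatial resolution halves while the channel count quadruples and the input can be reconstructed exactly. Global average pooling, by contrast, collapses each channel to a single value and cannot be inverted.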
1 Introduction
Semantic segmentation is an important task in computer vision that aims to assign every pixel in an image to a semantic category [1].
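Because every pixel receives a class label, segmentation quality is commonly summarized by the mean Intersection over Union (mIoU) reported in the abstract. The sketch below shows one common way to compute it from a confusion matrix. It is an illustration rather than the evaluation code used in this work: the function name `mean_iou`, the optional `ignore_index` argument, and the choice to skip classes that appear in neither mask are all assumptions.

```python
import numpy as np


def mean_iou(pred, target, num_classes, ignore_index=None):
    """Mean IoU between two integer label masks of identical shape.

    Classes absent from both masks are skipped (an assumption of this
    sketch). Pixels whose target equals ignore_index are excluded.
    """
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")
    if ignore_index is not None:
        keep = target != ignore_index
        pred, target = pred[keep], target[keep]
    # confusion[i, j] counts pixels of true class i predicted as class j.
    flat = target * num_classes + pred
    confusion = np.bincount(flat, minlength=num_classes ** 2)
    confusion = confusion.reshape(num_classes, num_classes)
    intersection = np.diag(confusion)
    union = confusion.sum(axis=0) + confusion.sum(axis=1) - intersection
    present = union > 0
    if not present.any():
        raise ValueError("no labelled pixels to evaluate")
    return float((intersection[present] / union[present]).mean())


# Toy 2x2 example: one background pixel is mislabelled as class 1.
target = [[0, 0], [1, 1]]
pred = [[0, 1], [1, 1]]
print(mean_iou(pred, target, num_classes=2))
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12. Setting `ignore_index=255` would exclude pixels carrying the void label that PASCAL VOC annotations use for object boundaries.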