Improved DeepLabv3+ Semantic Segmentation with Fused Attention Mechanisms


Improved DeepLabv3+ semantic segmentation incorporating attention mechanisms

YAN He, LEI Qiuxia, WANG Xu

(Liangjiang College of Artificial Intelligence, Chongqing University of Technology, Chongqing 401135, China) * Corresponding author, E-mail: yanhe@cqut.edu.cn

Abstract: To address the challenges of high computational complexity, limited detail extraction, and fuzzy boundaries in the current DeepLabv3+ semantic segmentation network, this study proposes an enhanced DeepLabv3+ model incorporating attention mechanisms. Specifically, the lightweight MobileNetV2 is employed as the backbone to retain high representational capacity while significantly reducing the number of model parameters. A parameter-free lightweight attention mechanism (SimAM) is integrated into the low-level features of the backbone network to prioritize key features and enhance feature extraction capabilities. Furthermore, the global average pooling in the ASPP module is replaced with Haar Wavelet Transform Downsampling (HWD) to preserve spatial information. An External Attention Mechanism (EANet) is also introduced after the ASPP module to leverage contextual information and achieve multi-scale feature fusion, thereby improving semantic understanding and segmentation accuracy. Experimental results demonstrate that the proposed model achieves a 2.82% improvement in mean Intersection over Union (mIoU) on the VOC2012 dataset compared to the original DeepLabv3+ model. This research enhances the precision of semantic segmentation and offers novel insights for advancing applications in computer vision.
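Two of the components named in the abstract, SimAM and Haar wavelet downsampling, can be illustrated on a single feature channel. The following is a minimal sketch in plain Python, not the authors' implementation: the function names `simam_channel` and `haar_downsample` are hypothetical, the regularizer `lam=1e-4` is an assumed default, and the learned 1x1 convolution that usually follows the Haar step in HWD-style modules is omitted.

```python
import math


def simam_channel(channel, lam=1e-4):
    """SimAM-style parameter-free attention on one channel (a list of rows).

    Each activation is reweighted by a sigmoid of its inverse energy, which
    grows with the squared distance from the channel mean. `lam` is an
    assumed regularization constant. Requires at least two activations.
    """
    values = [v for row in channel for v in row]
    n = len(values) - 1                      # number of "other" neurons
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / n
    out = []
    for row in channel:
        new_row = []
        for v in row:
            inv_energy = (v - mean) ** 2 / (4 * (var + lam)) + 0.5
            weight = 1.0 / (1.0 + math.exp(-inv_energy))  # sigmoid
            new_row.append(v * weight)
        out.append(new_row)
    return out


def haar_downsample(channel):
    """One-level 2D Haar transform of an H x W channel (H and W even).

    Returns four (H/2) x (W/2) subbands: average, horizontal difference,
    vertical difference, and diagonal difference. The orthonormal scaling
    (division by 2) preserves the total energy of every 2x2 block, so the
    resolution is halved without discarding information.
    """
    h, w = len(channel), len(channel[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    avg, horiz, vert, diag = [], [], [], []
    for i in range(0, h, 2):
        r_avg, r_h, r_v, r_d = [], [], [], []
        for j in range(0, w, 2):
            a, b = channel[i][j], channel[i][j + 1]
            c, d = channel[i + 1][j], channel[i + 1][j + 1]
            r_avg.append((a + b + c + d) / 2)
            r_h.append((a - b + c - d) / 2)
            r_v.append((a + b - c - d) / 2)
            r_d.append((a - b - c + d) / 2)
        avg.append(r_avg)
        horiz.append(r_h)
        vert.append(r_v)
        diag.append(r_d)
    return avg, horiz, vert, diag
```

In a network, the four subbands would typically be stacked along the channel axis, which is how a Haar-based downsampling step can keep the spatial detail that global average pooling collapses into a single value.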

Keywords: semantic segmentation; DeepLabv3+; Haar wavelet transform downsampling; External Attention (EANet); multi-scale integration

1 Introduction

Semantic segmentation is an important task in computer vision that aims to assign each pixel in an image to a semantic category [1].
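Because every pixel receives a class label, segmentation quality is commonly summarized by the mean Intersection over Union (mIoU), the metric reported in the abstract. The following is a minimal sketch of that metric on small label maps; the function name `mean_iou` is hypothetical, and the handling of ignored "void" labels used in standard VOC evaluation is left out for brevity.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU between two label maps given as lists of rows of class ids.

    For each class, IoU = |pred AND target| / |pred OR target|. Classes that
    appear in neither map are skipped rather than counted as zero. At least
    one class must appear in the maps.
    """
    inter = [0] * num_classes
    union = [0] * num_classes
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            if p == t:
                inter[p] += 1
                union[p] += 1
            else:
                # A mismatched pixel enlarges the union of both classes.
                union[p] += 1
                union[t] += 1
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious)


# Toy 2x2 example: one pixel of class 1 is mislabeled as class 0.
pred = [[0, 0], [1, 1]]
target = [[0, 1], [1, 1]]
score = mean_iou(pred, target, num_classes=2)  # (1/2 + 2/3) / 2
```

Here class 0 has IoU 1/2 and class 1 has IoU 2/3, so a single wrong pixel in a four-pixel image already lowers the mean to 7/12.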


Optics and Precision Engineering

2025, Issue 01
