Study on Remote Sensing Image Classification Based on Improved Vision Transformer

CLC number: S771.8  Document code: A  Article ID: 2095-2953(2025)06-0031-05

Study on Remote Sensing Image Classification Based on Improved Vision Transformer

LI Zong-xuan, LENG Xin*, ZHANG Lei, CHEN Jia-kai (School of Computer and Control Engineering, Northeast Forestry University, Harbin, Heilongjiang 150040, China)

Abstract: Remote sensing image classification can quickly and efficiently obtain the distribution of forest areas, which supports forestry resource monitoring and management. Vision Transformer (ViT) is widely used in remote sensing image classification tasks because of its excellent ability to capture global information. However, ViT captures redundant local features during shallow feature extraction and fails to effectively capture key features. Additionally, the model may lose edge and other detail information when segmenting the image into patches, which affects classification accuracy. To address these issues, we propose an improved Vision Transformer that introduces the Super Token Attention (STA) mechanism to enhance key feature extraction and reduce computational redundancy. Haar Wavelet Downsampling (HWDS) is also incorporated to reduce detail loss while enhancing the capture of local and global information at different scales. Experimental results demonstrate that the proposed method achieves an overall accuracy of 92.98% on the AID dataset, verifying its effectiveness.

Keywords: remote sensing image classification; Vision Transformer; Haar Wavelet Downsampling; Super Token Attention
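
The abstract credits Haar Wavelet Downsampling with reducing detail loss. As a rough illustration of why a Haar transform can halve spatial resolution without discarding information, the following is a minimal NumPy sketch of a single-level 2D Haar decomposition and its inverse. It is not the authors' HWDS module, whose layers are not described in the text available here; the function names `haar_downsample` and `haar_upsample`, the subband ordering, and the even-size input requirement are all assumptions made for this example.

```python
import numpy as np


def haar_downsample(x):
    """Single-level orthonormal 2D Haar transform (illustrative sketch).

    x: array of shape (H, W) with even H and W (assumption of this example).
    Returns an array of shape (4, H/2, W/2) holding, in this order:
    the low-pass average, the detail along the width axis, the detail
    along the height axis, and the diagonal detail.
    """
    # The four pixels of every non-overlapping 2x2 block.
    a = x[0::2, 0::2]  # top-left
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right

    low = (a + b + c + d) / 2          # smooth, half-resolution image
    detail_w = (a - b + c - d) / 2     # change across columns
    detail_h = (a + b - c - d) / 2     # change across rows
    detail_diag = (a - b - c + d) / 2  # diagonal change
    return np.stack([low, detail_w, detail_h, detail_diag])


def haar_upsample(bands):
    """Inverse of haar_downsample: rebuild the (H, W) array exactly."""
    low, detail_w, detail_h, detail_diag = bands
    h, w = low.shape
    x = np.empty((2 * h, 2 * w), dtype=low.dtype)
    x[0::2, 0::2] = (low + detail_w + detail_h + detail_diag) / 2
    x[0::2, 1::2] = (low - detail_w + detail_h - detail_diag) / 2
    x[1::2, 0::2] = (low + detail_w - detail_h - detail_diag) / 2
    x[1::2, 1::2] = (low - detail_w - detail_h + detail_diag) / 2
    return x


if __name__ == "__main__":
    image = np.random.default_rng(0).random((8, 8))
    bands = haar_downsample(image)
    print(bands.shape)                               # spatial size is halved
    print(np.allclose(haar_upsample(bands), image))  # round trip is lossless
```

Unlike strided convolution or pooling, which keep only one value per block, this transform moves the edge and texture content of each 2x2 block into three extra channels, so a later layer can still use it. How the paper combines or compresses those channels is not stated in this excerpt.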

Remote sensing images contain a large amount of ground-object information, including structural, textural, and spectral information.