Abstract: Remotely captured images vary widely in scale and object appearance because the scenes they depict are complex. This variability makes it challenging to capture the global and local context needed to segment them, and existing networks struggle to extract the inherent features from cluttered backgrounds. To address these issues, we propose RSSGLT, a network for semantic segmentation of remote sensing images. RSSGLT captures global and local features by combining transformer and convolution mechanisms. It is an encoder–decoder design that uses multiscale features. We construct an attention map module (AMM) that generates channelwise attention scores for fusing these features, and a global–local transformer block (GLTB) in the decoder network to support learning robust representations during the decoding phase. Furthermore, we design a feature refinement module (FRM) to refine the fused output of the shallow-stage encoder feature and the deepest GLTB feature of the decoder. Experimental results on two public datasets show the effectiveness of the proposed RSSGLT.

Keywords: context details; multiscale features; remote sensing images; semantic segmentation; transformer
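
The abstract does not describe the internals of the AMM, so the following is only a generic sketch of what fusing features with channelwise attention scores can look like, not the authors' design. It assumes two feature maps of identical shape, stored as nested lists indexed [channel][row][column], and it assumes the score of each channel is a sigmoid of its global average. The function names `channel_scores` and `fuse` are hypothetical.

```python
import math


def channel_scores(feature):
    """Return one score in (0, 1) per channel.

    Assumption: the score is a sigmoid of the channel's global average;
    the actual AMM may compute its scores differently.
    """
    scores = []
    for channel in feature:  # feature is indexed [channel][row][column]
        values = [v for row in channel for v in row]
        mean = sum(values) / len(values)
        scores.append(1.0 / (1.0 + math.exp(-mean)))
    return scores


def fuse(feat_a, feat_b):
    """Fuse two same-shape feature maps as a per-channel weighted sum."""
    scores_a = channel_scores(feat_a)
    scores_b = channel_scores(feat_b)
    fused = []
    for ch_a, ch_b, s_a, s_b in zip(feat_a, feat_b, scores_a, scores_b):
        fused.append([
            [s_a * a + s_b * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(ch_a, ch_b)
        ])
    return fused


# Illustrative input: one channel, 2x2 spatial grid (made-up values).
shallow = [[[0.0, 0.0], [0.0, 0.0]]]
deep = [[[2.0, 2.0], [2.0, 2.0]]]
fused = fuse(shallow, deep)
```

In this sketch every spatial position in a channel shares the same weight, which is what "channelwise" means here; a real network would learn the scoring function rather than fix it to a sigmoid of the mean.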