Although deep learning techniques have achieved noticeable success in aircraft detection, scale heterogeneity, position difference, complex background interference, and speckle noise keep aircraft detection in large-scale SAR images challenging. To solve these problems, we propose the geospatial transformer framework and implement it as a three-step target detection neural network, namely image decomposition, the Multi-scale Geospatial Contextual Attention Network (MGCAN), and result recomposition. First, the given large-scale SAR image is decomposed into slices via sliding windows according to the image characteristics of aircraft. Second, the slices are input into the MGCAN network for feature extraction, and Cluster Distance Non-Maximum Suppression (CD-NMS) is utilized to determine the bounding boxes of aircraft. Finally, the detection results are produced via recomposition. Two innovative geospatial attention modules are proposed within MGCAN, namely the Efficient Pyramid Convolution Attention Fusion (EPCAF) module and the Parallel Residual Spatial Attention (PRSA) module, to extract multi-scale features of the aircraft and suppress background noise. In the experiment, four large-scale SAR images with 1 m resolution from the Gaofen-3 system, which are not included in the dataset, are tested. The results indicate that the detection performance of our geospatial transformer is better than that of Faster R-CNN, SSD, EfficientDet-D0, and YOLOv5s. The geospatial transformer integrates deep learning with SAR target characteristics to fully capture the multi-scale contextual information and geospatial information of aircraft, effectively reduces complex background interference, and tackles the position difference of targets. It greatly improves the detection performance of aircraft and offers an effective approach to merging SAR domain knowledge with deep learning techniques.

Recently, supervised deep learning has achieved great success in remote sensing image (RSI) semantic segmentation. However, supervised learning for semantic segmentation requires a large number of labeled samples, which are difficult to obtain in the field of remote sensing. A new learning paradigm, self-supervised learning (SSL), can be used to solve such problems by pretraining a general model with a large number of unlabeled images and then fine-tuning it on a downstream task with very few labeled samples. Contrastive learning is a typical method of SSL that can learn general invariant features. However, most existing contrastive learning methods are designed for classification tasks to obtain an image-level representation, which may be suboptimal for semantic segmentation tasks requiring pixel-level discrimination. Therefore, we propose a global style and local matching contrastive learning network (GLCNet) for RSI semantic segmentation.
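As a rough illustration of the decomposition and recomposition steps of the geospatial transformer pipeline (slice the large SAR image with sliding windows, detect per slice, then merge the results), the sketch below may help. It rests on several assumptions: the MGCAN detector is represented by a hypothetical `detect_slice` callback, plain greedy IoU-based NMS stands in for CD-NMS (whose details are not given here), and the window size and stride are arbitrary example values.

```python
from typing import Callable, List, Tuple

# (x1, y1, x2, y2, score); coordinates in pixels
Box = Tuple[float, float, float, float, float]


def window_starts(length: int, window: int, stride: int) -> List[int]:
    """Start offsets along one axis so that the windows cover [0, length)."""
    if length <= window:
        return [0]
    starts = list(range(0, length - window + 1, stride))
    if starts[-1] != length - window:
        starts.append(length - window)  # last window sits flush with the edge
    return starts


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], iou_threshold: float = 0.5) -> List[Box]:
    """Greedy IoU-based NMS (a stand-in here, NOT the CD-NMS of the abstract)."""
    kept: List[Box] = []
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        if all(iou(box, k) <= iou_threshold for k in kept):
            kept.append(box)
    return kept


def detect_large_image(
    height: int,
    width: int,
    detect_slice: Callable[[int, int, int], List[Box]],  # hypothetical detector
    window: int = 512,   # assumed slice size
    stride: int = 384,   # assumed stride (overlapping slices)
    iou_threshold: float = 0.5,
) -> List[Box]:
    """Decompose into slices, detect per slice, recompose in image coordinates."""
    merged: List[Box] = []
    for y0 in window_starts(height, window, stride):
        for x0 in window_starts(width, window, stride):
            # detect_slice returns boxes in slice-local coordinates
            for x1, y1, x2, y2, score in detect_slice(x0, y0, window):
                merged.append((x1 + x0, y1 + y0, x2 + x0, y2 + y0, score))
    # Overlapping slices can report the same target more than once
    return nms(merged, iou_threshold)
```

Overlapping windows are used so that a target cut by one slice boundary is likely to appear whole in a neighbouring slice; the final suppression step then removes the duplicates this overlap produces.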